Hardware development for CRYSTALS-Kyber is essential to prevent future quantum computer attacks. However, existing CRYSTALS-Kyber hardware often has low performance and lacks the flexibility to support the three opera...
ISBN (print): 9798350344868; 9798350344851
Virtual try-on can significantly improve the garment shopping experience in both online and in-store scenarios, attracting broad interest in computer vision. However, to achieve high-fidelity try-on performance, most state-of-the-art methods still rely on accurate segmentation masks, which are often produced by near-perfect parsers or manual labeling. To overcome this bottleneck, we propose a parser-free virtual try-on method based on the diffusion model (PFDM). Given two images, PFDM can "wear" garments on the target person seamlessly by implicitly warping them, without any other information. To learn the model effectively, we synthesize many pseudo-images and construct sample pairs by dressing persons in various garments. Supervised by this large-scale expanded dataset, we fuse the person and garment features using a proposed Garment Fusion Attention (GFA) mechanism. Experiments demonstrate that our proposed PFDM can successfully handle complex cases, synthesize high-fidelity images, and outperform both state-of-the-art parser-free and parser-based models.
We present ChatBLAS, the first AI-generated and portable Basic Linear Algebra Subprograms (BLAS) library on different CPU/GPU configurations. The purpose of this study is (i) to evaluate the capabilities of current la...
This study conducts an in-depth examination of the effectiveness of robotic automation systems in the testing processes of POS devices used in payment systems. It compares the shortcomings of traditional manual testin...
ISBN (print): 9781510673977; 9781510673960
Reliable computer vision object classification is important for security applications that make high-stakes decisions based on automated algorithms. In real-world scenarios, it is often impractical to meet the implicit assumption that all relevant, labelled data can be obtained prior to training. To avoid performance degradation, a recently developed open-set detection framework is applied to the classification of ships from clutter in satellite Electro-Optical (EO) imagery and is shown to reliably identify data that is out of distribution with respect to the training data. A Binary Classifier (BC) model and a Category-aware Binary Classifier (CBC) model were compared to OpenMax and found to provide improvements in identifying unknown imagery. This enables an operator to know whether to believe classification results from a deep learning-based algorithm.
ISBN (print): 9798350301298
High-resolution images enable neural networks to learn richer visual representations. However, this improved performance comes at the cost of growing computational complexity, hindering their usage in latency-sensitive applications. As not all pixels are equal, skipping computations for less-important regions offers a simple and effective measure to reduce the computation. This, however, is hard to translate into actual speedup for CNNs, since it breaks the regularity of the dense convolution workload. In this paper, we introduce SparseViT, which revisits activation sparsity for recent window-based vision transformers (ViTs). As window attentions are naturally batched over blocks, actual speedup with window activation pruning becomes possible: i.e., approximately 50% latency reduction with 60% sparsity. Different layers should be assigned different pruning ratios due to their diverse sensitivities and computational costs. We introduce sparsity-aware adaptation and apply evolutionary search to efficiently find the optimal layerwise sparsity configuration within the vast search space. SparseViT achieves speedups of 1.5x, 1.4x, and 1.3x compared to its dense counterpart in monocular 3D object detection, 2D instance segmentation, and 2D semantic segmentation, respectively, with negligible to no loss of accuracy.
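The idea of window activation pruning can be illustrated with a minimal sketch. It is not the SparseViT implementation: the scoring rule (L2 norm of each window's activations), the single-channel feature map, and the function names `window_scores` and `select_windows` are all assumptions made for illustration. The sketch only shows how a sparsity ratio turns into a set of windows that would still be passed to attention.

```python
import math

def window_scores(feature_map, window):
    """Return {(row, col): L2 norm} for each non-overlapping window.

    Assumption: window importance is the L2 magnitude of its activations.
    """
    height, width = len(feature_map), len(feature_map[0])
    scores = {}
    for top in range(0, height, window):
        for left in range(0, width, window):
            total = 0.0
            for y in range(top, min(top + window, height)):
                for x in range(left, min(left + window, width)):
                    total += feature_map[y][x] ** 2
            scores[(top // window, left // window)] = math.sqrt(total)
    return scores

def select_windows(feature_map, window, sparsity):
    """Keep the highest-scoring windows, dropping a `sparsity` fraction."""
    scores = window_scores(feature_map, window)
    keep = max(1, round(len(scores) * (1.0 - sparsity)))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:keep])

# Hypothetical 4x4 single-channel activation map, split into 2x2 windows.
feature_map = [
    [0.0, 0.1, 2.0, 1.5],
    [0.1, 0.0, 1.0, 2.5],
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.6, 0.1, 0.0],
]
kept = select_windows(feature_map, window=2, sparsity=0.5)
print(kept)  # [(0, 1), (1, 0)]
```

Because whole windows are kept or dropped, the surviving windows can still be stacked into one dense batch, which is why this kind of sparsity can translate into real latency savings.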
This preliminary study aims to find out the performance of the attitudes and behaviors of Binus University students towards sustainable development to gain competitive advantages in ASEAN business competi...
ISBN (print): 9798400711312
Prior approaches to the neural rendering of global illumination typically rely on complex network architectures and training strategies to model the global effects. This often leads to impractically high overheads for both training and inference. The neural radiosity technique marks a significant advancement by injecting the radiometric prior into the training process, allowing for efficient modeling of the global radiance fields using a lightweight network and grid-based representations. However, this method encounters difficulties in modeling dynamic scenes, as the high-dimensional feature space quickly becomes unmanageable as the number of varying scene parameters grows. In this work, we extend neural radiosity for variable scenes through a novel neural decomposition method. To achieve this, we first parameterize the animated scene with an explicit vector v, which conditions a high-dimensional radiance field L_θ. We then develop a practical representation for L_θ by decomposing the high-dimensional feature grid into 3D grids, 2D feature planes, and lightweight MLPs. This strategy effectively models the correlation between 3D spatial features and dynamic scene variables, while maintaining a practical memory and computational cost. Experimental results show that our method facilitates efficient dynamic global illumination rendering with practical runtime performance, outperforming previous state-of-the-art techniques with reduced training and inference costs.
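A rough count of stored entries shows why a dense high-dimensional feature grid becomes unmanageable and why a decomposition helps. The sketch below rests on stated assumptions rather than the paper's actual layout: every axis uses the same resolution, each scene variable is paired with each of the three spatial axes in its own 2D plane, and the MLP weights and feature channels are ignored. The function names are hypothetical.

```python
def dense_grid_entries(resolution, num_scene_vars):
    """Entries in one dense grid over 3 spatial axes plus all scene variables."""
    return resolution ** (3 + num_scene_vars)

def decomposed_entries(resolution, num_scene_vars):
    """Entries in one 3D spatial grid plus 2D planes.

    Assumption: one plane per (spatial axis, scene variable) pair.
    """
    spatial_grid = resolution ** 3
    planes = 3 * num_scene_vars * resolution ** 2
    return spatial_grid + planes

# Hypothetical setting: resolution 64 per axis, 2 varying scene parameters.
print(dense_grid_entries(64, 2))   # 1073741824
print(decomposed_entries(64, 2))   # 286720
```

Under these assumptions the dense grid grows exponentially with the number of scene variables, while the decomposed form grows only linearly.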
In recent years, quantization technology has proven to be very effective in the field of supervised image retrieval, owing to its capacity to provide both high accuracy and swift retrieval speeds. However, the challen...