ISBN (digital): 9798350365474
ISBN (print): 9798350365481
This study presents a re-balanced contrastive learning strategy for strengthening face anti-spoofing in facial recognition systems, focusing on attacks that use printed photos and highly realistic silicone or latex masks. Leveraging the HySpeFAS dataset, which uses Snapshot Spectral Imaging technology to provide hyperspectral images, our approach combines class-level contrastive learning with data resampling and a real-face oriented reweighting technique. This method mitigates dataset imbalance and reduces identity-related bias. Notably, our strategy achieved a 0.0000% Average Classification Error Rate (ACER) on the HySpeFAS dataset, ranking first in the Chalearn Snapshot Spectral Imaging Face Anti-spoofing Challenge at CVPR 2024.
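As a rough illustration of the reported metric, ACER is commonly computed as the mean of the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER). The sketch below assumes binary labels (1 = real face, 0 = attack) and pools all attack types into one class, which is a simplification; the function name `acer` is hypothetical and this is not the challenge's official evaluation code.

```python
def acer(labels, predictions):
    """Average Classification Error Rate for binary anti-spoofing output.

    Assumed convention: 1 = bona fide (real face), 0 = attack.
    """
    attacks = [p for y, p in zip(labels, predictions) if y == 0]
    bona_fide = [p for y, p in zip(labels, predictions) if y == 1]
    if not attacks or not bona_fide:
        raise ValueError("need at least one attack and one bona fide sample")
    # APCER: fraction of attack presentations accepted as real.
    apcer = sum(1 for p in attacks if p == 1) / len(attacks)
    # BPCER: fraction of bona fide presentations rejected as attacks.
    bpcer = sum(1 for p in bona_fide if p == 0) / len(bona_fide)
    return (apcer + bpcer) / 2.0


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 0]
    predictions = [1, 0, 0, 0, 0, 1]
    print(acer(labels, predictions))
```

Under this definition, a 0.0000% ACER means that no attack was accepted and no real face was rejected on the evaluated split.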
Automated test generation tools often produce assertions that reflect implemented behavior, limiting their usage to regression testing. In this paper, we propose LLMProphet, a black-box approach that applies Few-Shot ...
ISBN (print): 9781665448994
The Geosynchronous Equatorial Orbit (GEO) is home to many important space assets such as telecommunication and navigational satellites. Monitoring Resident Space Objects (RSOs) in GEO is a crucial aspect of achieving Space Situational Awareness (SSA) and protecting critical space assets. However, ground-based GEO object detection is challenging due to the extreme distance of the targets, as well as nuisance factors including cloud coverage, atmospheric/weather effects, light pollution, sensor noise/defects, and star occlusions. The Kelvins SpotGEO Challenge is designed to establish to what extent images from a low-cost ground-based telescope can be used to detect GEO and near-GEO RSOs solely from photometric signals, without any additional metadata. At the same time, the SpotGEO dataset addresses the lack of publicly available datasets that approach the satellite detection problem from a computer vision perspective; by assembling and releasing such a dataset, we hope to spur more efforts on the optical detection of RSOs and enable objective benchmarking of existing and future methods. In this work, we present details of the SpotGEO dataset development, challenge design, evaluation metric, and result analysis.
The rapid advancement of machine learning (ML) technologies has driven the development of specialized hardware accelerators designed to facilitate more efficient model training. This paper introduces the CARAML benchm...
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
In recent years, deep learning has driven significant advances in various fields, including the analysis of human emotions and behaviors. Initiatives such as the Affective Behavior Analysis in-the-wild (ABAW) competition have been particularly instrumental in driving research in this area by providing diverse and challenging datasets that enable precise evaluation of complex emotional states. This study leverages the Vision Transformer (ViT) and Transformer models for three tasks: estimation of Valence-Arousal (VA), which signifies the positivity and intensity of emotions; recognition of various facial expressions; and detection of Action Units (AU), which represent fundamental muscle movements. This approach goes beyond traditional methods based on Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), proposing a new Transformer-based framework that maximizes the understanding of temporal and spatial features. The core contributions of this research are a learning technique based on random frame masking and the application of Focal loss adapted for imbalanced data, which together enhance the accuracy and applicability of emotion and behavior analysis in real-world settings. This approach is expected to contribute to the advancement of affective computing and deep learning methodologies.
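The two training ingredients named above can be sketched in a few lines. The code below shows the standard focal loss for a single sample and a simple random frame masking routine. The function names, the `gamma` default, the `mask_ratio` default, and the choice of `None` as the mask value are all assumptions made for illustration; the abstract does not specify how the authors configure either component.

```python
import math
import random


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    probs: predicted class probabilities; target: index of the true class.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def mask_random_frames(frames, mask_ratio=0.25, mask_value=None, rng=None):
    """Replace a random subset of frames in a sequence with mask_value."""
    rng = rng or random.Random()
    n_mask = int(len(frames) * mask_ratio)
    masked_idx = set(rng.sample(range(len(frames)), n_mask))
    return [mask_value if i in masked_idx else f for i, f in enumerate(frames)]


if __name__ == "__main__":
    # A confident correct prediction is down-weighted much more strongly
    # than an uncertain one, which shifts the training signal to hard samples.
    print(focal_loss([0.9, 0.1], 0), focal_loss([0.6, 0.4], 0))
    print(mask_random_frames(list(range(8)), rng=random.Random(0)))
```

The down-weighting of easy samples is what makes focal loss a common choice for imbalanced label distributions such as rare expressions or Action Units.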
Experience precedes understanding. Humans constantly explore and learn about their environment out of curiosity, gather information, and update their models of the world. On the other hand, machines are either trained...
ISBN (print): 9781665448994
One-shot action recognition aims to recognize new action categories from a single reference example, typically referred to as the anchor example. This work presents a novel approach for one-shot action recognition in the wild that computes motion representations robust to variable kinematic conditions. One-shot action recognition is then performed by comparing anchor and target motion representations. We also develop a set of complementary steps that boost action recognition performance in the most challenging scenarios. Our approach is evaluated on the public NTU-120 one-shot action recognition benchmark, where it outperforms previous action recognition models. In addition, we evaluate our framework on a real use case of therapy with autistic people. These recordings are particularly challenging due to high-level artifacts from the patient motion. Our results provide not only quantitative but also online qualitative measures, which are essential for patient evaluation and monitoring during the actual therapy.
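The anchor-versus-target evaluation described above can be pictured as a nearest-anchor lookup in embedding space. The sketch below assumes that each motion representation is already a fixed-length vector and uses cosine similarity as the comparison; both are assumptions for illustration, since the abstract does not state the representation or the distance the authors use. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def classify_one_shot(target, anchors):
    """Return the label of the anchor most similar to the target.

    anchors: dict mapping each action label to its single anchor embedding.
    """
    return max(anchors, key=lambda label: cosine_similarity(target, anchors[label]))


if __name__ == "__main__":
    anchors = {"wave": [1.0, 0.0], "clap": [0.0, 1.0]}
    print(classify_one_shot([0.9, 0.2], anchors))
```

Because each class contributes exactly one anchor, the quality of the result depends almost entirely on how robust the embedding is, which is the part the paper focuses on.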
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Image datasets in specialized fields of science, such as biomedicine, are typically smaller than traditional machine learning datasets. As such, they present a problem for training many models. To address this challenge, researchers often attempt to incorporate priors, i.e., external knowledge, to help the learning procedure. Geometric priors, for example, restrict the learning process to the manifold to which the data belong. However, learning on manifolds is sometimes computationally intensive to the point of being prohibitive. Here, we ask a provocative question: is machine learning on manifolds really more accurate than its linear counterpart, to the extent that it is worth sacrificing a significant speedup in computation? We answer this question through an extensive theoretical and experimental study of one of the most common learning methods for manifold-valued data: geodesic regression.
ISBN (print): 9781665448994
In this paper, we propose a novel layer based on the fast Walsh-Hadamard transform (WHT) and smooth-thresholding to replace 1 x 1 convolution layers in deep neural networks. In the WHT domain, we denoise the transform domain coefficients using the new smooth-thresholding non-linearity, a smoothed version of the well-known soft-thresholding operator. We also introduce a family of multiplication-free operators derived from the basic 2 x 2 Hadamard transform to implement 3 x 3 depthwise separable convolution layers. Using these two types of layers, we replace the bottleneck layers in MobileNet-V2 to reduce the network's number of parameters with a slight loss in accuracy. For example, by replacing the final third bottleneck layers, we reduce the number of parameters from 2.270M to 947K. This reduces the accuracy from 95.21% to 92.88% on the CIFAR-10 dataset. Our approach significantly improves the speed of data processing. The fast Walsh-Hadamard transform has a computational complexity of O(m log_2 m). As a result, it is computationally more efficient than the 1 x 1 convolution layer. The fast Walsh-Hadamard layer processes a tensor in R^{10 x 32 x 32 x 1024} about 2 times faster than a 1 x 1 convolution layer on an NVIDIA Jetson Nano board.
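To make the transform-domain idea concrete, the sketch below implements an unnormalized fast Walsh-Hadamard transform with the usual butterfly structure, which needs only additions and subtractions and runs in O(m log_2 m), followed by thresholding and the inverse transform. It uses the classical soft-thresholding operator as a stand-in, because the abstract does not give the formula of the smooth-thresholding variant. The function names and the fixed threshold `t` are assumptions for illustration; in the proposed layer such parameters would presumably be learned.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (length must be a power of 2)."""
    x = list(values)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine pairs that are h apart, additions only.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
        h *= 2
    return x


def soft_threshold(x, t):
    """Classical soft-thresholding: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def wht_denoise(values, t):
    """Transform, threshold the coefficients, and transform back."""
    n = len(values)
    coeffs = [soft_threshold(c, t) for c in fwht(values)]
    # Applying the unnormalized transform twice scales the input by n.
    return [c / n for c in fwht(coeffs)]


if __name__ == "__main__":
    print(fwht([1, 0, 1, 0]))
    print(wht_denoise([1.0, 0.1, 1.0, -0.1], t=0.5))
```

Small transform coefficients are zeroed by the threshold, so the reconstruction keeps only the dominant Walsh-Hadamard components of the input.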
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
In this study, we demonstrate the possibility of finding interpretable, domain-appropriate models of biological images, and propose that such a strategy can be used to derive scientific insight in domains involving raw data. This is achieved by the novel, concerted application of existing methods, namely disentangled representation learning, sparse deep neural network training, and symbolic regression. We demonstrate their relevance to the field of bioimaging using a well-studied test problem of classifying cell states in microscopy data. We find that such methods can produce highly parsimonious models that achieve ~98% of the accuracy of black-box benchmark models with a tiny fraction of the complexity and with greater domain-appropriateness, as tested by adversarial attacks. As such, we provide proof of concept that interpretable, high-performing models can be used to produce scientific explanations of an underlying biological phenomenon.