ISBN: 9798350302493 (print)
The proceedings contain 698 papers. The topics discussed include: learning unbiased classifiers from biased data with meta-learning; robustness against gradient based attacks through cost effective network fine-tuning; gradient attention balance network: mitigating face recognition racial bias via gradient attention; estimating and maximizing mutual information for knowledge distillation; synthetic sample selection for generalized zero-shot learning; training strategies for vision transformers for object detection; does image anonymization impact computer vision training?; ultra-sonic sensor based object detection for autonomous vehicles; improvements to image reconstruction-based performance prediction for semantic segmentation in highly automated driving; zero-shot classification at different levels of granularity; difficulty estimation with action scores for computer vision tasks; detail-preserving self-supervised monocular depth with self-supervised structural sharpening; isolated sign language recognition based on tree structure skeleton images; deep prototypical-parts ease morphological kidney stone identification and are competitively robust to photometric perturbations; wildlife image generation from scene graphs; towards characterizing the semantic robustness of face recognition; high-level context representation for emotion recognition in images; and mitigating catastrophic interference using unsupervised multi-part attention for RGB-IR face recognition.
ISBN: 9781665487399 (print)
The proceedings contain 561 papers. The topics discussed include: CORE: consistent representation learning for face forgery detection; ARIA: adversarially robust image attribution for content provenance; the reliability of forensic body-shape identification; detecting real-time deep-fake videos using active illumination; on the exploitation of deepfake model recognition; is synthetic voice detection research going into the right direction?; on improving cross-dataset generalization of deepfake detectors; rethinking adversarial examples in wargames; privacy leakage of adversarial training models in federated learning systems; towards comprehensive testing on the robustness of cooperative multi-agent reinforcement learning; robustness and adaptation to hidden factors of variation; adversarial robustness through the lens of convolutional filters; RODD: a self-supervised approach for robust out-of-distribution detection; an empirical study of data-free quantization’s tuning robustness; exploring robustness connection between artificial and natural adversarial examples; and adversarial machine learning attacks against video anomaly detection systems.
ISBN: 9781665469463 (print)
The proceedings contain 2072 papers. The topics discussed include: clipped hyperbolic classifiers are super-hyperbolic classifiers; efficient deep embedded subspace clustering; noise is also useful: negative correlation-steered latent contrastive learning; active learning for open-set annotation; understanding and increasing efficiency of Frank-Wolfe adversarial training; robust optimization as data augmentation for large-scale graphs; a re-balancing strategy for class-imbalanced classification based on instance difficulty; the devil is in the margin: margin-based label smoothing for network calibration; towards better plasticity-stability trade-off in incremental learning: a simple linear connector; learning Bayesian sparse networks with full experience replay for continual learning; a variational Bayesian method for similarity learning in non-rigid image registration; learning to learn by jointly optimizing neural architecture and weights; learning to prompt for continual learning; multi-frame self-supervised depth with transformers; and rethinking Bayesian deep learning methods for semi-supervised volumetric medical image segmentation.
The continuous expansion of neural network sizes is a notable trend in machine learning, with transformer models exceeding 20 billion parameters in computer vision. This growth comes with rising demands for computatio...
ISBN: 9798331536626 (print)
The proceedings contain 166 papers. The topics discussed include: applying computer vision to analyze self-injurious behaviors in children with autism spectrum disorder; underwater image enhancement and object detection: are poor object detection results on enhanced images due to missing human labels?; enhancing weakly-supervised object detection on static images through (hallucinated) motion; a zero-shot learning approach for ephemeral gully detection from remote sensing using vision language models; Attrivision: advancing generalization in pedestrian attribute recognition using CLIP; human gaze improves vision transformers by token masking; SSTAR: skeleton-based spatio-temporal action recognition for intelligent video surveillance and suicide prevention in metro stations; and offline signature verification in the banking domain.
Facial expression recognition (FER) plays a crucial role in domains such as healthcare and access security. Traditional models primarily use convolutional networks to extract features such as facial landmarks and the positions of facial features. However, these methods often produce feature maps with significant redundancy that contributes little to network performance. To address this limitation, we propose the DPConv module, which segments the channel dimension and applies two convolutional kernel sizes. This module replaces several convolutional blocks within the POSTER++ (Mao et al. in POSTER++: A Simpler and Stronger Facial Expression Recognition Network. arXiv:2301.12149, 2023) architecture, reducing the number of parameters while improving network efficiency and accuracy. Moreover, we propose a sliding window multi-head cross-self-attention mechanism, based on the sliding window multi-head self-attention mechanism (Liu et al. in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021). It replaces the conventional attention mechanism, facilitating the modeling of global dependencies and further improving the network's overall performance. Our model, DPPOSTER, was tested on the RAF-DB, FERPlus and SFEW datasets, and experimental comparisons were conducted with different combinations of convolution kernel sizes and channel segmentation ratios. The results showed that DPPOSTER achieved performance improvements of 0.59%, 0.37% and 2.32% over POSTER++ on the RAF-DB, FERPlus and SFEW datasets, respectively.
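To make the parameter-reduction claim in the abstract above concrete, here is a minimal arithmetic sketch of how splitting the channel dimension and giving each part its own kernel size can shrink a convolution's weight count. It is not the authors' DPConv implementation: the abstract does not state the split ratio, the kernel sizes, or how the branches are wired, so the function names, the 50/50 ratio, the 3x3 and 1x1 kernels, and the assumption that each branch maps only its own input channels to its own output channels are all hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a bias-free 2D convolution with a k x k kernel."""
    return c_in * c_out * k * k


def split_conv_params(c_in, c_out, ratio, k_a, k_b):
    """Weight count when channels are split into two independent branches.

    `ratio` (hypothetical) is the fraction of input and output channels
    routed to branch A, which uses a k_a x k_a kernel; the remaining
    channels go to branch B, which uses a k_b x k_b kernel.
    """
    in_a = round(c_in * ratio)
    out_a = round(c_out * ratio)
    in_b, out_b = c_in - in_a, c_out - out_a
    return conv_params(in_a, out_a, k_a) + conv_params(in_b, out_b, k_b)


# Assumed example sizes: 64 input channels, 64 output channels.
baseline = conv_params(64, 64, 3)             # one full 3x3 convolution
split = split_conv_params(64, 64, 0.5, 3, 1)  # half 3x3, half 1x1

print(baseline, split)
```

Under these assumed settings the split layer needs fewer weights than the single full convolution, because each branch only connects its own share of the channels. The actual saving in DPPOSTER depends on the kernel sizes and segmentation ratios the authors compared.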
Face recognition technology has dramatically transformed the landscape of security, surveillance, and authentication systems, offering a user-friendly and non-invasive biometric solution. However, despite its signifi...
ISBN: 9798331510831 (print)
The proceedings contain 929 papers. The topics discussed include: image adaptation for color vision deficient viewers using vision transformers; a regional-level resource-saving model for winter road surface snow detection in extreme weathers; beyond grids: exploring elastic input sampling for vision transformers; loose social-interaction recognition in real-world therapy scenarios; adversarial attention deficit: fooling deformable vision transformers with collaborative adversarial patches; enhancing scene graph generation with hierarchical relationships and commonsense knowledge; bandit-based attention mechanism in vision transformers; pre-capture privacy via adaptive single-pixel imaging; and context-aware outlier rejection for robust multi-view 3D tracking of similar small birds in an outdoor aviary.
Despite advancements in human motion generation models, their performance drops in infant motion generation due to limited data available and lack of 3D skeleton ground truth. To address this, we introduce the infant ...
Despite recent significant advancements in Handwritten Document Recognition (HDR), the efficient and accurate recognition of text against complex backgrounds, diverse handwriting styles, and varying document layouts r...