ISBN: 9798350365474 (print)
The proceedings contain 802 papers. The topics discussed include: X-VARS: introducing explainability in football refereeing with multi-modal large language models; a hybrid ANN-SNN architecture for low-power and low-latency visual perception; pseudo-label based unsupervised fine-tuning of a monocular 3D pose estimation model for sports motions; towards efficient audio-visual learners via empowering pre-trained vision transformers with cross-modal adaptation; a dual-mode approach for vision-based navigation in a lunar landing scenario; class similarity transition: decoupling class similarities and imbalance from generalized few-shot segmentation; ReweightOOD: loss reweighting for distance-based OOD detection; Hinge-Wasserstein: estimating multimodal aleatoric uncertainty in regression tasks; and ConPro: learning severity representation for medical images using contrastive learning and preference optimization.
ISBN: 9798350302493 (print)
The proceedings contain 698 papers. The topics discussed include: learning unbiased classifiers from biased data with meta-learning; robustness against gradient based attacks through cost effective network fine-tuning; gradient attention balance network: mitigating face recognition racial bias via gradient attention; estimating and maximizing mutual information for knowledge distillation; synthetic sample selection for generalized zero-shot learning; training strategies for vision transformers for object detection; does image anonymization impact computer vision training?; ultra-sonic sensor based object detection for autonomous vehicles; improvements to image reconstruction-based performance prediction for semantic segmentation in highly automated driving; zero-shot classification at different levels of granularity; difficulty estimation with action scores for computer vision tasks; detail-preserving self-supervised monocular depth with self-supervised structural sharpening; isolated sign language recognition based on tree structure skeleton images; deep prototypical-parts ease morphological kidney stone identification and are competitively robust to photometric perturbations; wildlife image generation from scene graphs; towards characterizing the semantic robustness of face recognition; high-level context representation for emotion recognition in images; and mitigating catastrophic interference using unsupervised multi-part attention for RGB-IR face recognition.
ISBN: 9781665487399 (print)
The proceedings contain 561 papers. The topics discussed include: CORE: consistent representation learning for face forgery detection; aria: adversarially robust image attribution for content provenance; the reliability of forensic body-shape identification; detecting real-time deep-fake videos using active illumination; on the exploitation of deepfake model recognition; is synthetic voice detection research going into the right direction?; on improving cross-dataset generalization of deepfake detectors; rethinking adversarial examples in wargames; privacy leakage of adversarial training models in federated learning systems; towards comprehensive testing on the robustness of cooperative multi-agent reinforcement learning; robustness and adaptation to hidden factors of variation; adversarial robustness through the lens of convolutional filters; RODD: a self-supervised approach for robust out-of-distribution detection; an empirical study of data-free quantization’s tuning robustness; exploring robustness connection between artificial and natural adversarial examples; and adversarial machine learning attacks against video anomaly detection systems.
ISBN: 9781665448994 (print)
The proceedings contain 516 papers. The topics discussed include: OmniLayout: room layout reconstruction from indoor spherical panoramas; boosting adversarial robustness using feature level stochastic smoothing; beyond joint demosaicking and denoising: an image processing pipeline for a pixel-bin image sensor; assessment of deep learning based blood pressure prediction from PPG and rPPG signals; towards domain-specific explainable AI: model interpretation of a skin image classifier using a human approach; DAMSL: domain agnostic meta score-based learning; deep learning based spatial-temporal in-loop filtering for versatile video coding; automated tackle injury risk assessment in contact-based sports - a rugby union example; two-stage network for single image super-resolution; and ***: dataset for automatic mapping of buildings, woodlands, water and roads from aerial imagery.
ISBN: 9781728193601 (print)
The proceedings contain 523 papers. The topics discussed include: latent fingerprint image enhancement based on progressive generative adversarial network; zero-shot learning in the presence of hierarchically coarsened labels; multivariate confidence calibration for object detection; context-guided super-class inference for zero-shot detection; learning sparse ternary neural networks with entropy-constrained trained ternarization (EC2T); now that i can see, i can improve: enabling data-driven finetuning of CNNs on the edge; enhancing facial data diversity with style-based face aging; a simplified framework for zero-shot cross-modal sketch data retrieval; unsupervised single image super-resolution network (USISResNet) for real-world data using generative adversarial network; cross-regional oil palm tree detection; and leaf spot attention network for apple leaf disease identification.
ISBN: 9781728125060 (print)
The proceedings contain 355 papers. The topics discussed include: MultiNet++: multi-stream feature aggregation and geometric loss strategy for multi-task learning; privacy-preserving action recognition using coded aperture videos; evading face recognition via partial tampering of faces; privacy-preserving annotation of face images through attribute-preserving face synthesis; towards deep neural network training on encrypted data; fooling automated surveillance cameras: adversarial patches to attack person detection; anonymousnet: natural face de-identification with measurable privacy; regularizer to mitigate gradient masking effect during single-step adversarial training; privacy preserving group membership verification and identification; defending against adversarial attacks using random forest; intersection to overpass: instance segmentation on filamentous structures with an orientation-aware neural network and terminus pairing algorithm; and surface parameterization and registration for statistical multiscale atlasing of organ development.
ISBN: 9798350365474 (print)
Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, similar to established methodologies for images and text. To this end, we propose a simple framework that exploits pre-trained models for NeRF representations alongside multimodal models for text and image processing. Our framework learns a bidirectional mapping between NeRF embeddings and those obtained from corresponding images and text. This mapping unlocks several novel and useful applications, including NeRF zero-shot classification and NeRF retrieval from images or text.
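The zero-shot classification application mentioned in this abstract can be illustrated with a minimal sketch. It assumes a NeRF embedding has already been mapped into the same space as the text embeddings; the abstract does not specify the mapping network, the embedding dimensions, or the similarity measure, so the cosine similarity, the function names, and the toy vectors below are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(mapped_nerf_embedding, class_text_embeddings):
    """Pick the class whose text embedding is closest to the mapped NeRF embedding.

    Both arguments are assumed to live in the same embedding space;
    producing them (NeRF encoder, learned mapping, text encoder) is out of scope here.
    """
    scores = {
        label: cosine_similarity(mapped_nerf_embedding, embedding)
        for label, embedding in class_text_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy 3-dimensional embeddings, invented purely for illustration.
class_text_embeddings = {
    "chair": [0.9, 0.1, 0.0],
    "car": [0.0, 0.2, 0.9],
}
mapped_nerf_embedding = [0.8, 0.3, 0.1]

label, scores = zero_shot_classify(mapped_nerf_embedding, class_text_embeddings)
print(label, scores)
```

Retrieval from images or text would follow the same pattern in the other direction: rank stored NeRF embeddings by similarity to a mapped query embedding.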
ISBN: 9798350365474 (print)
Low-rank adaptation (LoRA) and its variants are widely employed in fine-tuning large models, including large language models for natural language processing and diffusion models for computer vision. This paper proposes a generalized framework called SuperLoRA that unifies and extends different LoRA variants, which can be realized under different hyper-parameter settings. Introducing new options with grouping, folding, shuffling, projection, and tensor decomposition, SuperLoRA offers high flexibility and demonstrates superior performance, with up to 10-fold gain in parameter efficiency for transfer learning tasks.
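For readers unfamiliar with the baseline that SuperLoRA generalizes, the following is a minimal sketch of plain LoRA only, not of SuperLoRA or its grouping, folding, shuffling, projection, or tensor-decomposition options. It uses the standard formulation in which a frozen weight W is adapted as W + (alpha / r) * B A, with B of shape d_out x r and A of shape r x d_in. The helper names and toy matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank (rows of A)."""
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def param_counts(d_out, d_in, r):
    """Trainable parameters for full fine-tuning versus a rank-r LoRA update."""
    return d_out * d_in, r * (d_out + d_in)


# Toy example: a 2x3 frozen weight with a rank-1 update.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]      # d_out x r
a = [[0.5, 0.0, 1.0]]   # r x d_in
adapted = lora_effective_weight(w, b, a, alpha=1.0)
print(adapted)

full, lora = param_counts(1024, 1024, 8)
print(full, lora)
```

The parameter count shows why low-rank updates are attractive: only the small factors B and A are trained while W stays frozen.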
ISBN: 9798350365474 (print)
Neuromorphic cameras feature asynchronous event-based pixel-level processing and are particularly useful for object tracking in dynamic environments. Current approaches for feature extraction and optical flow with high-performing hybrid RGB-events vision systems require large computational models and supervised learning, which impose challenges for embedded vision and require annotated datasets. In this work, we propose ED-DCFNet, a small and efficient (< 72k) unsupervised multidomain learning framework, which extracts events-frames shared features without requiring annotations, with comparable performance. Furthermore, we introduce an open-sourced event and frame-based dataset that captures indoor scenes with various lighting and motion-type conditions in realistic scenarios, which can be used for model building and evaluation. The dataset is available at https://***/NBELab/UnsupervisedTracking.
ISBN: 9798350365474 (print)
Multi-camera tracking (MCT) plays a crucial role in various computer vision applications. However, accurate tracking of individuals across multiple cameras faces challenges, particularly with identity switches. In this paper, we present an efficient online MCT system that tackles these challenges through online processing. Our system leverages memory-efficient accumulated appearance features to provide stable representations of individuals across cameras and time. By incorporating trajectory validation using hierarchical agglomerative clustering (HAC) in overlapping regions, ID transfers are identified and rectified. Evaluation on the 2024 AI City Challenge Track 1 dataset [39] demonstrates the competitive performance of our system, achieving accurate tracking in both overlapping and non-overlapping camera networks. With a 40.3% HOTA score [29], our system ranked 9th in the challenge. The integration of trajectory validation enhances performance by 8% over the baseline, and the accumulated appearance features further contribute to a 17% improvement.
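As a rough illustration of the clustering step named in this abstract, the sketch below implements generic hierarchical agglomerative clustering over per-trajectory feature vectors. The abstract does not state the linkage criterion, distance measure, stopping rule, or feature layout, so the average linkage, Euclidean distance, distance threshold, and toy 2-D points are assumptions made only for this example and should not be read as the authors' configuration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def hac(points, threshold):
    """Merge the closest pair of clusters until none is within the threshold.

    Returns a list of clusters, each a sorted list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break  # remaining clusters are too far apart to merge
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical features for four trajectories: two near the origin, two near (5, 5).
trajectory_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
groups = hac(trajectory_features, threshold=1.0)
print(groups)
```

In a tracking setting, trajectories that fall into the same group would be candidates for sharing one identity, while a trajectory grouped apart from its assigned ID would be flagged for correction.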