ISBN: 9798350301298 (print)
The proceedings contain 2356 papers. The topics discussed include: exploring discontinuity for video frame interpolation; two-view geometry scoring without correspondences; language-guided audio-visual source separation via trimodal consistency; handwritten text generation from visual archetypes; Bayesian posterior approximation with stochastic ensembles; ERM-KTP: knowledge-level machine unlearning via knowledge transfer; PlenVDB: memory efficient VDB-based radiance fields for fast training and rendering; learning and aggregating lane graphs for urban automated driving; teaching matters: investigating the role of supervision in vision transformers; NeuralField-LDM: scene generation with hierarchical latent diffusion models; cut and learn for unsupervised object detection and instance segmentation; probabilistic debiasing of scene graphs; and unifying layout generation with a decoupled diffusion model.
ISBN: 9783031818530 (print)
The proceedings contain 16 papers. The special focus in this conference is on Segment Anything in Medical Images on Laptop. The topics include: Filters, Thresholds, and Geodesic Distances for Scribble-Based Interactive Segmentation of Medical Images; Rep-MedSAM: Towards Real-Time and Universal Medical Image Segmentation; Swin-LiteMedSAM: A Lightweight Box-Based Segment Anything Model for Large-Scale Medical Image Datasets; A Light-Weight Universal Medical Segmentation Network for Laptops Based on Knowledge Distillation; Taking a Step Back: Revisiting Classical Approaches for Efficient Interactive Segmentation of Medical Images; ExpertsMedSAM: Faster Medical Image Segment Anything with Mixture-of-Experts; Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment; Lite Class-Prompt Tiny-VIT for Multi-modality Medical Image Segmentation; Segment Anything in Medical Images with nnUNet; SwiftMedSAM: An Ultra-lightweight Prompt-Based Universal Medical Image Segmentation Model for Highly Constrained Environments; RepViT-MedSAM: Efficient Segment Anything in the Medical Images; U-MedSAM: Uncertainty-Aware MedSAM for Medical Image Segmentation; Modality-Specific Strategies for Medical Image Segmentation Using Lightweight SAM Architectures; Gray’s Anatomy for Segment Anything Model: Optimizing Grayscale Medical Images for Fast and Lightweight Segmentation.
ISBN: 9781728171685 (print)
The proceedings contain 2 papers. The topics discussed include: attention mechanism exploits temporal contexts: real-time 3D human pose reconstruction; and cascaded deep monocular 3D human pose estimation with evolutionary training data.
ISBN: 9798350353006 (print)
The proceedings contain 2715 papers. The topics discussed include: revisiting adversarial training at scale; SPIDeRS: structured polarization for invisible depth and reflectance sensing; MA-LMM: memory-augmented large multimodal model for long-term video understanding; geometrically-driven aggregation for zero-shot 3D point cloud understanding; TextCraftor: your text encoder can be image quality controller; ViLa-MIL: dual-scale vision-language multiple instance learning for whole slide image classification; HumanNorm: learning normal diffusion model for high-quality and realistic 3D human generation; an empirical study of scaling law for scene text recognition; improving image restoration through removing degradations in textual representations; and steganographic passport: an owner and user verifiable credential for deep model IP protection without retraining.
ISBN: 9781665445092 (print)
The proceedings contain 1658 papers. The topics discussed include: single-stage instance shadow detection with bidirectional relation learning; learning Delaunay surface elements for mesh reconstruction; fusing the old with the new: learning relative camera pose with geometry-guided uncertainty; uncertainty guided collaborative training for weakly supervised temporal action detection; privacy-preserving collaborative learning with automatic transformation search; rethinking and improving the robustness of image style transfer; style-aware normalized loss for improving arbitrary style transfer; faster meta update strategy for noise-robust deep learning; a hyperbolic-to-hyperbolic graph convolutional network; training networks in null space of feature covariance for continual learning; and exponential moving average normalization for self-supervised and semi-supervised learning.
ISBN: 9781728132938 (print)
The proceedings contain 1294 papers. The topics discussed include: finding task-relevant features for few-shot learning by category traversal; edge-labeling graph neural network for few-shot learning; generating classification weights with GNN denoising autoencoders for few-shot learning; kervolutional neural networks; why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem; on the structural sensitivity of deep convolutional networks to the directions of Fourier basis functions; hardness-aware deep metric learning; Auto-DeepLab: hierarchical neural architecture search for semantic image segmentation; striking the right balance with uncertainty; and SDRSAC: semidefinite-based randomized approach for robust point cloud registration without correspondences.
ISBN: 9781665469463 (print)
The proceedings contain 2072 papers. The topics discussed include: clipped hyperbolic classifiers are super-hyperbolic classifiers; efficient deep embedded subspace clustering; noise is also useful: negative correlation-steered latent contrastive learning; active learning for open-set annotation; understanding and increasing efficiency of Frank-Wolfe adversarial training; robust optimization as data augmentation for large-scale graphs; a re-balancing strategy for class-imbalanced classification based on instance difficulty; the devil is in the margin: margin-based label smoothing for network calibration; towards better plasticity-stability trade-off in incremental learning: a simple linear connector; learning Bayesian sparse networks with full experience replay for continual learning; a variational Bayesian method for similarity learning in non-rigid image registration; learning to learn by jointly optimizing neural architecture and weights; learning to prompt for continual learning; multi-frame self-supervised depth with transformers; and rethinking Bayesian deep learning methods for semi-supervised volumetric medical image segmentation.
ISBN: 9798350302493 (print)
The proceedings contain 698 papers. The topics discussed include: learning unbiased classifiers from biased data with meta-learning; robustness against gradient based attacks through cost effective network fine-tuning; gradient attention balance network: mitigating face recognition racial bias via gradient attention; estimating and maximizing mutual information for knowledge distillation; synthetic sample selection for generalized zero-shot learning; training strategies for vision transformers for object detection; does image anonymization impact computer vision training?; ultra-sonic sensor based object detection for autonomous vehicles; improvements to image reconstruction-based performance prediction for semantic segmentation in highly automated driving; zero-shot classification at different levels of granularity; difficulty estimation with action scores for computer vision tasks; detail-preserving self-supervised monocular depth with self-supervised structural sharpening; isolated sign language recognition based on tree structure skeleton images; deep prototypical-parts ease morphological kidney stone identification and are competitively robust to photometric perturbations; wildlife image generation from scene graphs; towards characterizing the semantic robustness of face recognition; high-level context representation for emotion recognition in images; and mitigating catastrophic interference using unsupervised multi-part attention for RGB-IR face recognition.
Facial expression recognition (FER) plays a crucial role in domains such as healthcare and access security. Traditional models primarily use convolutional networks to extract features such as facial landmarks and the positions of facial features. However, these methods often produce feature maps with significant redundancy that contribute little to network performance. To address this limitation, we propose the DPConv module, which segments the channel dimension and applies two convolutional kernel sizes. This module replaces several convolutional blocks within the POSTER++ architecture (Mao et al. in POSTER++: A Simpler and Stronger Facial Expression Recognition Network. arXiv:2301.12149, 2023), reducing the number of parameters while improving network efficiency and accuracy. Moreover, we propose a sliding window multi-head cross-self-attention mechanism, based on the sliding window multi-head self-attention mechanism (Liu et al. in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021), to replace the conventional attention mechanism; it facilitates the modeling of global dependencies and further improves the network's overall performance. Our model, DPPOSTER, was tested on the RAF-DB, FERPlus and SFEW datasets, and experimental comparisons were conducted with different combinations of convolution kernel sizes and channel segmentation ratios. The results showed that DPPOSTER achieved performance improvements of 0.59%, 0.37% and 2.32% over POSTER++ on the RAF-DB, FERPlus and SFEW datasets, respectively.
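The abstract does not specify how DPConv is built, so the following is only a rough arithmetic sketch of why splitting the channel dimension can reduce parameters. It assumes, hypothetically, that each channel group is convolved independently with its own kernel size and that biases are ignored; the function names, the 64-channel layer, the 0.5 split ratio and the 3/5 kernel sizes are all illustrative choices, not values from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def split_conv_params(channels, ratio, k_a, k_b):
    """Weight count when the channels are split into two groups and each
    group is convolved on its own with a different kernel size.
    This is an assumed layout, not the published DPConv design."""
    c_a = int(channels * ratio)   # channels in the first group
    c_b = channels - c_a          # remaining channels in the second group
    return conv_params(c_a, c_a, k_a) + conv_params(c_b, c_b, k_b)


# A single 3x3 convolution over all 64 channels.
baseline = conv_params(64, 64, 3)

# The same 64 channels split evenly: a 3x3 branch and a 5x5 branch.
split = split_conv_params(64, 0.5, 3, 5)

print(baseline, split)
```

Under these assumptions the split layer has fewer weights than the single 3x3 layer even though one branch uses a larger kernel, because each branch only mixes half of the channels.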
Medical Image Foundation Models have proven to be powerful tools for mask prediction across various datasets. However, accurately assessing the uncertainty of their predictions remains a significant challenge. To addr...