Search Results - Inner Mongolia University Library

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

ISBN: (print) 9798350302493

The proceedings contain 698 papers. The topics discussed include: learning unbiased classifiers from biased data with meta-learning;robustness against gradient based attacks through cost effective network fine-tuning;gradient attention balance network: mitigating face recognition racial bias via gradient attention;estimating and maximizing mutual information for knowledge distillation;synthetic sample selection for generalized zero-shot learning;training strategies for vision transformers for object detection;does image anonymization impact computer vision training?;ultra-sonic sensor based object detection for autonomous vehicles;improvements to image reconstruction-based performance prediction for semantic segmentation in highly automated driving;zero-shot classification at different levels of granularity;difficulty estimation with action scores for computer vision tasks;detail-preserving self-supervised monocular depth with self-supervised structural sharpening;isolated sign language recognition based on tree structure skeleton images;deep prototypical-parts ease morphological kidney stone identification and are competitively robust to photometric perturbations;wildlife image generation from scene graphs;towards characterizing the semantic robustness of face recognition;high-level context representation for emotion recognition in images;and mitigating catastrophic interference using unsupervised multi-part attention for RGB-IR face recognition.

Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022

ISBN: (print) 9781665487399

The proceedings contain 561 papers. The topics discussed include: CORE: consistent representation learning for face forgery detection;aria: adversarially robust image attribution for content provenance;the reliability of forensic body-shape identification;detecting real-time deep-fake videos using active illumination;on the exploitation of deepfake model recognition;is synthetic voice detection research going into the right direction?;on improving cross-dataset generalization of deepfake detectors;rethinking adversarial examples in wargames;privacy leakage of adversarial training models in federated learning systems;towards comprehensive testing on the robustness of cooperative multi-agent reinforcement learning;robustness and adaptation to hidden factors of variation;adversarial robustness through the lens of convolutional filters;RODD: a self-supervised approach for robust out-of-distribution detection;an empirical study of data-free quantization’s tuning robustness;exploring robustness connection between artificial and natural adversarial examples;and adversarial machine learning attacks against video anomaly detection systems.

Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022

ISBN: (print) 9781665469463

The proceedings contain 2072 papers. The topics discussed include: clipped hyperbolic classifiers are super-hyperbolic classifiers;efficient deep embedded subspace clustering;noise is also useful: negative correlation-steered latent contrastive learning;active learning for open-set annotation;understanding and increasing efficiency of Frank-Wolfe adversarial training;robust optimization as data augmentation for large-scale graphs;a re-balancing strategy for class-imbalanced classification based on instance difficulty;the devil is in the margin: margin-based label smoothing for network calibration;towards better plasticity-stability trade-off in incremental learning: a simple linear connector;learning Bayesian sparse networks with full experience replay for continual learning;a variational Bayesian method for similarity learning in non-rigid image registration;learning to learn by jointly optimizing neural architecture and weights;learning to prompt for continual learning;multi-frame self-supervised depth with transformers;and rethinking Bayesian deep learning methods for semi-supervised volumetric medical image segmentation.

Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition

2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025

Authors: Huttner, Lukas; Mayr, Martin; Gorges, Thomas; Wu, Fei; Seuret, Mathias; Maier, Andreas; Christlein, Vincent. Friedrich-Alexander Universität Erlangen-Nürnberg, Pattern Recognition Lab, Erlangen 91058, Germany

ISBN: (print) 9798331536626

The continuous expansion of neural network sizes is a notable trend in machine learning, with transformer models exceeding 20 billion parameters in computer vision. This growth comes with rising demands for computational resources and large-scale datasets. Efficient techniques for transfer learning thus become an attractive option in setups with limited data, as in handwriting recognition. Recently, parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and weight-decomposed low-rank adaptation (DoRA), have gained widespread interest. In this paper, we explore tradeoffs in parameter-efficient transfer learning using the synthetically pretrained Transformer-Based Optical Character Recognition (TrOCR) model for handwritten text recognition with LoRA and DoRA. Additionally, we analyze the performance of full fine-tuning with a limited number of samples, scaling from a few-shot learning scenario up to using the whole dataset. We conduct experiments on the popular IAM Handwriting database as well as the historical READ 2016 dataset. We find that (a) LoRA/DoRA does not outperform full fine-tuning, contrary to a recent paper, and (b) LoRA/DoRA is not substantially faster than full fine-tuning of TrOCR. © 2025 IEEE.
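
For readers unfamiliar with the low-rank adaptation mentioned in this abstract, the following is a minimal sketch of the general LoRA idea, not the authors' implementation. It assumes the standard formulation in which a frozen weight matrix W is adjusted by a scaled low-rank product, W + (alpha / r) * B A. The function names, layer sizes, and rank are hypothetical and are not taken from the paper; DoRA is not covered.

```python
# Illustrative sketch of low-rank adaptation (LoRA); all names and sizes are
# hypothetical and chosen only for demonstration.

def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA: B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)


def lora_effective_weight(W, B, A, alpha: float, rank: int):
    """Return W + (alpha / rank) * (B @ A), with matrices given as nested lists.

    W stays frozen during adaptation; only B and A would be trained.
    """
    scale = alpha / rank
    rows, cols = len(W), len(W[0])
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


if __name__ == "__main__":
    # Assumed example: one 768 x 768 projection layer adapted with rank 8.
    print(full_finetune_params(768, 768))
    print(lora_params(768, 768, 8))
```

The sketch only compares trainable-parameter counts for a single layer. It says nothing about wall-clock training time, which the abstract reports is not substantially lower for LoRA/DoRA on TrOCR.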

Keywords: Optical character recognition

Proceedings - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025

2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025

ISBN: (print) 9798331536626

The proceedings contain 166 papers. The topics discussed include: applying computer vision to analyze self-injurious behaviors in children with autism spectrum disorder;underwater image enhancement and object detection: are poor object detection results on enhanced images due to missing human labels?;enhancing weakly-supervised object detection on static images through (hallucinated) motion;a zero-shot learning approach for ephemeral gully detection from remote sensing using vision language models;Attrivision: advancing generalization in pedestrian attribute recognition using CLIP;human gaze improves vision transformers by token masking;SSTAR: skeleton-based spatio-temporal action recognition for intelligent video surveillance and suicide prevention in metro stations;and offline signature verification in the banking domain.

Expression recognition method based on feature redundancy optimization

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, Vol. 19, No. 4, pp. 1-11

Authors: Shao, Dangguo; Zhuang, Luwei; Ma, Lei; Yi, Sanli. Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming, Peoples R China; Yunnan Key Lab Comp Technol Applicat, Kunming, Peoples R China

Facial expression recognition (FER) plays a crucial role in domains such as healthcare and access security. Traditional models primarily utilize convolutional networks to extract features like facial landmarks and positions of facial features. However, these methods often result in feature maps with significant redundancy, contributing minimally to network performance enhancement. To address this limitation, we propose the DPConv module, which innovatively segments the channel dimension and applies dual convolutional kernel sizes. This module replaces several convolutional blocks within the POSTER++ (Mao et al. in POSTER++: A Simpler and Stronger Facial Expression Recognition Network. arXiv:2301.12149, 2023) architecture, leading to a reduction in parameters while simultaneously enhancing network efficiency and accuracy. Moreover, we propose a sliding window multi-head cross-self-attention mechanism, based on the sliding window multi-head self-attention mechanism (Liu et al. in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021). It substitutes the conventional attention mechanism, facilitating the modeling of global dependencies and further optimizing the network's overall performance. Our model, DPPOSTER, was tested on the RAF-DB, FERPlus and SFEW datasets, and experimental comparisons were conducted with different combinations of convolution kernel sizes and channel segmentation ratios. The results showed that DPPOSTER achieved performance improvements of 0.59%, 0.37% and 2.32% over POSTER++ on the RAF-DB, FERPlus and SFEW datasets, respectively.

Keywords: Deep learning; Facial expression recognition; Computer vision; Feature redundancy; Attention mechanism

Unified Face Matching and Physical-Digital Spoofing Attack Detection

2025 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW 2025

Authors: Kunwar, Arun; Rattani, Ajita. University of North Texas, Dept. of Computer Science and Engineering, Denton, United States

ISBN: (print) 9798331536626

Face recognition technology has dramatically transformed the landscape of security, surveillance, and authentication systems, offering a user-friendly and non-invasive biometric solution. However, despite its significant advantages, face recognition systems face increasing threats from physical and digital spoofing attacks. Current research typically treats face recognition and attack detection as distinct classification challenges. This approach necessitates the implementation of separate models for each task, leading to considerable computational complexity, particularly on devices with limited resources. Such inefficiencies can stifle scalability and hinder performance. In response to these challenges, this paper introduces an innovative unified model designed for face recognition and detection of physical and digital attacks. By leveraging the advanced Swin Transformer backbone and incorporating HiLo attention in a convolutional neural network framework, we address unified face recognition and spoof attack detection more effectively. Moreover, we introduce augmentation techniques that replicate the traits of physical and digital spoofing cues, significantly enhancing our model's robustness. Through comprehensive experimental evaluation across various datasets, we showcase the effectiveness of our model in unified face recognition and spoof detection. Additionally, we confirm its resilience against unseen physical and digital spoofing attacks, underscoring its potential for real-world applications. © 2025 IEEE.

Keywords: digital attacks; face matching; physical attacks; unified models

Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

ISBN: (print) 9798331510831

The proceedings contain 929 papers. The topics discussed include: image adaptation for color vision deficient viewers using vision transformers;a regional-level resource-saving model for winter road surface snow detection in extreme weathers;beyond grids: exploring elastic input sampling for vision transformers;loose social-interaction recognition in real-world therapy scenarios;adversarial attention deficit: fooling deformable vision transformers with collaborative adversarial patches;enhancing scene graph generation with hierarchical relationships and commonsense knowledge;bandit-based attention mechanism in vision transformers;pre-capture privacy via adaptive single-pixel imaging;and context-aware outlier rejection for robust multi-view 3d tracking of similar small birds in an outdoor aviary.

Infant Action Generative Modeling

2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

Authors: Huang, Xiaofei; Hatamimajoumerd, Elaheh; Mathew, Amal; Ostadabbas, Sarah. Department of Electrical and Computer Engineering, Boston, MA, United States

ISBN: (print) 9798331510831

Despite advancements in human motion generation models, their performance drops in infant motion generation due to the limited data available and the lack of 3D skeleton ground truth. To address this, we introduce the infant action generation and classification (InfAGenC) pipeline, which combines a transformer-based variational autoencoder (VAE) with a spatial-temporal graph convolutional network (STGCN) to create synthetic infant action samples. By iterative refinement of the generative model with diverse and accurate data, we improve the realism of synthetic data, leading to more precise infant action recognition models. Our results show significant improvements in action recognition performance on real-world data, demonstrating that synthetic data can enhance small training datasets and advance infant action recognition. Our pipeline increases action recognition accuracy up to 88.58% on the infant action dataset and up to 98% on an adult action dataset. The InfAGenC code and our infant skeletal data are available at https://***/ostadabbas/Infant-Action-Generative-Modeling. © 2025 IEEE.

Keywords: Image coding

DocTTT: Test-Time Training for Handwritten Document Recognition Using Meta-Auxiliary Learning

2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

Authors: Gu, Wenhao; Gu, Li; Wang, Ziqiang; Suen, Ching Yee; Wang, Yang. Concordia University, Department of Computer Science and Software Engineering, Canada

ISBN: (print) 9798331510831

Despite recent significant advancements in Handwritten Document Recognition (HDR), the efficient and accurate recognition of text against complex backgrounds, diverse handwriting styles, and varying document layouts remains a practical challenge. Moreover, this issue is seldom addressed in academic research, particularly in scenarios with minimal annotated data available. In this paper, we introduce the DocTTT framework to address these challenges. The key innovation of our approach is that it uses test-time training to adapt the model to each specific input during testing. We propose a novel Meta-Auxiliary learning approach that combines meta-learning and a self-supervised Masked Autoencoder (MAE). During testing, we adapt the visual representation parameters using a self-supervised MAE loss. During training, we learn the model parameters using a meta-learning framework, so that they can adapt to a new input effectively. Experimental results show that our proposed method significantly outperforms existing state-of-the-art approaches on benchmark datasets. © 2025 IEEE.

Keywords: Self-supervised learning
