Search Results - Inner Mongolia University Library

IEEE Winter Applications and Computer Vision Workshops (WACVW)

Authors: Edoardo Bianchi; Oswald Lanz (Free University of Bozen-Bolzano, Piazza Università 1, Bolzano, Italy)

ISBN (digital): 9798331536626

ISBN (print): 9798331536633

This paper introduces Gate-Shift-Pose, an enhanced version of Gate-Shift-Fuse networks, designed for athlete fall classification in figure skating by integrating skeleton pose data alongside RGB frames. We evaluate two fusion strategies: early-fusion, which combines RGB frames with Gaussian heatmaps of pose keypoints at the input stage, and late-fusion, which employs a multi-stream architecture with attention mechanisms to combine RGB and pose features. Experiments on the FR-FS dataset demonstrate that Gate-Shift-Pose significantly outperforms the RGB-only baseline, improving accuracy by up to 40% with ResNet18 and 20% with ResNet50. Early-fusion achieves the highest accuracy (98.08%) with ResNet50, leveraging the model's capacity for effective multimodal integration, while late-fusion is better suited for lighter backbones like ResNet18. These results highlight the potential of multimodal architectures for sports action recognition and the critical role of skeleton pose information in capturing complex motion patterns.
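The early-fusion idea in the abstract above (RGB frames combined with Gaussian heatmaps of pose keypoints at the input stage) can be illustrated with a minimal sketch. The abstract does not give the heatmap width, the number of heatmap channels, or how the channels are stacked, so the function names `keypoint_heatmap` and `early_fuse`, the `sigma` parameter, and the single extra channel are all assumptions made for illustration, not the authors' implementation.

```python
import math


def keypoint_heatmap(height, width, keypoints, sigma=2.0):
    """Render one heatmap as the per-pixel maximum over Gaussians
    centred on each (x, y) keypoint. `sigma` is an assumed value."""
    heatmap = [[0.0] * width for _ in range(height)]
    for kx, ky in keypoints:
        for y in range(height):
            for x in range(width):
                dist_sq = (x - kx) ** 2 + (y - ky) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                if value > heatmap[y][x]:
                    heatmap[y][x] = value
    return heatmap


def early_fuse(rgb_frame, heatmap):
    """Append the heatmap as a fourth channel to every (r, g, b) pixel.
    Stacking into a single extra channel is an assumption of this sketch."""
    return [
        [tuple(pixel) + (heatmap[y][x],) for x, pixel in enumerate(row)]
        for y, row in enumerate(rgb_frame)
    ]


# Hypothetical 5x5 black frame with one keypoint at the centre.
frame = [[(0, 0, 0) for _ in range(5)] for _ in range(5)]
pose_map = keypoint_heatmap(5, 5, [(2, 2)], sigma=1.0)
fused = early_fuse(frame, pose_map)
```

In this sketch the fused frame has four values per pixel, so a network consuming it would need a first layer that accepts four input channels rather than three.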

Keywords: Heating systems; Accuracy; Conferences; Computer architecture; Logic gates; Skeleton; Multisensory integration; Pattern recognition; Sports; Residual neural networks

Pre-capture Privacy via Adaptive Single-Pixel Imaging

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Yoko Sogabe; Shiori Sugimoto; Ayumi Matsumoto; Masaki Kitahara (NTT Corporation, Japan)

ISBN (digital): 9798331510831

ISBN (print): 9798331510848

As cameras become ubiquitous in our living environment, invasion of privacy is becoming a significant concern. A common approach to privacy preservation is to remove personally identifiable information from a captured image, but there is a risk of the original image being leaked. In this paper, we propose a pre-capture privacy-aware imaging method that captures images from which the details of a pre-specified anonymized target have been eliminated. The proposed method applies a single-pixel imaging framework in which we introduce a feedback mechanism called an aperture pattern generator (APG). The introduced APG adaptively outputs the next aperture pattern to avoid sampling the anonymized target by using already acquired data as a clue. Furthermore, the anonymized target can be set to any object without changing hardware. Except for the removed detailed features of the anonymized target, the captured images are of comparable quality to those captured by a general camera and can be used for various computer vision applications. We target faces and license plates and experimentally show that the proposed method can capture clear images in which detailed features of the anonymized target are eliminated, achieving both privacy and utility.

Keywords: Privacy; Computer vision; Apertures; Cameras; Hardware; Generators; Sensors; License plate recognition; Faces; Identification of persons

Guest Editors' Introduction to the Special Section on Award Winning Papers from the IEEE CS Conference on Computer Vision and Pattern Recognition (CVPR)

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009, vol. 31, no. 12, pp. 2113-2114

Authors: Boyer, Kim; Shah, Mubarak; Syeda-Mahmood, Tanveer. Affiliations: Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12180, USA; Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816, USA; IBM Almaden Res Ctr, San Jose, CA 95120, USA

The three articles in this special section are selected papers from the IEEE CS Conference on Computer Vision and Pattern Recognition that was held in Anchorage, AK, in June 2008.

Keywords: Computer vision; Pattern recognition; Awards; Stereo image processing; Stereo vision; Image reconstruction; Computer Society; History; Surface reconstruction; Layout

Enhancing Predictive Imaging Biomarker Discovery Through Treatment Effect Analysis

2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025

Authors: Xiao, Shuhan; Klein, Lukas; Petersen, Jens; Vollmuth, Philipp; Jaeger, Paul F.; Maier-Hein, Klaus H. Affiliations: Heidelberg Division of Medical Image Computing Germany Heidelberg University Faculty of Mathematics and Computer Science Germany DKFZ Heidelberg Interactive Machine Learning Group Germany Institute for Machine Learning ETH Zürich Switzerland DKFZ Heidelberg Helmholtz Imaging Germany Germany University of Bonn Medical Faculty Bonn Germany Heidelberg University Hospital Pattern Analysis and Learning Group Department of Radiation Oncology Germany

ISBN (print): 9798331510831

Identifying predictive covariates, which forecast individual treatment effectiveness, is crucial for decision-making across different disciplines such as personalized medicine. These covariates, referred to as biomarkers, are extracted from pretreatment data, often within randomized controlled trials, and should be distinguished from prognostic biomarkers, which are independent of treatment assignment. Our study focuses on discovering predictive imaging biomarkers, specific image features, by leveraging pretreatment images to uncover new causal relationships. Unlike labor-intensive approaches relying on handcrafted features prone to bias, we present a novel task of directly learning predictive features from images. We propose an evaluation protocol to assess a model's ability to identify predictive imaging biomarkers and differentiate them from purely prognostic ones by employing statistical testing and a comprehensive analysis of image feature attribution. We explore the suitability of deep learning models originally developed for estimating the conditional average treatment effect (CATE) for this task, which have been assessed primarily for their precision of CATE estimation while overlooking the evaluation of imaging biomarker discovery. Our proof-of-concept analysis demonstrates the feasibility and potential of our approach in discovering and validating predictive imaging biomarkers from synthetic outcomes and real-world image datasets. Our code is available at https://***/MIC-DKFZ/predictive_image_biomarker_analysis. © 2025 IEEE.

Keywords: Deep learning

Special Editors' Introduction to the Special Issue on Award-Winning Papers from the IEEE Conference on Computer Vision and Pattern Recognition 2010 (CVPR 2010)

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, vol. 34, no. 9, pp. 1665-1666

Authors: Darrell, Trevor; Hogg, David; Jacobs, David. Affiliations: Univ Calif Berkeley, Div Comp Sci, Berkeley, CA 94704, USA; Int Comp Sci Inst, Berkeley, CA 94704, USA; Univ Leeds, Sch Comp, Leeds, W Yorkshire, England; Univ Maryland, Dept Comp Sci, College Pk, MD 20742, USA

The nine award-winning papers in this special section were presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010) that was held 13-18 June 2010 in San Francisco, CA.

Keywords: Special issues and sections; Meetings; Computer vision; Pattern recognition; Awards

Treading Towards Privacy-Preserving Table Structure Recognition

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Sachin Raja; Ajoy Mandal; C V Jawahar (IIIT Hyderabad)

ISBN (digital): 9798331510831

ISBN (print): 9798331510848

We present TabGuard, a privacy-preserving framework for an end-to-end secure Table Structure Recognition. TabGuard masks all the contents of the table locally and utilizes the masked table image for structure recognition. Our method is simple yet effective for detecting table cells while preserving the inherent table alignment characteristics to reconstruct tables. Our approach benefits from inductive bias, expressed through an approximated table grid which helps alleviate challenges in the detection of cells that are small or have extreme aspect ratios. Experimental results demonstrate that our solution not only establishes a new state-of-the-art on several benchmark datasets but also effectively addresses long-standing challenges associated with dense tables having complex layouts. We make our code publicly available at https://***/sachinraja13/TabGuard.

Keywords: Privacy; Computer vision; Image recognition; Codes; Layout; Government; Benchmark testing; Image reconstruction

Modality-Specific Strategies for Medical Image Segmentation Using Lightweight SAM Architectures

International Challenge on Segment Anything in Medical Images on Laptop, held in conjunction with the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024

Authors: Dao, Thuy; Ye, Xincheng; Scarsbrook, Joshua; Balarupan, Gowrienanthan; Ribeiro, Fernanda L.; Bollmann, Steffen. Affiliations: School of Electrical Engineering and Computer Science, University of Queensland, Brisbane, Australia; Queensland Digital Health Centre, University of Queensland, Brisbane, Australia

ISBN (print): 9783031818530

Medical image segmentation tasks are often intricate and require medical domain expertise. Recent advancements in deep learning have expedited these demanding tasks, transitioning from specialized models tailored to each task to versatile foundation models capable of accommodating various image modalities. However, many of these foundation models are optimized for GPU computation, necessitating significant computational resources and constraining their practical utility in clinical settings. Furthermore, their variable accuracy across modalities and novel domains undermines their reliability in clinical practice. To address these limitations, we undertake a comparative investigation into deploying medical image segmentation models on CPU, focusing on accuracy and runtime efficiency, as part of the "CVPR 2024: Segment Anything In Medical Images On Laptop" challenge. Our methodology employs different models customized for each modality, including pre-trained EfficientViT-SAM and LiteMedSAM to yield the most precise and efficient outcomes. Additionally, to bolster model performance for datasets featuring small regions of interest, such as PET scans, we integrate a majority voting mechanism. We optimize runtime using the OpenVINO format within a C++ inference script. This approach improves inference runtime while maintaining competitive accuracy, achieving an average DSC score of 0.86 on the validation set and 0.75 on the testing set with an average runtime of 4.61 s on the testing set. Notably, given that most modalities are evaluated in a zero-shot manner, our findings suggest that the zero-shot capability of foundation models can be further refined through dataset-specific inference strategies. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
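Two quantities named in the abstract above, the Dice similarity coefficient (DSC) and majority voting over several predicted masks, can be sketched as follows. The abstract does not say how the votes are collected or how ties are handled, so the flat binary-mask representation, the strict-majority rule, and the function names `dice_coefficient` and `majority_vote` are assumptions chosen for illustration rather than the authors' pipeline.

```python
def dice_coefficient(pred, target):
    """DSC = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks (0/1 values).
    Returns 1.0 when both masks are empty, a convention assumed here."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def majority_vote(masks):
    """Mark a pixel as foreground when strictly more than half of the
    candidate masks mark it (ties count as background in this sketch)."""
    count = len(masks)
    return [1 if 2 * sum(votes) > count else 0 for votes in zip(*masks)]


# Three hypothetical predictions for a four-pixel image.
candidates = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
consensus = majority_vote(candidates)
score = dice_coefficient(consensus, [1, 1, 0, 0])
```

Only the first pixel is marked by more than half of the candidates, so the consensus mask keeps that pixel alone, which shows how voting suppresses isolated false positives in small regions of interest.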

Keywords: C++ (programming language)

Enriching Local Patterns with Multi-Token Attention for Broad-Sight Neural Networks

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Hankyul Kang; Jongbin Ryu (Ajou University)

ISBN (digital): 9798331510831

ISBN (print): 9798331510848

In neural networks, recognizing visual patterns is challenging because global average pooling disregards local patterns and solely relies on over-concentrated activation. Global average pooling forces the network to learn objects regardless of their location, so features tend to be activated only in specific regions. To support this claim, we provide a novel analysis of the problems that over-concentration brings about in networks with extensive experiments. We analyze the over-concentration through problems arising from feature variance and dead neurons that are not activated. Based on our analysis, we introduce a multi-token attention pooling layer to alleviate the over-concentration problem. Our attention-pooling layer captures broad-sight local patterns by learning multiple tokens with the proposed distillation algorithm. It resolves the high bias and high variance errors of learned multi-tokens, which is crucial when aggregating local patterns with multi-tokens. Our method applies to various vision tasks and network architectures such as CNN, ViT, and MLP-Mixer. The proposed method improves baselines with few extra resources, and a network employing our pooling method works favorably against state-of-the-art networks. We open-source the code at https://***/Lab-LVM/imagenet-models.

Keywords: Visualization; Computer vision; Codes; Neurons; Object detection; Network architecture; Pattern recognition; Object recognition; Biological neural networks

Learning Visual-Semantic Hierarchical Attribute Space for Interpretable Open-Set Recognition

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Zhuo Xu; Xiang Xiang (National Key Lab of Multi-Spectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China)

ISBN (digital): 9798331510831

ISBN (print): 9798331510848

In the field of open-set recognition, conventional models often focus on addressing challenges within a single hierarchical category, and these methods frequently lack interpretability. In this paper, we propose a novel solution that utilizes attributes and hierarchical relationships to achieve interpretable open-set recognition. Our method is centered around the visual-semantic attribute space. By leveraging hierarchy division, we can decompose the attributes into more granular components, thereby yielding additional performance improvements. When confronted with an unfamiliar object, our method not only classifies it as an unknown category but also provides insights into the broader category and its associated attributes. This capability enhances interpretability by offering valuable information regarding the potential category and characteristics of the object. Experimental results demonstrate great performance improvements compared to existing methods.

Keywords: Computer vision; Computational modeling

Guest Editors' Introduction to the Special Section on Award-Winning Papers from the IEEE Conference on Computer Vision and Pattern Recognition 2009 (CVPR 2009)

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, vol. 33, no. 12, pp. 2339-2340

Authors: Essa, Irfan; Kang, Sing Bing; Pollefeys, Marc. Affiliations: Georgia Inst Technol, Sch Interact Comp, Atlanta, GA 30332, USA; Microsoft Corp, Redmond, WA 98052, USA; ETH, Dept Comp Sci, CH-8092 Zurich, Switzerland
