ISBN (print): 9798350365474
Federated Learning (FL) enables multiple machines to collaboratively train a machine learning model without sharing private training data. Yet, especially for heterogeneous models, a key bottleneck remains transferring the knowledge gained by each client model to the server. One popular method, FedDF, tackles this task with distillation over a common, shared dataset on which predictions are exchanged. However, in many contexts such a dataset may be difficult to acquire due to privacy concerns, and the clients may not allow storage of a large shared dataset. To this end, in this paper we introduce a new method that improves this knowledge distillation scheme to rely on only a single image shared between the clients and the server. In particular, we propose a novel adaptive dataset pruning algorithm that selects the most informative crops generated from that single image. With this, we show that federated learning with distillation under a limited shared-dataset budget works better with a single image than with multiple individual ones. Finally, we extend our approach to training heterogeneous client architectures by incorporating a non-uniform distillation schedule and client-model mirroring on the server side.
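As a concrete illustration of the single-image pruning idea, the sketch below generates candidate crops from one shared image and keeps those on which a client ensemble is most informative. The crop generator, the entropy-based informativeness score, and all function names are assumptions made for illustration; the abstract does not specify the paper's exact pruning criterion.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

def generate_crops(image, n_crops=512, size=224):
    """Sample random crops from a single shared image (hypothetical helper)."""
    cropper = transforms.RandomResizedCrop(size, scale=(0.05, 0.6))
    return torch.stack([cropper(image) for _ in range(n_crops)])

@torch.no_grad()
def prune_crops(crops, client_models, budget=128):
    """Keep the `budget` crops on which the client ensemble is most informative.

    Informativeness is scored here as the entropy of the averaged client
    predictions (an assumption; the paper's criterion may differ).
    """
    probs = torch.stack([F.softmax(m(crops), dim=-1) for m in client_models]).mean(0)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    keep = entropy.topk(budget).indices
    return crops[keep]
```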
ISBN (print): 9798350365474
The detection and recognition of distracted driving behaviors has emerged as a new vision task with the rapid development of computer vision, and is considered a challenging temporal action localization (TAL) problem. The primary goal of temporal localization is to determine the start and end times of actions in untrimmed videos. Currently, most state-of-the-art temporal localization methods adopt complex architectures, which are cumbersome and time-consuming. In this paper, we propose a robust and efficient two-stage framework for distracted-behavior classification and localization based on the sliding-window approach, suitable for untrimmed naturalistic driving videos. To address the high similarity among different behaviors and the interference from background classes, we propose a multi-view fusion and adaptive thresholding algorithm that effectively reduces missed detections. To address fuzzy behavior-boundary localization, we design a post-processing procedure that refines coarse localization into fine localization through post-connection and candidate-behavior merging criteria. In the AICITY2024 Task3 TestA, our method performs well, achieving an Average Intersection over Union (AIOU) of 0.6080 and ranking eighth in AICITY2024 Task3. Our code will be released in the near future.
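A minimal sketch of the sliding-window pipeline described above: per-window class scores from multiple camera views are fused by averaging, an adaptive per-class threshold suppresses background interference, and consecutive surviving windows of the same class are merged into segments. The relative-threshold rule and merging criteria are assumptions; the paper's exact post-processing may differ.

```python
import numpy as np

def localize(view_probs, win_starts, win_len, rel_thresh=0.5):
    """Coarse-to-fine localization sketch (details assumed, not the paper's exact rules).

    view_probs: list of (n_windows, n_classes) score arrays, one per camera view.
    Returns [start, end, class] segments.
    """
    probs = np.mean(view_probs, axis=0)        # multi-view fusion
    thresh = rel_thresh * probs.max(axis=0)    # adaptive, per-class threshold
    labels = probs.argmax(axis=1)
    keep = probs.max(axis=1) >= thresh[labels]

    segments, cur = [], None
    for i, start in enumerate(win_starts):
        if keep[i]:
            if cur and cur[2] == labels[i] and start <= cur[1]:  # overlap: extend
                cur[1] = start + win_len
            else:
                if cur:
                    segments.append(cur)
                cur = [start, start + win_len, labels[i]]
        elif cur:
            segments.append(cur)
            cur = None
    if cur:
        segments.append(cur)
    return segments
```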
ISBN (print): 9798350365474
In this paper, we introduce an approach for recognizing and classifying gestures that accompany mathematical terms, in a new collection we name the "GAMT" dataset. Our method uses language as a means of providing context to classify gestures. Specifically, we use a CLIP-style framework to construct a shared embedding space for gestures and language, experimenting with various methods for encoding gestures within this space. We evaluate our method on our new dataset, which contains a wide array of gestures associated with mathematical terms. The shared embedding space leads to a substantial improvement in gesture classification. Furthermore, we identify an efficient model that excels at classifying gestures from our unique dataset, contributing to the further development of gesture recognition in diverse interaction scenarios.
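The "CLIP-style framework" referenced above is commonly implemented as a symmetric contrastive (InfoNCE) loss over matched pairs; a minimal sketch, with the gesture and text encoders left abstract:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(gesture_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over matched gesture/text pairs (standard CLIP recipe;
    the encoders producing these embeddings are left abstract)."""
    g = F.normalize(gesture_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = g @ t.T / temperature                   # (B, B) similarity matrix
    targets = torch.arange(len(g), device=g.device)  # i-th gesture <-> i-th text
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```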
ISBN (print): 9798350365474
Compound Expression Recognition (CER), a sub-field of affective computing, is a novel task in intelligent human-computer interaction and multimodal user interfaces. We propose a novel audio-visual method for CER. Our method relies on emotion recognition models that fuse modalities at the emotion-probability level, while decisions regarding the prediction of compound expressions are based on the pair-wise sum of weighted emotion probability distributions. Notably, our method does not use any training data specific to the target task, so the problem is a zero-shot classification task. The method is evaluated in multi-corpus training and cross-corpus validation setups. We achieve F1 scores of 32.15% and 25.56% on the AffWild2 and C-EXPR-DB test subsets, respectively, without training on the target corpus or target task. Our method is therefore on par with methods trained on the target corpus or target task. The source code is publicly available at https://***/AVCER/.
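The zero-shot decision rule described above (pair-wise sums of weighted basic-emotion probabilities) can be sketched as follows; the variable names and the candidate-pair list are illustrative, not taken from the paper:

```python
import numpy as np

def predict_compound(emotion_probs, weights, compound_pairs):
    """Zero-shot compound prediction sketch: score each candidate emotion pair
    by the sum of its weighted basic-emotion probabilities.

    emotion_probs:  (n_emotions,) fused probabilities from audio-visual models.
    weights:        (n_emotions,) per-emotion weights.
    compound_pairs: list of (i, j) index pairs defining compound expressions.
    """
    w = weights * emotion_probs
    scores = [w[i] + w[j] for i, j in compound_pairs]
    return compound_pairs[int(np.argmax(scores))]
```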
ISBN (print): 9798350365474
Padel is a rapidly growing racquet sport that has gained global popularity due to its accessibility and exciting gameplay dynamics. Effective coordination between teammates hinges on maintaining an appropriate distance, allowing for seamless transitions between offensive and defensive maneuvers. Balanced inter-player and player-to-net distances not only facilitate efficient communication but also enhance the team's ability to exploit openings in the opponent's defense while minimizing vulnerabilities. We introduce a new open dataset of padel rallies with annotations for hits and player-ball interactions, a predictive model for detecting hits from audio signals, a re-identification algorithm for pose tracking, and a framework for calculating inter-player and player-net distances during rallies. Our predictive model achieves an average F1-score of 92% for hit detection, demonstrating robust performance across different match conditions. Furthermore, we develop a system for accurately assigning hits to individual players, achieving an overall accuracy of 83.70% for player-specific assignment and 86.83% for team-based assignment.
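A minimal sketch of the distance-computation step, assuming player positions have already been projected onto the court plane (e.g., via a homography from tracked keypoints, a detail the abstract does not specify); the coordinate convention and net position are assumptions:

```python
import numpy as np

NET_Y = 10.0  # net line in court coordinates (metres); assumed convention

def rally_distances(p1_xy, p2_xy):
    """Per-frame inter-player and player-to-net distances for one team.

    p1_xy, p2_xy: (T, 2) court-plane trajectories of the two teammates.
    """
    inter = np.linalg.norm(p1_xy - p2_xy, axis=1)  # teammate separation
    net1 = np.abs(p1_xy[:, 1] - NET_Y)             # player 1 to net
    net2 = np.abs(p2_xy[:, 1] - NET_Y)             # player 2 to net
    return inter, net1, net2
```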
ISBN (print): 9798350365474
Image copy detection is one of the pivotal tools for safeguarding online information integrity. The challenge lies in determining whether a query image is an edited copy, which necessitates identifying candidate source images through a retrieval process. This process requires discriminative features comprising both global descriptors, designed to be augmentation-invariant, and local descriptors that capture salient foreground objects, in order to assess whether a query image is an edited copy of some source reference image. This work describes an end-to-end solution that leverages a vision Transformer model to learn such discriminative features and perform implicit matching between the query image and the reference image. Experimental results on two benchmark datasets demonstrate that the proposed solution outperforms state-of-the-art methods. Case studies illustrate the effectiveness of our approach in matching reference images from which the query images have been copy-edited.
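In its simplest form, the retrieval stage described above reduces to nearest-neighbor search over augmentation-invariant global descriptors; a generic sketch follows (the paper's ViT additionally performs implicit query-reference matching, which is not reproduced here):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def retrieve_candidates(query_feats, ref_feats, k=10):
    """Candidate retrieval for copy detection via cosine similarity between
    global descriptors (generic sketch, not the paper's full pipeline)."""
    q = F.normalize(query_feats, dim=-1)
    r = F.normalize(ref_feats, dim=-1)
    sims = q @ r.T                       # (n_query, n_ref) similarity matrix
    scores, idx = sims.topk(k, dim=-1)   # top-k reference candidates per query
    return scores, idx
```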
ISBN (print): 9798350365474
Handwritten Document Recognition (HDR) has emerged as a challenging task integrating text and layout-information recognition to tackle manuscripts end-to-end. Despite advancements, the computational efficiency of processing entire documents remains a critical challenge, limiting the practical applicability of these models. This paper presents the Document Attention Network for Computationally Efficient Recognition (DANCER). The model differs from existing approaches through its unique encoder-decoder structure, where the encoder reduces spatial redundancy and enhances spatial attention, and the decoder, comprising transformer layers, efficiently decodes the text using optimized attention operations. This design results in a fast, memory-efficient model capable of effectively transcribing and understanding complex manuscript layouts. We evaluate DANCER's efficacy on the ICFHR 2016 READ competition dataset, focusing on recognizing single- and double-page historical documents. We demonstrate that DANCER can triple the training batch size compared to prior models within the same memory limits and reduce memory usage by up to 65% without compromising recognition quality. The proposed approach sets new standards in efficiency and accuracy for HDR solutions, paving the way for practical and scalable applications in diverse contexts.
ISBN (print): 9798350365474
By using few-shot data and labels, prompt learning obtains optimal prompts capable of achieving high performance on downstream tasks. Existing prompt learning methods generate high-quality prompts that are suitable for downstream tasks but tend to perform poorly when only very limited data (e.g., one shot) is available. We address this challenging one-shot scenario and propose a novel architecture for prompt learning, called the Image-Text Feature Alignment Branch (ITFAB). ITFAB pulls text features closer to the centroids of image features and separates text features of different classes to resolve misalignment in the feature space, thereby facilitating the acquisition of high-quality prompts with very limited data. In the one-shot setting, our method outperforms the existing CoOp and CoCoOp methods and in some cases even surpasses CoCoOp's 16-shot performance. Tests on different datasets and domains show that ITFAB almost matches CoCoOp's effectiveness. It also works with current prompt learning methods such as MapLe and PromptSRC, improving their performance in the one-shot setting.
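A hedged sketch of the alignment objective the abstract describes: pull each class's text feature toward the centroid of its image features and push text features of different classes apart. The specific loss form and the margin are assumptions; the paper's exact ITFAB formulation may differ.

```python
import torch
import torch.nn.functional as F

def alignment_loss(text_feats, image_feats, image_labels, margin=0.5):
    """Centroid-alignment sketch (assumed loss form, not the paper's exact ITFAB).

    text_feats:  (C, D) one prompt-derived feature per class.
    image_feats: (N, D) few-shot image features with labels in [0, C).
    """
    C = text_feats.size(0)
    centroids = torch.stack(
        [image_feats[image_labels == c].mean(0) for c in range(C)])
    attract = (1 - F.cosine_similarity(text_feats, centroids)).mean()

    t = F.normalize(text_feats, dim=-1)
    sim = t @ t.T
    off_diag = sim - torch.eye(C, device=sim.device)  # zero out self-similarity
    repel = F.relu(off_diag - margin).mean()          # penalize close class pairs
    return attract + repel
```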
ISBN (print): 9798350365474
Conversational facial expression recognition entails challenges such as handling facial dynamics, small available datasets, low-intensity and fine-grained emotional expressions, and extreme face angles. To address these challenges, we propose Masking Action Units and Reconstructing multiple Angles (MAURA) pre-training. MAURA is an efficient self-supervised method that permits the use of small datasets while preserving end-to-end conversational facial expression recognition with a vision Transformer. MAURA masks videos at the locations of active Action Units and reconstructs synchronized multi-view videos, thus learning the dependencies between muscle movements and encoding information that might only be visible in a few frames and/or in certain views. Based on one view (e.g., frontal), the encoder reconstructs other views (e.g., top, down, laterals). Such a masking-and-reconstruction strategy provides a powerful representation that is beneficial in facial-expression downstream tasks. Our experimental analysis shows that we consistently outperform the state of the art in the challenging settings of low-intensity and fine-grained conversational facial expression recognition on four datasets, including in-the-wild DFEW, CMU-MOSEI, MFA, and multi-view MEAD. Our results suggest that MAURA learns robust and generic video representations.
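A minimal sketch of AU-guided masking, assuming per-frame Action Unit heatmaps are available; the sampling rule (mask patches in proportion to AU activity) is an assumption, as the abstract only states that masking follows active AU locations:

```python
import torch

def au_guided_mask(au_heatmap, n_patches=14, mask_ratio=0.75):
    """Choose patch-grid cells to mask, biased toward active AU regions.

    au_heatmap: (H, W) Action Unit activation map for a frame.
    Returns indices into the n_patches*n_patches grid to mask.
    """
    # Pool the heatmap onto the patch grid and sample masks proportional
    # to AU activity (sampling rule is an assumption).
    grid = torch.nn.functional.adaptive_avg_pool2d(
        au_heatmap[None, None], n_patches).flatten()
    n_mask = int(mask_ratio * grid.numel())
    probs = (grid + 1e-6) / (grid + 1e-6).sum()
    return torch.multinomial(probs, n_mask, replacement=False)
```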
ISBN (print): 9798350365474
Face Image Quality Assessment (FIQA) estimates the utility of face images for automated face recognition (FR) systems. In this work, we propose a novel approach to assessing the quality of face images based on inspecting the changes required in the pre-trained FR model weights to minimize differences between testing samples and the distribution of the FR training dataset. To achieve that, we quantify the discrepancy in Batch Normalization statistics (BNS), including mean and variance, between those recorded during FR training and those obtained by processing testing samples through the pre-trained FR model. We then generate gradient magnitudes of the pre-trained FR weights by backpropagating the BNS discrepancy through the pre-trained model. The cumulative absolute sum of these gradient magnitudes serves as the face image quality score in our approach. Through comprehensive experimentation, we demonstrate the effectiveness of our training-free and quality-labeling-free approach, achieving performance competitive with recent state-of-the-art FIQA approaches without relying on quality labeling, training regression networks, specialized architectures, or designing and optimizing specific loss functions.
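Since the approach is training-free, it can be sketched compactly in PyTorch: compute the discrepancy between each BatchNorm layer's stored statistics and the statistics of the test sample's features, backpropagate it, and sum the absolute weight gradients. The L2 discrepancy form below is an assumption; the paper may weight or normalize the terms differently.

```python
import torch

def bns_quality_score(model, image):
    """Training-free FIQ sketch: BatchNorm-statistics discrepancy is
    backpropagated to the weights, and the cumulative absolute gradient
    magnitude serves as the quality score (L2 discrepancy is assumed)."""
    model.eval()        # use stored running statistics in forward passes
    model.zero_grad()
    feats, hooks = {}, []

    def hook(mod, inp, out):
        x = inp[0]      # features entering this BN layer
        mu = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        feats[mod] = ((mu - mod.running_mean) ** 2 +
                      (var - mod.running_var) ** 2).sum()

    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))

    model(image.unsqueeze(0))                       # single test sample
    loss = torch.stack(list(feats.values())).sum()  # total BNS discrepancy
    loss.backward()
    for h in hooks:
        h.remove()
    return sum(p.grad.abs().sum().item() for p in model.parameters()
               if p.grad is not None)
```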