检索结果-内蒙古大学图书馆

您好，读者！请登录

内蒙古大学图书馆

首页
概况
党建
资源
服务
科研支持
- 论文收录引用证明
- 科技查新
知识产权
档案馆
帮助

咨询与建议

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

您的常用邮箱：*

您的手机号码：*

问题描述：

当前已输入0个字，您还可以输入200个字

全部搜索
期刊论文
图书
学位论文
标准
纸本馆藏
外文资源发现
数据库导航
超星发现

高级检索

分类表

所选分类

>> <<

限定检索结果

标题

标题
作者
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

作者

作者
标题
主题词
出版物名称
出版社
机构
学科分类号
摘要
ISBN
ISSN
基金资助
索书号

文献类型

12,844 篇 会议
13 篇 期刊文献
2 册 图书

馆藏范围

12,859 篇 电子文献
0 种 纸本馆藏

日期分布

学科分类号

7,573 篇 工学
- 6,863 篇 计算机科学与技术...
- 880 篇 机械工程
- 814 篇 软件工程
- 435 篇 控制科学与工程
- 360 篇 光学工程
- 306 篇 电气工程
- 209 篇 仪器科学与技术
- 124 篇 信息与通信工程
- 91 篇 生物工程
- 62 篇 生物医学工程（可授...
- 39 篇 电子科学与技术（可...
- 34 篇 安全科学与工程
- 26 篇 化学工程与技术
- 21 篇 交通运输工程
- 20 篇 建筑学
- 18 篇 土木工程
2,957 篇 医学
- 2,956 篇 临床医学
- 15 篇 基础医学(可授医学...
- 12 篇 药学(可授医学、理...
700 篇 理学
- 359 篇 物理学
- 225 篇 数学
- 175 篇 系统科学
- 95 篇 统计学（可授理学、...
- 93 篇 生物学
- 22 篇 化学
201 篇 艺术学
- 201 篇 设计学（可授艺术学...
84 篇 管理学
- 59 篇 图书情报与档案管...
- 25 篇 管理科学与工程(可...
- 14 篇 工商管理
23 篇 法学
- 21 篇 社会学
5 篇 农学
4 篇 教育学
2 篇 经济学
1 篇 军事学

主题

6,464 篇 computer vision
2,688 篇 training
2,437 篇 pattern recognit...
1,780 篇 computational mo...
1,522 篇 visualization
1,348 篇 three-dimensiona...
1,091 篇 computer archite...
1,063 篇 semantics
997 篇 benchmark testin...
976 篇 codes
970 篇 conferences
854 篇 feature extracti...
830 篇 cameras
771 篇 task analysis
707 篇 deep learning
646 篇 image segmentati...
611 篇 object detection
595 篇 shape
554 篇 transformers
538 篇 neural networks

机构

132 篇 univ sci & techn...
122 篇 carnegie mellon ...
120 篇 tsinghua univ pe...
114 篇 univ chinese aca...
113 篇 chinese univ hon...
94 篇 tsinghua univers...
91 篇 zhejiang univ pe...
91 篇 swiss fed inst t...
85 篇 peng cheng lab p...
81 篇 university of ch...
80 篇 zhejiang univers...
77 篇 shanghai ai lab ...
77 篇 peng cheng labor...
75 篇 university of sc...
69 篇 shanghai jiao to...
68 篇 shanghai jiao to...
67 篇 alibaba grp peop...
67 篇 stanford univ st...
66 篇 univ hong kong p...
64 篇 sensetime res pe...

作者

77 篇 timofte radu
63 篇 van gool luc
45 篇 zhang lei
36 篇 yang yi
36 篇 luc van gool
34 篇 tao dacheng
31 篇 loy chen change
29 篇 chen chen
28 篇 sun jian
28 篇 qi tian
25 篇 li xin
24 篇 liu yang
24 篇 tian qi
24 篇 ying shan
23 篇 wang xinchao
23 篇 zha zheng-jun
23 篇 boxin shi
21 篇 zhou jie
21 篇 vasconcelos nuno
20 篇 luo ping

语言

12,856 篇 英文
2 篇 其他
1 篇 中文

检索条件"任意字段=IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops"

共 12859 条记录，以下是4421-4430 订阅

全选清除本页清除全部题录导出标记到"检索档案"

详细简洁

排序：

相关度排序

相关度排序
时效性降序
时效性升序

Language-guided Multi-modal Emotional Mimicry Intensity Estimation

Language-guided Multi-modal Emotional Mimicry Intensity Esti...

引用

ieee computer Society conference on computer vision and pattern recognition workshops (CVPRW)

作者： Feng Qiu Wei Zhang Chen Liu Lincheng Li Heming Du Tianchen Guo Xin Yu Netease Fuxi AI Lab The University of Queensland

ISBN: (数字)9798350365474

ISBN: (纸本)9798350365481

Emotional Mimicry Intensity (EMI) estimation aims to identify the intensity of mimicry exhibited by individuals in response to observed emotions. The challenge in EMI estimation lies in discerning nuanced facial expression cues on mimicry behaviors based on the seed video and the text instructions. In this paper, we propose a multi-modal EMI estimation framework by leveraging visual, auditory, and textual modalities to capture a comprehensive emotional profile. We first extract representations for each modality separately and then fuse the modality-specific representations via a Temporal Segment Network, optimizing for temporal coherence and emotional context. Furthermore, we find that participants demonstrate notable proficiency in mimicking text instructions, yet exhibit less effectiveness in replicating facial expressions and vocal tones. In light of this, we design a contrastive learning mechanism to refine the extracted feature based on textual guidance. By doing so, features derived from similar text instructions are closely aligned, enhancing the estimation of emotional mimicry intensity by leveraging the dominant textual modality. Experiments conducted on the Hume-Vidmimic2 dataset illustrate the effectiveness of our framework in EMI estimation. Our framework is recognized as the leading solution in the Emotional Mimicry Intensity (EMI) Estimation Challenge at the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). More information for the Competition can be found in: 6th ABAW.

关键词： Visualization Emotion recognition computer vision Fuses conferences Electromagnetic interference Estimation

来源：评论

学校读者我要写书评

暂无评论

FACESEC: A Fine-grained Robustness Evaluation Framework for Face recognition Systems

FACESEC: A Fine-grained Robustness Evaluation Framework for ...

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Tong, Liang Chen, Zhengzhang Ni, Jingchao Cheng, Wei Song, Dongjin Chen, Haifeng Vorobeychik, Yevgeniy Washington Univ St Louis MO 14263 USA NEC Labs Amer Princeton NJ 08540 USA Univ Connecticut Storrs CT USA

ISBN: (纸本)9781665445092

We present FACESEC, a framework for fine-grained robustness evaluation of face recognition systems. FACESEC evaluation is performed along four dimensions of adversarial modeling: the nature of perturbation (e.g., pixel-level or face accessories), the attacker's system knowledge (about training data and learning architecture), goals (dodging or impersonation), and capability (tailored to individual inputs or across sets of these). We use FACESEC to study five face recognition systems in both closed-set and open-set settings, and to evaluate the state-of-the-art approach for defending against physically realizable attacks on these. We find that accurate knowledge of neural architecture is significantly more important than knowledge of the training data in black-box attacks. Moreover, we observe that open-set face recognition systems are more vulnerable than closed-set systems under different types of attacks. The efficacy of attacks for other threat model variations, however, appears highly dependent on both the nature of perturbation and the neural network architecture. For example, attacks that involve adversarial face masks are usually more potent, even against adversarially trained models, and the ArcFace architecture tends to be more robust than the others.

关键词： computer vision Systematics Face recognition Perturbation methods Neural networks Training data computer architecture

来源：评论

学校读者我要写书评

暂无评论

The Blessings of Unlabeled Background in Untrimmed Videos

The Blessings of Unlabeled Background in Untrimmed Videos

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Liu, Yuan Chen, Jingyuan Chen, Zhenfang Deng, Bing Huang, Jianqiang Zhang, Hanwang Alibaba Grp Hangzhou Peoples R China Univ Hong Kong Hong Kong Peoples R China Nanyang Technol Univ Singapore Singapore

ISBN: (纸本)9781665445092

Weakly-supervised Temporal Action Localization (WTAL) aims to detect the action segments with only video-level action labels in training. The key challenge is how to distinguish the action of interest segments from the background, which is unlabelled even on the video-level. While previous works treat the background as "curses", we consider it as "blessings". Specifically, we first use causal analysis to point out that the common localization errors are due to the unobserved confounder that resides ubiquitously in visual recognition. Then, we propose a Temporal Smoothing PCA-based (TS-PCA) deconfounder, which exploits the unlabelled background to model an observed substitute for the unobserved confounder, to remove the confounding effect. Note that the proposed deconfounder is model-agnostic and non-intrusive, and hence can be applied in any WTAL method without model re-designs. Through extensive experiments on four state-of-the-art WTAL methods, we show that the deconfounder can improve all of them on the public datasets: THUMOS-14 and ActivityNet-1.3(1).

关键词： Location awareness Training Visualization computer vision Smoothing methods Computational modeling pattern recognition

来源：评论

学校读者我要写书评

暂无评论

PointPrompt: A Multi-modal Prompting Dataset for Segment Anything Model

PointPrompt: A Multi-modal Prompting Dataset for Segment Any...

引用

ieee computer Society conference on computer vision and pattern recognition workshops (CVPRW)

作者： Jorge Quesada Mohammad Alotaibi Mohit Prabhushankar Ghassan AlRegib OLIVES Lab Georgia Institute of Technology Atlanta GA USA

ISBN: (数字)9798350365474

ISBN: (纸本)9798350365481

The capabilities of foundation models, most recently the Segment Anything Model, have gathered a large degree of attention for providing a versatile framework for tackling a wide array of image segmentation tasks. However, the interplay between human prompting strategies and the segmentation performance of these models remains understudied, as does the role played by the domain knowledge that humans (by previous exposure) and models (by pretraining) bring to the prompting process. To bridge this gap, we present the PointPrompt dataset compiled across multiple image modalities as well as multiple prompting annotators per modality. We collected a total of 16 image datasets from the natural, underwater, medical and seismic domain in order to create a comprehensive resource to facilitate the study of prompting behavior and agreement across modalities. Overall, our prompting dataset contains 158880 inclusion points and 52594 exclusion points over a total of 6000 images. Our analysis highlights the following: (i) viability of prompts across heterogeneous data, (ii) that point prompts are a valuable resource in the effort for enhancing the robustness and generalizability of segmentation models across diverse domains, (iii) prompts facilitate an understanding of the dynamics between annotation strategies and neural network outcomes. Information on downloading the dataset, images, and prompting tool is provided on our project website https://***/pointprompt/.

关键词： Bridges Image segmentation computer vision Computational modeling conferences Neural networks Dynamic scheduling

来源：评论

学校读者我要写书评

暂无评论

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To...

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Changpinyo, Soravit Sharma, Piyush Ding, Nan Soricut, Radu Google Res Mountain View CA 94043 USA

ISBN: (纸本)9781665445092

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training. However, these datasets are often collected with overrestrictive requirements inherited from their original target tasks (e.g., image caption generation), which limit the resulting dataset scale and diversity. We take a step further in pushing the limits of vision-and-language pretraining data by relaxing the data collection pipeline used in Conceptual Captions 3M (CC3M) [54] and introduce the Conceptual 12M (CC12M), a dataset with 12 million image-text pairs specifically meant to be used for vision-and-language pre-training. We perform an analysis of this dataset and benchmark its effectiveness against CC3M on multiple downstream tasks with an emphasis on long-tail visual recognition. Our results clearly illustrate the benefit of scaling up pre-training data for vision-and-language tasks, as indicated by the new state-of-the-art results on both the neaps and Conceptual Captions benchmarks.(1)

关键词： Visualization computer vision Image recognition Pipelines Benchmark testing Data collection Knowledge discovery

来源：评论

学校读者我要写书评

暂无评论

LQF: Linear Quadratic Fine-Tuning

LQF: Linear Quadratic Fine-Tuning

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Achille, Alessandro Golatkar, Aditya Ravichandran, Avinash Polito, Marzia Soatto, Stefano Amazon Web Serv Seattle WA 98109 USA Univ Calif Los Angeles Los Angeles CA 90024 USA

ISBN: (纸本)9781665445092

Classifiers that are linear in their parameters, and trained by optimizing a convex loss function, have predictable behavior with respect to changes in the training data, initial conditions, and optimization. Such desirable properties are absent in deep neural networks (DNNs), typically trained by non-linear fine-tuning of a pre-trained model. Previous attempts to linearize DNNs have led to interesting theoretical insights, but have not impacted the practice due to the substantial performance gap compared to standard non-linear optimization. We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning on most of real-world image classification tasks tested, thus enjoying the interpretability of linear models without incurring punishing losses in performance. LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification: Leaky-ReLU instead of ReLU, mean squared loss instead of cross-entropy, and pre-conditioning using Kronecker factorization. None of these changes in isolation is sufficient to approach the performance of non-linear fine-tuning. When used in combination, they allow us to reach comparable performance, and even superior in the low-data regime, while enjoying the simplicity, robustness and interpretability of linear-quadratic optimization.

关键词： Deep learning computer vision Training data computer architecture Robustness pattern recognition Task analysis

来源：评论

学校读者我要写书评

暂无评论

LFNAT 2023 Challenge on Light Field Depth Estimation: Methods and Results

LFNAT 2023 Challenge on Light Field Depth Estimation: Method...

引用

ieee computer Society conference on computer vision and pattern recognition workshops (CVPRW)

作者： Hao Sheng Yebin Liu Jingyi Yu Gaochang Wu Wei Xiong Ruixuan Cong Rongshan Chen Longzhao Guo Yanlin Xie Shuo Zhang Song Chang Youfang Lin Wentao Chao Xuechun Wang Guanghui Wang Fuqing Duan Tun Wang Da Yang Zhenglong Cui Sizhe Wang Mingyuan Zhao Qiong Wang Qianyu Chen Zhengyu Liang Yingqian Wang Jungang Yang Xueting Yang Junli Deng LFNAT 2023 Challenge State Key Laboratory of Virtual Reality Technology and Systems School of Computer Science and Engineering Beihang University and Beihang Hangzhou Innovation Institute Yuhang

This paper reviews the 1st LFNAT challenge on light field depth estimation, which aims at predicting disparity information of central view image in a light field (i.e., pixel offset between central view image and adjacent view image). Compared to multi-view stereo matching, light field depth estimation emphasizes efficient utilization of the 2D angular information from multiple regularly varying views. This challenge specifies UrbanLF [20] light field dataset as the sole data source. There are two phases in total: submission phase and final evaluation phase, in which 75 registered participants successfully submit their predicted results in the first phase and 7 eligible teams compete in the second phase. The performance of all submissions is carefully reviewed and shown in this paper as a new standard for the current state-of-the-art in light field depth estimation. Moreover, the implementation details of these methods are also provided to stimulate related advanced research.

关键词：

来源：评论

学校读者我要写书评

暂无评论

PatchMatch-Based Neighborhood Consensus for Semantic Correspondence

PatchMatch-Based Neighborhood Consensus for Semantic Corresp...

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Lee, Jae Yong DeGol, Joseph Fragoso, Victor Sinha, Sudipta N. Univ Illinois Chicago IL 60680 USA Microsoft Washington DC USA

ISBN: (纸本)9781665445092

We address estimating dense correspondences between two images depicting different but semantically related scenes. End-to-end trainable deep neural networks incorporating neighborhood consensus cues are currently the best methods for this task. However, these architectures require exhaustive matching and 4D convolutions over matching costs for all pairs of feature map pixels. This makes them computationally expensive. We present a more efficient neighborhood consensus approach based on Patch-Match. For higher accuracy, we propose to use a learned local 4D scoring function for evaluating candidates during the PatchMatch iterations. We have devised an approach to jointly train the scoring function and the feature extraction modules by embedding them into a proxy model which is end-to-end differentiable. The modules are trained in a supervised setting using a cross-entropy loss to directly incorporate sparse keypoint supervision. Our evaluation on PF- PASCAL and SPAIR-71K shows that our method significantly outperforms the state-of-the-art on both datasets while also being faster and using less memory.

关键词： Deep learning computer vision Costs Computational modeling Semantics Memory management Feature extraction

来源：评论

学校读者我要写书评

暂无评论

Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

Fashion IQ: A New Dataset Towards Retrieving Images by Natur...

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Wu, Hui Gao, Yupeng Guo, Xiaoxiao Al-Halah, Ziad Rennie, Steven Grauman, Kristen Feris, Rogerio MIT IBM Watson AI Lab Cambridge MA 02142 USA IBM Res Armonk NY 10504 USA UT Austin Austin TX USA Pryon New York NY USA

ISBN: (纸本)9781665445092

Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces. In this paper, we introduce the Fashion IQ dataset to support and advance research on interactive fashion image retrieval. Fashion IQ is the first fashion dataset to provide human-generated captions that distinguish similar pairs of garment images together with side-information consisting of real-world product descriptions and derived visual attribute labels for these images. We provide a detailed analysis of the characteristics of the Fashion IQ data, and present a transformer-based user simulator and interactive image retriever that can seamlessly integrate visual attributes with image features, user feedback, and dialog history, leading to improved performance over the state of the art in dialogbased image retrieval. We believe that our dataset will encourage further work on developing more natural and realworld applicable conversational shopping assistants.(1)

关键词： Visualization computer vision Image retrieval Natural languages Clothing Buildings Ontologies

来源：评论

学校读者我要写书评

暂无评论

Variational Prototype Learning for Deep Face recognition

Variational Prototype Learning for Deep Face Recognition

引用

ieee/cvf conference on computer vision and pattern recognition (CVPR)

作者： Deng, Jiankang Guo, Jia Yang, Jing Lattas, Alexandros Zafeiriou, Stefanos Huawei Shenzhen Peoples R China Imperial Coll London England InsightFace London England Univ Nottingham Nottingham England

ISBN: (纸本)9781665445092

Deep face recognition has achieved remarkable improvements due to the introduction of margin-based soft-max loss, in which the prototype stored in the last linear layer represents the center of each class. In these methods, training samples are enforced to be close to positive prototypes and far apart from negative prototypes by a clear margin. However, we argue that prototype learning only employs sample-to-prototype comparisons without considering sample-to-sample comparisons during training and the low loss value gives us an illusion of perfect feature embedding, impeding the further exploration of SGD. To this end, we propose Variational Prototype Learning (VPL), which represents every class as a distribution instead of a point in the latent space. By identifying the slow feature drift phenomenon, we directly inject memorized features into prototypes to approximate variational prototype sampling. The proposed VPL can simulate sample-to-sample comparisons within the classification framework, encouraging the SGD solver to be more exploratory, while boosting performance. Moreover, VPL is conceptually simple, easy to implement, computationally efficient and memory saving. We present extensive experimental results on popular benchmarks, which demonstrate the superiority of the proposed VPL method over the state-of-the-art competitors.

关键词： Training Learning systems computer vision Face recognition Memory management Prototypes Benchmark testing

来源：评论

学校读者我要写书评

暂无评论

没有更多数据了...

全选清除本页清除全部题录导出标记到“检索档案”

共500页 << < 439 440 441 442 443 444 445 446 447 448 > >>

检索报告对象比较合并检索0

隐藏清空

合并搜索

回到顶部

执行限定条件

内容：

评分：

请选择保存的检索档案：

请选择收藏分类：

订阅名称：

通借通还

温馨提示：

图书名称：

借书校区：

取书校区：

手机号码：

邮箱地址：

一卡通帐号：

电话和邮箱必须正确填写，我们会与您联系确认。

联系人：

所在院系：

联系邮箱：

联系电话：

内蒙古自治区呼和浩特市赛罕区大学西街235号邮编: 010021

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：