Search Results - Inner Mongolia University Library

Document type

50,636 conference papers
1,423 books
1,044 journal articles
1 thesis

Collection scope

53,101 electronic documents
3 print holdings

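Subject classification

31,927 Engineering
- 24,897 Computer Science and Technology...
- 12,629 Software Engineering
- 5,176 Optical Engineering
- 4,760 Electrical Engineering
- 4,463 Information and Communication Engineering
- 4,261 Mechanical Engineering
- 3,980 Control Science and Engineering
- 2,477 Bioengineering
- 1,736 Biomedical Engineering (may be awarded...
- 1,583 Instrument Science and Technology
- 1,314 Electronic Science and Technology (may...
- 795 Chemical Engineering and Technology
- 715 Safety Science and Engineering
- 560 Transportation Engineering
- 383 Architecture
- 335 Civil Engineering
11,899 Science
- 6,481 Physics
- 5,426 Mathematics
- 2,765 Biology
- 1,915 Statistics (may be awarded in Science, ...
- 804 Chemistry
- 669 Systems Science
5,313 Medicine
- 5,103 Clinical Medicine
- 731 Basic Medicine (may be awarded in Medicine...
- 459 Pharmacy (may be awarded in Medicine, Science...
3,369 Management
- 1,964 Library, Information and Archives Manag...
- 1,554 Management Science and Engineering (may...
- 485 Business Administration
720 Art
- 718 Design (may be awarded in Art...
434 Law
- 406 Sociology
302 Agriculture
198 Education
166 Economics
63 Literature
48 Military Science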

Topic

17,404 computer vision
9,026 pattern recognit...
4,196 training
3,830 feature extracti...
3,134 cameras
2,876 computational mo...
2,794 image segmentati...
2,622 visualization
2,574 shape
2,535 face recognition
2,176 robustness
2,124 computer science
1,975 object detection
1,960 computer archite...
1,882 layout
1,853 object recogniti...
1,801 three-dimensiona...
1,725 neural networks
1,705 humans
1,697 image recognitio...

Institution

165 univ chinese aca...
144 tsinghua univers...
135 national laborat...
106 univ sci & techn...
104 zhejiang univers...
101 shanghai jiao to...
95 university of sc...
95 microsoft resear...
85 zhejiang univ pe...
84 shanghai ai lab ...
74 school of comput...
69 computer vision ...
68 peking univ peop...
68 chinese acad sci...
66 chinese univ hon...
63 institute of inf...
62 google res mount...
61 univ oxford oxfo...
59 univ toronto on
57 swiss fed inst t...

Author

92 van gool luc
87 umapada pal
78 zhang lei
64 lee seong-whan
50 vittorio murino
42 yang yi
34 nassir navab
34 ling haibin
33 li xin
33 jie yang
32 liu yang
31 loy chen change
30 escalera sergio
30 h. bischof
29 zhou jie
29 vasconcelos nuno
29 jan-michael frah...
28 blumenstein mich...
27 jia yunde
27 luo ping

Language

50,122 English
2,746 Other
252 Chinese
22 Turkish
4 Spanish
2 Japanese
2 Portuguese
2 Russian

Search query: "Any field = IEEE Conference on Computer Vision and Pattern Recognition"

53,104 records in total; showing records 121-130


Federated Learning with a Single Shared Image

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Soni, Sunny; Saeed, Aaqib; Asano, Yuki M.

Affiliations: Univ Amsterdam Amsterdam Netherlands; TU Eindhoven Eindhoven Netherlands

ISBN: (print) 9798350365474

Federated Learning (FL) enables multiple machines to collaboratively train a machine learning model without sharing of private training data. Yet, especially for heterogeneous models, a key bottleneck remains the transfer of knowledge gained from each client model with the server. One popular method, FedDF, uses distillation to tackle this task with the use of a common, shared dataset on which predictions are exchanged. However, in many contexts such a dataset might be difficult to acquire due to privacy and the clients might not allow for storage of a large shared dataset. To this end, in this paper, we introduce a new method that improves this knowledge distillation method to only rely on a single shared image between clients and server. In particular, we propose a novel adaptive dataset pruning algorithm that selects the most informative crops generated from only a single image. With this, we show that federated learning with distillation under a limited shared dataset budget works better by using a single image compared to multiple individual ones. Finally, we extend our approach to allow for training heterogeneous client architectures by incorporating a non-uniform distillation schedule and client-model mirroring on the server side.

Keywords: computer vision; Federated Learning; Limited Data; Representation Learning


Unveiling the Anomalies in an Ever-Changing World: A Benchmark for Pixel-Level Anomaly Detection in Continual Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Bugarin, Nikola; Bugaric, Jovana; Barusco, Manuel; Pezze, Davide Dalle; Susto, Gian Antonio

Affiliations: Univ Padua Padua Italy

ISBN: (print) 9798350365474

Anomaly Detection is a relevant problem in numerous real-world applications, especially when dealing with images. However, little attention has been paid to the issue of changes over time in the input data distribution, which may cause a significant decrease in performance. In this study, we investigate the problem of Pixel-Level Anomaly Detection in the Continual Learning setting, where new data arrives over time and the goal is to perform well on new and old data. We implement several state-of-the-art techniques to solve the Anomaly Detection problem in the classic setting and adapt them to work in the Continual Learning setting. To validate the approaches, we use a real-world dataset of images with pixel-based anomalies to provide a reliable benchmark and serve as a foundation for further advancements in the field. We provide a comprehensive analysis, discussing which Anomaly Detection methods and which families of approaches seem more suitable for the Continual Learning setting.

Keywords: Anomaly Detection; computer vision; Continual Learning


Semantics-aware Motion Retargeting with Vision-Language Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Haodong; Chen, Zhike; Xu, Haocheng; Hao, Lei; Wu, Xiaofei; Xu, Songcen; Zhang, Zhensong; Wang, Yue; Xiong, Rong

Affiliations: Zhejiang Univ Hangzhou Peoples R China; Huawei Noahs Ark Lab Montreal PQ Canada

ISBN: (print) 9798350353013; 9798350353006

Capturing and preserving motion semantics is essential to motion retargeting between animation characters. However, most of the previous works neglect the semantic information or rely on human-designed joint-level representations. Here, we present a novel Semantics-aware Motion reTargeting (SMT) method with the advantage of vision-language models to extract and maintain meaningful motion semantics. We utilize a differentiable module to render 3D motions. Then the high-level motion semantics are incorporated into the motion retargeting process by feeding the vision-language model with the rendered images and aligning the extracted semantic embeddings. To ensure the preservation of fine-grained motion details and high-level semantics, we adopt a two-stage pipeline consisting of skeleton-aware pre-training and fine-tuning with semantics and geometry constraints. Experimental results show the effectiveness of the proposed method in producing high-quality motion retargeting results while accurately preserving motion semantics. Project page can be found at https://***/view/smtnet.

Keywords: Animation; Motion Retargeting; Vision-Language Model


MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Dai, Yuekun; Zhang, Dafeng; Li, Xiaoming; Yue, Zongsheng; Li, Chongyi; Zhou, Shangchen; Feng, Ruicheng; Yang, Peiqing; Jin, Zhezhu; Liu, Guanqun; Loy, Chen Change

Affiliations: Nanyang Technol Univ S Lab Singapore Singapore; Samsung Res China Nanjing Peoples R China; Nankai Univ Tianjin Peoples R China

ISBN: (print) 9798350365474

The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Nighttime Flare Removal track on MIPI 2024. In total, 170 participants were successfully registered, and 14 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Nighttime Flare Removal. More details of this challenge and the link to the dataset can be found at https://***/MIPI2024.

Keywords: computer vision; flare removal; image restoration


Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ke, Bingxin; Obukhov, Anton; Huang, Shengyu; Metzger, Nando; Daudt, Rodrigo Caye; Schindler, Konrad

Affiliations: Swiss Fed Inst Technol Photogrammetry & Remote Sensing Zurich Switzerland

ISBN: (print) 9798350353006

Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth from a single image is geometrically ill-posed and requires scene understanding, so it is not surprising that the rise of deep learning has led to a breakthrough. The impressive progress of monocular depth estimators has mirrored the growth in model capacity, from relatively modest CNNs to large Transformer architectures. Still, monocular depth estimators tend to struggle when presented with images with unfamiliar content and layout, since their knowledge of the visual world is restricted by the data seen during training, and challenged by zero-shot generalization to new domains. This motivates us to explore whether the extensive priors captured in recent generative diffusion models can enable better, more generalizable depth estimation. We introduce Marigold, a method for affine-invariant monocular depth estimation that is derived from Stable Diffusion and retains its rich prior knowledge. The estimator can be fine-tuned in a couple of days on a single GPU using only synthetic training data. It delivers state-of-the-art performance across a wide range of datasets, including over 20% performance gains in specific cases. Project page: https://***.

Keywords: ddim; ddpm; depth estimation; diffusion; generative; LDM; vision


GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Chou, Zi-Ting; Huang, Sheng-Yu; Liu, I-Jieh; Wang, Yu-Chiang Frank

Affiliations: Natl Taiwan Univ Grad Inst Commun Engn Taipei Taiwan; NVIDIA Taipei Taiwan

ISBN: (print) 9798350353006

Utilizing multi-view inputs to synthesize novel-view images, Neural Radiance Fields (NeRF) have emerged as a popular research topic in 3D vision. In this work, we introduce Generalizable Semantic Neural Radiance Fields (GSNeRF), which uniquely takes image semantics into the synthesis process so that both novel-view images and the associated semantic maps can be produced for unseen scenes. Our GSNeRF is composed of two stages: Semantic GeoReasoning and Depth-Guided Visual Rendering. The former is able to observe multi-view image inputs to extract semantic and geometry features from a scene. Guided by the resulting image geometry information, the latter performs both image and semantic rendering with improved performance. Our experiments not only confirm that GSNeRF performs favorably against prior works on both novel-view image and semantic segmentation synthesis, but also verify the effectiveness of our sampling strategy for visual rendering.

Keywords: 3D computer vision; generalizable nerf; NeRF; segmentation


JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jahangard, Simindokht; Cai, Zhixi; Wen, Shiki; Rezatofighi, Hamid

Affiliations: Monash Univ Clayton Vic Australia

ISBN: (print) 9798350353006

Understanding human social behaviour is crucial in computer vision and robotics. Micro-level observations like individual actions fall short, necessitating a comprehensive approach that considers individual behaviour, intra-group dynamics, and social group levels for a thorough understanding. To address dataset limitations, this paper introduces JRDB-Social, an extension of JRDB [2]. Designed to fill gaps in human understanding across diverse indoor and outdoor social contexts, JRDB-Social provides annotations at three levels: individual attributes, intra-group interactions, and social group context. This dataset aims to enhance our grasp of human social dynamics for robotic applications. Utilizing the recent cutting-edge multi-modal large language models, we evaluated our benchmark to explore their capacity to decipher social human behaviour.

Keywords: dataset; human attributes; human human interaction; human social behaviour understanding; interaction; large language model; multifaceted robotic dataset; social group; social robot; vision language model; visual question answering; visual reasoning


Rugby Scene Classification Enhanced by Vision Language Model

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Nonaka, Naoki; Fujihira, Ryo; Koshiba, Toshiki; Maeda, Akira; Seita, Jun

Affiliations: RIKEN Informat R&D & Strategy Headquarters Adv Data Sci Project Wako Saitama Japan; Hakata Knee & Sports Clin Fukuoka Japan

ISBN: (print) 9798350365474

This study investigates the integration of vision language models (VLM) to enhance the classification of situations within rugby match broadcasts. The importance of accurately identifying situations in sports videos is emphasized for understanding game dynamics and facilitating downstream tasks like performance evaluation and injury prevention. Utilizing a dataset comprising 18,000 labeled images extracted at 0.2-second intervals from 100 minutes of rugby match broadcasts, scene classification tasks including contact plays (scrums, mauls, rucks, tackles, lineouts), rucks, tackles, lineouts, and multiclass classification were performed. The study aims to validate the utility of VLM outputs in improving classification performance compared to using solely image data. Experimental results demonstrate substantial performance improvements across all tasks with the incorporation of VLM outputs. Our analysis of prompts suggests that, when provided with appropriate contextual information through natural language, VLMs can effectively capture the context of a given image. The findings of our study indicate that leveraging VLMs in the domain of sports analysis holds promise for developing image processing models capable of incorporating the tacit knowledge encoded within language models, as well as information conveyed through natural language descriptions.

Keywords: Rugby; Scene classification; vision language model


Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Kai; Tian, Yapeng; Hatzinakos, Dimitrios

Affiliations: Univ Toronto Toronto ON Canada; Univ Texas Dallas Richardson TX 75083 USA

ISBN: (print) 9798350365474

In this paper, we explore the cross-modal adaptation of pre-trained Vision Transformers (ViTs) for the audio-visual domain by incorporating a limited set of trainable parameters. To this end, we propose a Spatial-Temporal-Global Cross-Modal Adaptation (STG-CMA) to gradually equip the frozen ViTs with the capability for learning audio-visual representation, consisting of the modality-specific temporal adaptation for temporal reasoning of each modality, the cross-modal spatial adaptation for refining the spatial information with the cue from counterpart modality, and the cross-modal global adaptation for global interaction between audio and visual modalities. Our STG-CMA presents a meaningful finding that only leveraging the shared pre-trained image model with inserted lightweight adapters is enough for spatial-temporal modeling and feature interaction of audio-visual modality. Extensive experiments indicate that our STG-CMA achieves state-of-the-art performance on various audio-visual understanding tasks including AVE, AVS, and AVQA while containing significantly reduced tunable parameters. The code is available at https://***/kaiw7/STG-CMA.

Keywords: Audio-visual Learning; Cross-modal Adaptation; Pre-trained Vision Transformers; Reduced Tunable Parameters; Spatial-temporal-global Modeling


ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Tian, Xinyu; Zou, Shu; Yang, Zhaoyuan; Zhang, Jing

Affiliations: Australian Natl Univ Canberra ACT Australia; GE Res Niskayuna NY USA

ISBN: (print) 9798350353006

Although soft prompt tuning is effective in efficiently adapting Vision-Language (V&L) models for downstream tasks, it shows limitations in dealing with distribution shifts. We address this issue with Attribute-Guided Prompt Tuning (ArGue), making three key contributions. 1) In contrast to the conventional approach of directly appending soft prompts preceding class names, we align the model with primitive visual attributes generated by Large Language Models (LLMs). We posit that a model's ability to express high confidence in these attributes signifies its capacity to discern the correct class rationales. 2) We introduce attribute sampling to eliminate disadvantageous attributes, thus only semantically meaningful attributes are preserved. 3) We propose negative prompting, explicitly enumerating class-agnostic attributes to activate spurious correlations and encourage the model to generate highly orthogonal probability distributions in relation to these negative features. In experiments, our method significantly outperforms current state-of-the-art prompt tuning methods on both novel class prediction and out-of-distribution generalization tasks. The code is available at https://***/Liam-Tian/ArGue.

Keywords: few-shot adaptation; prompt tuning; vision-language model



Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
