ISBN: 9798350353006 (print)
Zero-shot learning (ZSL) recognizes unseen classes by conducting visual-semantic interactions to transfer semantic knowledge from seen classes to unseen ones, supported by semantic information (e.g., attributes). However, existing ZSL methods simply extract visual features using a pre-trained network backbone (i.e., a CNN or ViT); lacking the guidance of semantic information, they fail to learn matched visual-semantic correspondences for representing semantic-related visual features, resulting in undesirable visual-semantic interactions. To tackle this issue, we propose a progressive semantic-guided vision transformer for zero-shot learning (dubbed ZSLViT). ZSLViT mainly considers two properties throughout the whole network: i) discovering semantic-related visual representations explicitly, and ii) discarding semantic-unrelated visual information. Specifically, we first introduce semantic-embedded token learning to improve the visual-semantic correspondences via semantic enhancement and to discover the semantic-related visual tokens explicitly with semantic-guided token attention. Then, we fuse visual tokens with low semantic-visual correspondence to discard semantic-unrelated visual information for visual enhancement. These two operations are integrated into various encoders to progressively learn semantic-related visual representations for accurate visual-semantic interactions in ZSL. Extensive experiments show that our ZSLViT achieves significant performance gains on three popular benchmark datasets, i.e., CUB, SUN, and AWA2.
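A minimal PyTorch sketch of how semantic-guided token attention and low-correspondence token fusion could be realized; this is not the authors' implementation, and the cosine-similarity scoring, the `keep_ratio` parameter, and the use of attribute prototypes as the semantic input are assumptions made only for illustration.

```python
import torch
import torch.nn.functional as F

def semantic_guided_token_selection(tokens, attr_embed, keep_ratio=0.7):
    """Score visual tokens by similarity to semantic (attribute) embeddings,
    keep the highest-scoring tokens, and fuse the rest into a single token.

    tokens:     (B, N, D) visual tokens from a ViT block
    attr_embed: (A, D)    attribute prototypes projected into token space (assumed)
    """
    # Cosine similarity between every token and every attribute prototype.
    sim = F.normalize(tokens, dim=-1) @ F.normalize(attr_embed, dim=-1).t()   # (B, N, A)
    score = sim.max(dim=-1).values                                            # (B, N)

    n_keep = max(1, int(tokens.size(1) * keep_ratio))                         # keep_ratio < 1 assumed
    idx = score.topk(n_keep, dim=1).indices                                   # semantic-related tokens

    keep = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))

    # Fuse the remaining (semantic-unrelated) tokens into one token by score-weighted averaging.
    kept_mask = torch.zeros_like(score).scatter(1, idx, 1.0)                  # 1 where a token is kept
    w = torch.where(kept_mask.bool(), torch.full_like(score, float("-inf")), score)
    w = torch.softmax(w, dim=1).unsqueeze(-1)                                 # (B, N, 1)
    fused = (w * tokens).sum(dim=1, keepdim=True)                             # (B, 1, D)

    return torch.cat([keep, fused], dim=1)                                    # (B, n_keep + 1, D)
```

Interleaving a block like this between standard ViT encoder layers is one way to progressively refine the token set, in the spirit of the progressive design described above.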
ISBN: 9798350353013 (print); 9798350353006 (print)
Recently, we have witnessed the explosive growth of various volumetric representations for modeling animatable head avatars. However, due to the diversity of frameworks, there is no practical method to support high-level applications such as 3D head avatar editing across different representations. In this paper, we propose a generic avatar editing approach that can be universally applied to various 3DMM-driven volumetric head avatars. To achieve this goal, we design a novel expression-aware modification generative model, which lifts 2D editing from a single image to a consistent 3D modification field. To ensure the effectiveness of the generative modification process, we develop several techniques, including an expression-dependent modification distillation scheme to draw knowledge from the large-scale head avatar model and 2D facial texture editing tools, implicit latent space guidance to enhance model convergence, and a segmentation-based loss reweight strategy for fine-grained texture inversion. Extensive experiments demonstrate that our method delivers high-quality and consistent results across multiple expressions and viewpoints. Project page: https://***/geneavatar/.
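As a hedged illustration of the last technique listed above, the sketch below shows one way a segmentation-based loss reweighting could be written for texture inversion; the L1 reconstruction term, the facial-part label interface, and the `boost` factor are hypothetical and are not taken from the paper.

```python
import torch

def reweighted_recon_loss(pred, target, seg_mask, edited_labels, boost=5.0):
    """Per-pixel L1 reconstruction loss with larger weight on the facial regions
    that were actually edited (e.g., lips or eyes), as given by a segmentation map.

    pred, target:  (B, 3, H, W) rendered and target images
    seg_mask:      (B, H, W)    integer facial-part labels (assumed interface)
    edited_labels: iterable of label ids covering the edited regions
    """
    weight = torch.ones_like(seg_mask, dtype=pred.dtype)
    for lbl in edited_labels:
        weight = torch.where(seg_mask == lbl, torch.full_like(weight, boost), weight)
    per_pixel = (pred - target).abs().mean(dim=1)          # (B, H, W)
    return (weight * per_pixel).sum() / weight.sum()
```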
ISBN: 9798350353006 (print)
For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners. These requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed from a pre-trained model while maintaining the rest. We define this problem as continual forgetting and identify two key challenges. (i) For unwanted knowledge, efficient and effective deletion is crucial. (ii) For remaining knowledge, the impact brought by the forgetting procedure should be minimal. To address them, we propose Group Sparse LoRA (GS-LoRA). Specifically, towards (i), we use LoRA modules to fine-tune the FFN layers in Transformer blocks for each forgetting task independently, and towards (ii), a simple group sparse regularization is adopted, enabling automatic selection of specific LoRA groups and zeroing out the others. GS-LoRA is effective, parameter-efficient, data-efficient, and easy to implement. We conduct extensive experiments on face recognition, object detection, and image classification and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes. Code will be released at https://***/bjzhb666/GS-LoRA.
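A small PyTorch sketch of the two ingredients named above: a LoRA adapter wrapped around a frozen linear (FFN) layer and a group-sparse, L2,1-style penalty computed over whole LoRA groups. The rank, initialization, and penalty weight are illustrative assumptions rather than the paper's settings.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = W x + B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # original weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.t() @ self.B.t()

def group_sparse_penalty(lora_layers, alpha=1e-3):
    """Sum of per-group Frobenius norms; added to the task loss, it drives whole
    LoRA groups toward zero, which amounts to automatically deselecting them."""
    return alpha * sum(
        torch.sqrt((m.A ** 2).sum() + (m.B ** 2).sum() + 1e-12) for m in lora_layers
    )
```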
ISBN: 9798350353006 (print)
We propose a lightweight and scalable Regional Point-Language Contrastive learning framework, namely RegionPLC, for open-world 3D scene understanding, aiming to identify and recognize open-set objects and categories. Specifically, based on our empirical studies, we introduce a 3D-aware SFusion strategy that fuses 3D vision-language pairs derived from multiple 2D foundation models, yielding high-quality, dense region-level language descriptions without human 3D annotations. Subsequently, we devise a region-aware point-discriminative contrastive learning objective to enable robust and effective 3D learning from dense regional language supervision. We carry out extensive experiments on the ScanNet, ScanNet200, and nuScenes datasets, and our model outperforms prior 3D open-world scene understanding approaches by an average of 17.2% and 9.1% for semantic and instance segmentation, respectively, while maintaining greater scalability and lower resource demands. Furthermore, our method can be effortlessly integrated with language models to enable open-ended grounded 3D reasoning without extra task-specific training. Code will be released on GitHub.
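One plausible form of a region-aware, point-discriminative contrastive objective is sketched below in PyTorch: each point is matched against the region-level caption embeddings, with the caption of its own region as the positive. The per-point cross-entropy formulation and the temperature are assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def region_point_contrastive_loss(point_feat, region_text_feat, point_region_id, tau=0.07):
    """point_feat:       (P, D) per-point features from the 3D backbone
    region_text_feat: (R, D) text embeddings of region-level captions
    point_region_id:  (P,)   index of the region/caption supervising each point
    """
    p = F.normalize(point_feat, dim=-1)
    t = F.normalize(region_text_feat, dim=-1)
    logits = p @ t.t() / tau                     # (P, R) point-to-caption similarities
    return F.cross_entropy(logits, point_region_id)
```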
ISBN: 9798350353013 (print); 9798350353006 (print)
From image-text pairs, large-scale vision-language models (VLMs) learn to implicitly associate image regions with words, an association that proves effective for tasks like visual question answering. However, leveraging the learned association for open-vocabulary semantic segmentation remains a challenge. In this paper, we propose a simple yet extremely effective training-free technique, Plug-and-Play Open-Vocabulary Semantic Segmentation (PnP-OVSS), for this task. PnP-OVSS leverages a VLM with direct text-to-image cross-attention and an image-text matching loss. To balance between over-segmentation and under-segmentation, we introduce Salience Dropout; by iteratively dropping patches that the model is most attentive to, we are able to better resolve the entire extent of the segmentation mask. PnP-OVSS does not require any neural network training and performs hyperparameter tuning without the need for any segmentation annotations, even for a validation set. PnP-OVSS demonstrates substantial improvements over comparable baselines (+29.4% mIoU on Pascal VOC, +13.2% mIoU on Pascal Context, +14.0% mIoU on MS COCO, +2.4% mIoU on COCO Stuff) and even outperforms most baselines that conduct additional network training on top of pretrained VLMs. Our codebase is at https://***/letitiabanana/PnP-OVSS.
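The sketch below illustrates the bookkeeping behind Salience Dropout under simplifying assumptions: in the actual method the VLM is re-queried after each round of dropping, whereas here re-querying is stubbed out by zeroing the dropped attention columns, and the drop fraction and number of rounds are made-up parameters.

```python
import torch

def salience_dropout(cross_attn, drop_frac=0.1, n_rounds=3):
    """Iteratively suppress the patches the text attends to most, accumulating the
    attention gathered over all rounds so the final map covers the whole object
    rather than only its most salient part.

    cross_attn: (T, N) text-to-patch cross-attention for one image-text pair,
                assumed already averaged over heads and layers
    """
    attn = cross_attn.clone()
    accumulated = torch.zeros_like(attn[0])                              # (N,)
    active = torch.ones(attn.size(1), dtype=torch.bool, device=attn.device)

    for _ in range(n_rounds):
        salience = attn.mean(dim=0)                                      # per-patch salience
        salience = torch.where(active, salience, torch.zeros_like(salience))
        accumulated += salience
        n_drop = max(1, int(active.sum().item() * drop_frac))
        drop_idx = salience.topk(n_drop).indices                         # most-attended patches
        active[drop_idx] = False
        attn[:, drop_idx] = 0.0     # stand-in for re-running the VLM without these patches

    return accumulated              # threshold this map to obtain the segmentation mask
```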
ISBN: 9798350353006 (print)
Vision-language models (VLMs) excel in zero-shot recognition, but their performance varies greatly across different visual concepts. For example, although CLIP achieves impressive accuracy on ImageNet (60-80%), its performance drops below 10% for more than ten concepts, such as night snake, presumably due to their limited presence in the pretraining data. However, measuring the frequency of concepts in VLMs' large-scale datasets is challenging. We address this by using large language models (LLMs) to count the number of pretraining texts that contain synonyms of these concepts. Our analysis confirms that popular datasets, such as LAION, exhibit a long-tailed concept distribution, yielding biased performance in VLMs. We also find that downstream applications of VLMs, including visual chatbots (e.g., GPT-4V) and text-to-image models (e.g., Stable Diffusion), often fail to recognize or generate images of rare concepts identified by our method. To mitigate the imbalanced performance of zero-shot VLMs, we propose REtrieval-Augmented Learning (REAL). First, instead of prompting VLMs with the original class names, REAL uses their most frequent synonyms found in pretraining texts. This simple change already outperforms costly human-engineered and LLM-enriched prompts across nine benchmark datasets. Second, REAL trains a linear classifier on a small yet balanced set of pretraining data retrieved using concept synonyms. REAL surpasses the previous zero-shot SOTA while using 400x less storage and 10,000x less training time.
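A toy Python illustration of the first REAL step, replacing each class name with its most frequent synonym in the pretraining captions. The paper relies on LLM-generated synonyms and large-scale counting over pretraining texts; the naive substring counter and the example synonym list below are illustrative stand-ins.

```python
from collections import Counter

def most_frequent_synonym(concept_synonyms, pretraining_texts):
    """concept_synonyms:  dict mapping class name -> list of synonyms (incl. itself)
    pretraining_texts: iterable of caption strings
    Returns, per class, the synonym that occurs most often in the captions."""
    counts = Counter()
    for text in pretraining_texts:
        lowered = text.lower()
        for syns in concept_synonyms.values():
            for s in syns:
                if s in lowered:
                    counts[s] += 1
    return {cls: max(syns, key=lambda s: counts[s]) for cls, syns in concept_synonyms.items()}

# Hypothetical example: a rare class name gets swapped for a more frequent synonym.
prompts = most_frequent_synonym(
    {"night snake": ["night snake", "hypsiglena"]},
    ["a photo of a hypsiglena on a rock", "desert hypsiglena at night"],
)
print(prompts)  # {'night snake': 'hypsiglena'}
```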
ISBN: 9798350353006 (print)
As a new embodied vision task, Instance ImageGoal Navigation (IIN) aims to navigate to a specified object depicted by a goal image in an unexplored environment. The main challenge of this task lies in identifying the target object from different viewpoints while rejecting similar distractors. Existing ImageGoal Navigation methods usually adopt a simple Exploration-Exploitation framework and ignore the identification of the specific instance during navigation. In this work, we propose to imitate the human behaviour of "getting closer to confirm" when distinguishing objects from a distance. Specifically, we design a new modular navigation framework named Instance-aware Exploration-Verification-Exploitation (IEVE) for instance-level image goal navigation. Our method allows for active switching among the exploration, verification, and exploitation actions, thereby facilitating the agent in making reasonable decisions under different situations. On the challenging Habitat-Matterport 3D Semantic (HM3D-SEM) dataset, our method surpasses previous state-of-the-art work, with a classical segmentation model (0.684 vs. 0.561 success) or a robust model (0.702 vs. 0.561 success). Our code will be made publicly available at https://***/XiaohanLei/IEVE.
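A toy sketch of the kind of mode switching described above; the thresholds, the goal-similarity interface, and the three-state rule are illustrative assumptions rather than IEVE's actual policy.

```python
from enum import Enum, auto

class Mode(Enum):
    EXPLORE = auto()   # search the environment for candidate objects
    VERIFY = auto()    # move closer to a candidate to confirm its identity
    EXPLOIT = auto()   # navigate directly to a confirmed target

def next_mode(candidate_score, confirm_thresh=0.9, candidate_thresh=0.5):
    """Switching rule in the spirit of "getting closer to confirm":
    candidate_score is an assumed similarity between the current observation
    and the goal image."""
    if candidate_score >= confirm_thresh:
        return Mode.EXPLOIT          # confident match: head straight for it
    if candidate_score >= candidate_thresh:
        return Mode.VERIFY           # plausible match: approach and re-check
    return Mode.EXPLORE              # nothing convincing: keep exploring
```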
ISBN: 9781665487399 (digital); 9781665487399 (print)
Temporal action localization for untrimmed videos is a difficult problem in computer vision. It is challenging to accurately infer the start and end of activity instances on small-scale datasets covering multi-view information. In this paper, we propose an effective activity temporal localization and classification method to localize the temporal boundaries and predict the class labels of activities for naturalistic driving. Our approach includes (i) a distraction behavior recognition and localization method for naturalistic driving videos on small-scale datasets, (ii) a strategy that uses a multi-branch network to make full use of information from different channels, and (iii) a post-processing method for selecting and correcting temporal ranges to ensure that our system finds accurate boundaries. In addition, frame-level object detection information is also utilized. Extensive experiments demonstrate the effectiveness of our method, and we rank 6th on Test-A2 of the 6th AI City Challenge Track 3.
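A small Python sketch of what the post-processing step could look like: per-frame predictions are merged into corrected temporal ranges. The gap and duration thresholds and the background-label convention are assumptions rather than the paper's exact procedure.

```python
def merge_segments(frame_labels, fps=30.0, min_dur=1.0, max_gap=0.5):
    """Turn per-frame class predictions into (class, start_s, end_s) segments:
    collapse runs of identical labels, merge same-class segments separated by a
    short gap, and drop segments shorter than min_dur seconds.

    frame_labels: list of per-frame class ids (0 = normal driving / background)
    """
    raw, start = [], 0
    for i in range(1, len(frame_labels) + 1):
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            if frame_labels[start] != 0:
                raw.append([frame_labels[start], start / fps, i / fps])
            start = i

    merged = []
    for seg in raw:
        if merged and merged[-1][0] == seg[0] and seg[1] - merged[-1][2] <= max_gap:
            merged[-1][2] = seg[2]                     # bridge a short gap
        else:
            merged.append(seg)
    return [tuple(s) for s in merged if s[2] - s[1] >= min_dur]
```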
ISBN: 9781665487399 (digital); 9781665487399 (print)
In this paper, we introduce our hybrid image and video compression scheme enhanced by a CNN-optimized in-loop filter. Specifically, a Structure Preserving in-Loop Filter (SPiLF) is incorporated into the hybrid video codec Enhanced Compression Model (ECM), where two branches, i.e., a gradient branch and a pixel branch, are developed based on the dense residual unit (DRU). To provide pleasant visual quality, a generative adversarial network (GAN) loss and an LPIPS loss are further considered. Therefore, the proposal mainly focuses on perceptually friendly image compression for human vision, while video compression could be further investigated. Experiments show that the proposed method achieves improved visual quality compared to traditional methods.
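A simplified PyTorch sketch of the two-branch layout (a pixel branch plus a gradient branch, fused and added back as a residual correction). The real SPiLF is built from dense residual units inside ECM; the plain convolutions, Sobel-based gradients, and channel widths below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def sobel_gradients(x):
    """Horizontal and vertical image gradients via fixed depthwise Sobel kernels."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    c = x.size(1)
    kx, ky = kx.repeat(c, 1, 1, 1).to(x), ky.repeat(c, 1, 1, 1).to(x)
    gx = F.conv2d(x, kx, padding=1, groups=c)
    gy = F.conv2d(x, ky, padding=1, groups=c)
    return torch.cat([gx, gy], dim=1)

class TwoBranchLoopFilter(nn.Module):
    """Pixel branch restores the reconstructed frame; gradient branch operates on
    Sobel gradients to preserve structure; fused features predict a residual."""
    def __init__(self, channels=3, width=32):
        super().__init__()
        self.pixel = nn.Sequential(nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
                                   nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
        self.grad = nn.Sequential(nn.Conv2d(2 * channels, width, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(width, width, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(2 * width, channels, 3, padding=1)

    def forward(self, x):
        feats = torch.cat([self.pixel(x), self.grad(sobel_gradients(x))], dim=1)
        return x + self.fuse(feats)
```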
ISBN: 9781665487399 (digital); 9781665487399 (print)
Driver distraction recognition is an essential computer vision task that can play a key role in increasing traffic safety and reducing traffic accidents. In this paper, we propose a temporal driver action localization (TDAL) framework for classifying driver distraction actions as well as identifying the start and end time of a given driver action. The TDAL framework consists of three stages: preprocessing, which takes untrimmed video as input and generates multiple clips; action classification, which classifies the clips; and temporal action localization, which takes the classifier output and generates the start and end times of the distracted actions. The proposed framework achieves an F1 score of 27.06% on the Track 3 A2 dataset of the NVIDIA AI City 2022 Challenge. The findings show that the TDAL framework contributes to fine-grained driver distraction recognition and paves the way for the development of smart and safe transportation. Code will be available soon.
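A minimal Python sketch of stages 1 and 3 of such a pipeline (clip generation and clip-to-segment localization); the clip length, stride, frame rate, and label conventions are assumptions, and stage 2 appears only as a placeholder comment.

```python
def sliding_clips(n_frames, clip_len=64, stride=16):
    """Stage 1 (preprocessing): split an untrimmed video of n_frames frames into
    overlapping fixed-length clips, returned as (start_frame, end_frame) pairs."""
    return [(s, min(s + clip_len, n_frames)) for s in range(0, n_frames, stride)]

def localize(clip_bounds, clip_labels, fps=30.0):
    """Stage 3 (temporal action localization): chain overlapping clips that share
    the same non-background label into (label, start_s, end_s) actions."""
    actions = []
    for (start, end), label in zip(clip_bounds, clip_labels):
        if label == 0:                                   # 0 = normal driving (assumed)
            continue
        if actions and actions[-1][0] == label and start <= actions[-1][2] * fps:
            actions[-1][2] = end / fps                   # overlapping same-label clip: extend
        else:
            actions.append([label, start / fps, end / fps])
    return [tuple(a) for a in actions]

# Stage 2 (action classification) would run a clip classifier in between, e.g.:
# clip_labels = [classifier(video[s:e]) for s, e in sliding_clips(len(video))]
```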