Search Results - Inner Mongolia University Library

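Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author 范并思, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication 计算机应用与软件, any field containing C++ or Basic, subject term not containing Visual, years 2011 to 2016)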
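The field codes and operator rules in the help text can be sketched as a small query builder. This is only an illustration: the helper names `term`, `any_of`, and `all_of` are hypothetical and are not part of the library's system. The sketch assumes nothing beyond the rules stated above (uppercase operators with one space on each side, `FIELD=value` terms, parentheses around OR groups).

```python
# Hypothetical helpers for composing an advanced-search expression.
# Field codes are taken from the help text; everything else is illustrative.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Render one field-coded term, e.g. K=图书馆学."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND (uppercase, one space on each side)."""
    return " AND ".join(terms)


# Rebuild Example 1 from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Building the string from helpers rather than by hand keeps the operator casing and spacing consistent, which the rules above say the search form requires.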


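Refine search results

Document type

23,001 conference papers
126 books
92 journal articles

Collection scope

23,218 electronic documents
1 print holding

Discipline classification

13,623 records: Engineering
- 11,108 records: Computer Science and Technology...
- 3,479 records: Software Engineering
- 2,445 records: Mechanical Engineering
- 1,716 records: Optical Engineering
- 1,075 records: Electrical Engineering
- 1,014 records: Control Science and Engineering
- 785 records: Information and Communication Engineering
- 412 records: Instrument Science and Technology
- 352 records: Bioengineering
- 251 records: Biomedical Engineering (may be awarded...
- 196 records: Electronic Science and Technology (may...
- 114 records: Chemical Engineering and Technology
- 108 records: Safety Science and Engineering
- 100 records: Surveying and Mapping Science and Technology
- 88 records: Architecture
- 87 records: Transportation Engineering
- 84 records: Civil Engineering
3,494 records: Medicine
- 3,481 records: Clinical Medicine
- 81 records: Basic Medicine (may be awarded medical...
3,242 records: Science
- 1,939 records: Physics
- 1,640 records: Mathematics
- 563 records: Statistics (may be awarded science, ...
- 500 records: Biology
- 249 records: Systems Science
- 107 records: Chemistry
522 records: Management
- 311 records: Library, Information and Archives Manag...
- 224 records: Management Science and Engineering (may...
- 76 records: Business Administration
276 records: Art
- 276 records: Design (may be awarded art...
66 records: Law
- 63 records: Sociology
38 records: Agriculture
28 records: Education
22 records: Economics
10 records: Military Science
3 records: Literature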

Topic

10,187 records: computer vision
3,967 records: pattern recognit...
3,005 records: training
2,007 records: computational mo...
1,818 records: visualization
1,815 records: cameras
1,516 records: feature extracti...
1,481 records: shape
1,455 records: three-dimensiona...
1,438 records: image segmentati...
1,287 records: robustness
1,205 records: computer archite...
1,155 records: semantics
1,147 records: conferences
1,107 records: layout
1,092 records: computer science
1,087 records: object detection
1,025 records: benchmark testin...
970 records: codes
922 records: face recognition

Institution

136 records: univ sci & techn...
121 records: univ chinese aca...
118 records: chinese univ hon...
107 records: carnegie mellon ...
101 records: tsinghua univers...
101 records: microsoft resear...
95 records: swiss fed inst t...
93 records: zhejiang univ pe...
82 records: university of sc...
81 records: zhejiang univers...
80 records: university of ch...
77 records: shanghai ai lab ...
72 records: shanghai jiao to...
69 records: national laborat...
67 records: microsoft res as...
67 records: alibaba grp peop...
64 records: adobe research
61 records: tsinghua univ pe...
60 records: peking univ peop...
59 records: univ oxford oxfo...

Author

81 records: van gool luc
72 records: timofte radu
64 records: zhang lei
47 records: luc van gool
40 records: yang yi
40 records: li stan z.
37 records: loy chen change
34 records: chen chen
33 records: xiaoou tang
32 records: liu yang
32 records: qi tian
31 records: tian qi
31 records: sun jian
30 records: murino vittorio
30 records: pascal fua
29 records: darrell trevor
29 records: li fei-fei
28 records: li xin
28 records: ying shan
27 records: vasconcelos nuno

Language

23,137 records: English
53 records: Other
22 records: Chinese
5 records: Turkish
2 records: Japanese

Search condition: "Any field = IEEE Conference on Computer Vision and Pattern Recognition Workshops"

23,219 records in total; records 211-220 are shown below.


Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Junyi; Herrmann, Charles; Hur, Junhwa; Chen, Eric; Jampani, Varun; Sun, Deqing; Yang, Ming-Hsuan

Affiliations: Shanghai Jiao Tong Univ Shanghai Peoples R China; Google Res Mountain View CA USA; UIUC Champaign IL USA; Stabil AI London England; UC Merced Merced CA USA

ISBN: (print) 9798350353013; 9798350353006

While pre-trained large-scale vision models have shown significant promise for semantic correspondence, their features often struggle to grasp the geometry and orientation of instances. This paper identifies the importance of being geometry-aware for semantic correspondence and reveals a limitation of the features of current foundation models under simple post-processing. We show that incorporating this information can markedly enhance semantic correspondence performance with simple but effective solutions in both zero-shot and supervised settings. We also construct a new challenging benchmark for semantic correspondence built from an existing animal pose estimation dataset, for both pre-training and validating models. Our method achieves a PCK@0.10 score of 65.4 (zero-shot) and 85.6 (supervised) on the challenging SPair-71k dataset, surpassing the state of the art by 5.5p and 11.0p absolute gains, respectively. Our code and datasets are publicly available at: https://***

Keywords: diffusion models semantic correspondence vision transformer

eTraM: Event-based Traffic Monitoring Dataset

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Verma, Aayush Atul; Chakravarthi, Bharatesh; Vaghela, Arpitsinh; Wei, Hua; Yang, Yezhou

Affiliations: Arizona State Univ Tempe AZ 85287 USA

ISBN: (print) 9798350353006

Event cameras, with their high temporal and dynamic range and minimal memory usage, have found applications in various fields. However, their potential in static traffic monitoring remains largely unexplored. To facilitate this exploration, we present eTraM - a first-of-its-kind, fully event-based traffic monitoring dataset. eTraM offers 10 hr of data from different traffic scenarios in various lighting and weather conditions, providing a comprehensive overview of real-world situations. Providing 2M bounding box annotations, it covers eight distinct classes of traffic participants, ranging from vehicles to pedestrians and micro-mobility. eTraM's utility has been assessed using state-of-the-art methods for traffic participant detection, including RVT, RED, and YOLOv8. We quantitatively evaluate the ability of event-based models to generalize on nighttime and unseen scenes. Our findings substantiate the compelling potential of leveraging event cameras for traffic monitoring, opening new avenues for research and application. eTraM is available at https://***/eTraM.

Keywords: DVS Dynamic vision Sensor Event-based Event-based vision Event-camera ITS Neuromorphic

Physical Property Understanding from Language-Embedded Feature Fields

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhai, Albert J.; Shen, Yuan; Chen, Emily Y.; Wang, Gloria X.; Wang, Xinlei; Wang, Sheng; Guan, Kaiyu; Wang, Shenlong

Affiliations: Univ Illinois Champaign IL 61820 USA

ISBN: (print) 9798350353006

Can computers perceive the physical properties of objects solely through vision? Research in cognitive science and vision science has shown that humans excel at identifying materials and estimating their physical properties based purely on visual appearance. In this paper, we present a novel approach for dense prediction of the physical properties of objects using a collection of images. Inspired by how humans reason about physics through vision, we leverage large language models to propose candidate materials for each object. We then construct a language-embedded point cloud and estimate the physical properties of each 3D point using a zero-shot kernel regression approach. Our method is accurate, annotation-free, and applicable to any object in the open world. Experiments demonstrate the effectiveness of the proposed approach in various physical property reasoning tasks, such as estimating the mass of common objects, as well as other properties like friction and hardness. Code is available at https://***/NeRF2Physics.

Keywords: 3D scene understanding digital twin physical properties vision and language

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Xia, Hongchi; Lin, Zhi-Hao; Ma, Wei-Chiu; Wang, Shenlong

Affiliations: Univ Illinois Champaign IL 61820 USA; Shanghai Jiao Tong Univ Shanghai Peoples R China; Cornell Univ Ithaca NY USA

ISBN: (print) 9798350353013; 9798350353006

Creating high-quality and interactive virtual environments, such as games and simulators, often involves complex and costly manual modeling processes. In this paper, we present Video2Game, a novel approach that automatically converts videos of real-world scenes into realistic and interactive game environments. At the heart of our system are three core components: (i) a neural radiance fields (NeRF) module that effectively captures the geometry and visual appearance of the scene; (ii) a mesh module that distills the knowledge from NeRF for faster rendering; and (iii) a physics module that models the interactions and physical dynamics among the objects. By following the carefully designed pipeline, one can construct an interactable and actionable digital replica of the real world. We benchmark our system on both indoor and large-scale outdoor scenes. We show that we can not only produce highly realistic renderings in real time, but also build interactive games on top.

Keywords: computer vision

Projecting Trackable Thermal Patterns for Dynamic Computer Vision

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Sheinin, Mark; Sankaranarayanan, Aswin C.; Narasimhan, Srinivasa G.

Affiliations: Carnegie Mellon Univ Pittsburgh PA 15213 USA

ISBN: (print) 9798350353006

Adding artificial patterns to objects, like QR codes, can ease tasks such as object tracking, robot navigation, and conveying information (e.g., a label or a website link). However, these patterns require a physical application and they alter the object's appearance. Conversely, projected patterns can temporarily change the object's appearance, aiding tasks like 3D scanning and retrieving object textures and shading. However, projected patterns impede dynamic tasks like object tracking because they do not 'stick' to the object's surface. Or do they? This paper introduces a novel approach combining the advantages of projected and persistent physical patterns. Our system projects heat patterns using a laser beam (similar in spirit to a LIDAR), which a thermal camera observes and tracks. Such thermal patterns enable tracking poorly-textured objects whose tracking is highly challenging with standard cameras while not affecting the object's appearance or physical properties. To avail these thermal patterns in existing vision frameworks, we train a network to reverse heat diffusion's effects and remove inconsistent pattern points between different thermal frames. We prototyped and tested this approach on dynamic vision tasks like structure from motion, optical flow, and object tracking of everyday textureless objects.

Keywords: 3d reconstruction heat Heat diffusion laser optical flow slam structure from motion thermal tracking

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Khanna, Mukul; Ramrakhya, Ram; Chhablani, Gunjan; Yenamandra, Sriram; Gervet, Theophile; Chang, Matthew; Kiraly, Zsolt; Chaplot, Devendra Singh; Batra, Dhruv; Mottaghi, Roozbeh

Affiliations: Georgia Inst Technol Atlanta GA 30332 USA; Carnegie Mellon Univ Pittsburgh PA 15213 USA; Univ Illinois Urbana IL USA; Mistral AI Paris France; Univ Washington Seattle WA USA

ISBN: (print) 9798350353006

The Embodied AI community has made significant strides in visual navigation tasks, exploring targets from 3D coordinates, objects, language descriptions, and images. However, these navigation models often handle only a single input modality as the target. With the progress achieved so far, it is time to move towards universal navigation models capable of handling various goal types, enabling more effective user interaction with robots. To facilitate this goal, we propose GOAT-Bench, a benchmark for the universal navigation task referred to as GO to AnyThing (GOAT). In this task, the agent is directed to navigate to a sequence of targets specified by the category name, language description, or image in an open-vocabulary fashion. We benchmark monolithic RL and modular methods on the GOAT task, analyzing their performance across modalities, the role of explicit and implicit scene memories, their robustness to noise in goal specifications, and the impact of memory in lifelong scenarios.

Keywords: computer vision Embodied AI Visual navigation

Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Pasca, Razvan-George; Gavryushin, Alexey; Hamza, Muhammad; Kuo, Yen-Ling; Mo, Kaichun; Van Gool, Luc; Hilliges, Otmar; Wang, Xi

Affiliations: Swiss Fed Inst Technol Zurich Switzerland; Univ Zurich Zurich Switzerland; Univ Virginia Charlottesville VA USA; NVIDIA Santa Clara CA USA; Katholieke Univ Leuven Leuven Belgium; INSAIT Sofia Bulgaria

ISBN: (print) 9798350353006

We study object interaction anticipation in egocentric videos. This task requires an understanding of the spatio-temporal context formed by past actions on objects, coined action context. We propose TransFusion, a multimodal transformer-based architecture for short-term object interaction anticipation. Our method exploits the representational power of language by summarizing the action context textually, after leveraging pre-trained vision-language foundation models to extract the action context from past video frames. The summarized action context and the last observed video frame are processed by the multimodal fusion module to forecast the next object interaction. Experiments on the Ego4D next active object interaction dataset show the effectiveness of our multimodal fusion model and highlight the benefits of using the power of foundation models and language-based context summaries in a task where vision may appear to suffice. Our novel approach outperforms all state-of-the-art methods on both versions of the Ego4D dataset. A project video and code are available at https://***/transfusion-proj/.

Keywords: computer vision Egocentric vision Human Behavior Prediction Multimodal Learning Object Interaction Anticipation Vision-Language Models

GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Munir, Mustafa; Avery, William; Rahman, Md Mostafijur; Marculescu, Radu

Affiliations: Univ Texas Austin Austin TX 78712 USA

ISBN: (print) 9798350353013; 9798350353006

Vision graph neural networks (ViG) offer a new avenue for exploration in computer vision. A major bottleneck in ViGs is the inefficient k-nearest neighbor (KNN) operation used for graph construction. To solve this issue, we propose a new method for designing ViGs, Dynamic Axial Graph Construction (DAGC), which is more efficient than KNN as it limits the number of considered graph connections made within an image. Additionally, we propose a novel CNN-GNN architecture, GreedyViG, which uses DAGC. Extensive experiments show that GreedyViG beats existing ViG, CNN, and ViT architectures in terms of accuracy, GMACs, and parameters on image classification, object detection, instance segmentation, and semantic segmentation tasks. Our smallest model, GreedyViG-S, achieves 81.1% top-1 accuracy on ImageNet-1K, 2.9% higher than Vision GNN and 2.2% higher than Vision HyperGraph Neural Network (ViHGNN), with fewer GMACs and a similar number of parameters. Our largest model, GreedyViG-B, obtains 83.9% top-1 accuracy, 0.2% higher than Vision GNN, with a 66.6% decrease in parameters and a 69% decrease in GMACs. GreedyViG-B also obtains the same accuracy as ViHGNN with a 67.3% decrease in parameters and a 71.3% decrease in GMACs. Our work shows that hybrid CNN-GNN architectures not only provide a new avenue for designing efficient models, but that they can also exceed the performance of current state-of-the-art models.

Keywords: Deep Learning Efficient computer vision Graph Neural Networks

Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Khan, Zaid; Fu, Yun

Affiliations: Northeastern Univ Boston MA 02115 USA

ISBN: (print) 9798350353006

The goal of selective prediction is to allow a model to abstain when it may not be able to deliver a reliable prediction, which is important in safety-critical contexts. Existing approaches to selective prediction typically require access to the internals of a model, require retraining a model, or study only unimodal models. However, the most powerful models (e.g. GPT-4) are typically only available as black boxes with inaccessible internals, are not retrainable by end-users, and are frequently used for multimodal tasks. We study the possibility of selective prediction for vision-language models in a realistic, black-box setting. We propose using the principle of neighborhood consistency to identify unreliable responses from a black-box vision-language model in question answering tasks. We hypothesize that given only a visual question and model response, the consistency of the model's responses over the neighborhood of a visual question will indicate reliability. It is impossible to directly sample neighbors in feature space in a black-box setting. Instead, we show that it is possible to use a smaller proxy model to approximately sample from the neighborhood. We find that neighborhood consistency can be used to identify model responses to visual questions that are likely unreliable, even in adversarial settings or settings that are out-of-distribution to the proxy model.

Keywords: predictive uncertainty selective prediction trustworthy ml vision-language visual question answering

Building Vision-Language Models on Solid Foundations with Masked Distillation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Sameni, Sepehr; Kafle, Kushal; Tan, Hao; Jenni, Simon

Affiliations: Univ Bern Bern Switzerland; Adobe Res San Jose CA USA

ISBN: (print) 9798350353006

Recent advancements in Vision-Language Models (VLMs) have marked a significant leap in bridging the gap between computer vision and natural language processing. However, traditional VLMs, trained through contrastive learning on limited and noisy image-text pairs, often lack the spatial and linguistic understanding to generalize well to dense vision tasks or less common languages. Our approach, Solid Foundation CLIP (SF-CLIP), circumvents this issue by implicitly building on the solid visual and language understanding of foundational models trained on vast amounts of unimodal data. SF-CLIP integrates contrastive image-text pretraining with a masked knowledge distillation from large foundational text and vision models. This methodology guides our VLM in developing robust text and image representations. As a result, SF-CLIP shows exceptional zero-shot classification accuracy and enhanced image and text retrieval capabilities, setting a new state of the art for ViT-B/16 trained on YFCC15M and CC12M. Moreover, the dense per-patch supervision enhances our zero-shot and linear probe performance in semantic segmentation tasks. A remarkable aspect of our model is its multilingual proficiency, evidenced by strong retrieval results in multiple languages despite being trained predominantly on English data. We achieve all of these improvements without sacrificing the training efficiency through our selective application of masked distillation and the inheritance of teacher word embeddings.

Keywords: CLIP Distillation LLM Multilingual Multimodal Representation Learning


No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
