Search Results - Inner Mongolia University Library

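Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016 (subject term "library science" or "information science", author Fan Bingsi, years 1982 to 2016)
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016 (publication "Computer Applications and Software", any field containing C++ or Basic, subject term not containing Visual, years 2011 to 2016)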
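The field codes and operator rules above can be sketched as a small query builder. This is only an illustration of the syntax described in the help text: the helper names `term`, `combine`, and `group` are hypothetical, and nothing here is an API of the library's search system. The sketch rebuilds Example 1 as a string.

```python
# Hypothetical helpers for composing an advanced-search expression.
# Field codes and operators are taken from the help text; the function
# names are assumptions made for illustration only.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(op: str, *parts: str) -> str:
    """Join sub-expressions with an uppercase operator, one space on each side."""
    if op not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {op} ".join(parts)


def group(expr: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expr})"


# Rebuild Example 1 from the help text.
query = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Validating the field code and operator up front is a design choice in this sketch, since the help text says that operators must be uppercase and separated by single spaces.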


Search condition: "Any field=1994 IEEE Computer-Society Conference on Computer Vision and Pattern Recognition"

22,907 records in total; showing records 211-220


Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Verbin, Dor; Mildenhall, Ben; Hedman, Peter; Barron, Jonathan T.; Zickler, Todd; Srinivasan, Pratul P.
Affiliations: Google Res, Mountain View, CA 94043 USA; Harvard Univ, Cambridge, MA USA

ISBN: (print) 9798350353013; 9798350353006

Decomposing an object's appearance into representations of its materials and the surrounding illumination is difficult, even when the object's 3D shape is known beforehand. This problem is especially challenging for diffuse objects: it is ill-conditioned because diffuse materials severely blur incoming light, and it is ill-posed because diffuse materials under high-frequency lighting can be indistinguishable from shiny materials under low-frequency lighting. We show that it is possible to recover precise materials and illumination, even from diffuse objects, by exploiting unintended shadows, like the ones cast onto an object by the photographer who moves around it. These shadows are a nuisance in most previous inverse rendering pipelines, but here we exploit them as signals that improve conditioning and help resolve material-lighting ambiguities. We present a method based on differentiable Monte Carlo ray tracing that uses images of an object to jointly recover its spatially varying materials, the surrounding illumination environment, and the shapes of the unseen light occluders who inadvertently cast shadows upon it.

Keywords: Computer Graphics; Inverse Rendering; Non-line-of-sight Imaging

VicTR: Video-conditioned Text Representations for Activity Recognition

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kahatapitiya, Kumara; Arnab, Anurag; Nagrani, Arsha; Ryoo, Michael S.
Affiliations: SUNY Stony Brook, Stony Brook, NY 11794 USA; Google Res, Mountain View, CA USA

ISBN: (print) 9798350353006

Vision-Language models (VLMs) have excelled in the image-domain, especially in zero-shot settings, thanks to the availability of vast pretraining data (i.e., paired image-text samples). However, for videos, such paired data is not as abundant. Therefore, video-VLMs are usually designed by adapting pretrained image-VLMs to the video-domain, instead of training from scratch. All such recipes rely on augmenting visual embeddings with temporal information (i.e., image → video), often keeping text embeddings unchanged or even discarding them. In this paper, we argue the contrary, that better video-VLMs can be designed by focusing more on augmenting text, rather than visual information. More specifically, we introduce Video-conditioned Text Representations (VicTR): a form of text embeddings optimized w.r.t. visual embeddings, creating a more-flexible contrastive latent space. Our model can further make use of freely-available semantic information, in the form of visually-grounded auxiliary text (e.g. object or scene information). We evaluate our model on few-shot, zero-shot (HMDB-51, UCF-101), short-form (Kinetics-400) and long-form (Charades) activity recognition benchmarks, showing strong performance among video-VLMs.

Keywords: Activity Recognition; Video Understanding; Video-conditioned Text; Vision-language models

ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Weiyao; Gleize, Pierre; Tang, Hao; Chen, Xingyu; Liang, Kevin J.; Feiszli, Matt
Affiliation: Meta FAIR, Menlo Pk, CA 94025 USA

ISBN: (print) 9798350353013; 9798350353006

Neural Radiance Fields (NeRF) exhibit remarkable performance for Novel View Synthesis (NVS) given a set of 2D images. However, NeRF training requires accurate camera pose for each input view, typically obtained by Structure-from-Motion (SfM) pipelines. Recent works have attempted to relax this constraint, but they still often rely on decent initial poses which they can refine. Here we aim at removing the requirement for pose initialization. We present Incremental CONfidence (ICON), an optimization procedure for training NeRFs from 2D video frames. ICON only assumes smooth camera motion to estimate initial guess for poses. Further, ICON introduces "confidence": an adaptive measure of model quality used to dynamically reweight gradients. ICON relies on high-confidence poses to learn NeRF, and high-confidence 3D structure (as encoded by NeRF) to learn poses. We show that ICON, without prior pose initialization, achieves superior performance in both CO3D and HO3D versus methods which use SfM pose.

Keywords: Three dimensional computer graphics

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Li, Rui; Fischer, Tobias; Segu, Mattia; Pollefeys, Marc; Van Gool, Luc; Tombari, Federico
Affiliations: Swiss Fed Inst Technol, Zurich, Switzerland; Google, Mountain View, CA 94043 USA; Tech Univ Munich, Munich, Germany

ISBN: (print) 9798350353006

Recovering the 3D scene geometry from a single view is a fundamental yet ill-posed problem in computer vision. While classical depth estimation methods infer only a 2.5D scene representation limited to the image plane, recent approaches based on radiance fields reconstruct a full 3D representation. However, these methods still struggle with occluded regions since inferring geometry without visual observation requires (i) semantic knowledge of the surroundings, and (ii) reasoning about spatial context. We propose KYN, a novel method for single-view scene reconstruction that reasons about semantic and spatial context to predict each point's density. We introduce a vision-language modulation module to enrich point features with fine-grained semantic information. We aggregate point representations across the scene through a language-guided spatial attention mechanism to yield per-point density predictions aware of the 3D semantic context. We show that KYN improves 3D shape recovery compared to predicting density for each 3D point in isolation. We achieve state-of-the-art results in scene and object reconstruction on KITTI-360, and show improved zero-shot generalization compared to prior work. Project page: https://***/kyn.

Keywords: radiance field; spatial attention; spatial context; vision-language model

Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Khan, Zaid; Fu, Yun
Affiliation: Northeastern Univ, Boston, MA 02115 USA

ISBN: (print) 9798350353006

The goal of selective prediction is to allow a model to abstain when it may not be able to deliver a reliable prediction, which is important in safety-critical contexts. Existing approaches to selective prediction typically require access to the internals of a model, require retraining a model, or study only unimodal models. However, the most powerful models (e.g. GPT-4) are typically only available as black boxes with inaccessible internals, are not retrainable by end-users, and are frequently used for multimodal tasks. We study the possibility of selective prediction for vision-language models in a realistic, black-box setting. We propose using the principle of neighborhood consistency to identify unreliable responses from a black-box vision-language model in question answering tasks. We hypothesize that given only a visual question and model response, the consistency of the model's responses over the neighborhood of a visual question will indicate reliability. It is impossible to directly sample neighbors in feature space in a black-box setting. Instead, we show that it is possible to use a smaller proxy model to approximately sample from the neighborhood. We find that neighborhood consistency can be used to identify model responses to visual questions that are likely unreliable, even in adversarial settings or settings that are out-of-distribution to the proxy model.

Keywords: predictive uncertainty; selective prediction; trustworthy ML; vision-language; visual question answering

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Li, Rongjie; Wu, Yu; He, Xuming
Affiliations: ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China; Shanghai Engn Res Ctr Intelligent Vis & Imaging, Shanghai, Peoples R China

ISBN: (print) 9798350353006

Generative vision-language models (VLMs) have shown impressive performance in zero-shot vision-language tasks like image captioning and visual question answering. However, improving their zero-shot reasoning typically requires second-stage instruction tuning, which relies heavily on human-labeled or large language model-generated annotation, incurring high labeling costs. To tackle this challenge, we introduce Image-Conditioned Caption Correction (ICCC), a novel pre-training task designed to enhance VLMs' zero-shot performance without the need for labeled task-aware data. The ICCC task compels VLMs to rectify mismatches between visual and language concepts, thereby enhancing instruction following and text generation conditioned on visual inputs. Leveraging language structure and a lightweight dependency parser, we construct data samples of ICCC task from image-text datasets with low labeling and computation costs. Experimental results on BLIP2 and InstructBLIP demonstrate significant improvements in zero-shot image-text generation-based VL tasks through ICCC instruction tuning.

Keywords: Multimodal Reasoning; Vision-Language

Neural Refinement for Absolute Pose Regression with Feature Synthesis

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Chen, Shuai; Bhalgat, Yash; Li, Xinghui; Bin, Jia-Wang; Li, Kejie; Wang, Zirui; Prisacariu, Victor Adrian
Affiliations: Univ Oxford, Act Vision Lab, Oxford, England; Univ Oxford, Visual Geometry Grp, Oxford, England

ISBN: (print) 9798350353006

Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images. However, the predominant APR architectures only rely on 2D operations during inference, resulting in limited accuracy of pose estimation due to the lack of 3D geometry constraints or priors. In this work, we propose a test-time refinement pipeline that leverages implicit geometric constraints using a robust feature field to enhance the ability of APR methods to use 3D information during inference. We also introduce a novel Neural Feature Synthesizer (NeFeS) model, which encodes 3D geometric features during training and directly renders dense novel view features at test time to refine APR methods. To enhance the robustness of our model, we introduce a feature fusion module and a progressive training strategy. Our proposed method achieves state-of-the-art single-image APR accuracy on indoor and outdoor datasets. Code will be released at https://***/ActivevisionLab/NeFeS.

Keywords: Feature Distillation; Neural Radiance Field; Pose Regression; Test-time Refinement; Visual Re-Localization

Flow-Guided Online Stereo Rectification for Wide Baseline Stereo

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kumar, Anush; Mannan, Fahim; Jafari, Omid Hosseini; Li, Shile; Heider, Felix
Affiliations: Torc Robot, Blacksburg, VA 24060 USA; Princeton Univ, Princeton, NJ 08544 USA

ISBN: (print) 9798350353006

Stereo rectification is widely considered "solved" due to the abundance of traditional approaches to perform rectification. However, autonomous vehicles and robots in-the-wild require constant re-calibration due to exposure to various environmental factors, including vibration and structural stress, when cameras are arranged in a wide-baseline configuration. Conventional rectification methods fail in these challenging scenarios: especially for larger vehicles, such as autonomous freight trucks and semi-trucks, the resulting incorrect rectification severely affects the quality of downstream tasks that use stereo/multi-view data. To tackle these challenges, we propose an online rectification approach that operates at real-time rates while achieving high accuracy. We propose a novel learning-based online calibration approach that utilizes stereo correlation volumes built from a feature representation obtained from cross-image attention. Our model is trained to minimize vertical optical flow as a proxy rectification constraint, and predicts the relative rotation between the stereo pair. The method is real-time and even outperforms conventional methods used for offline calibration, and substantially improves downstream stereo depth, post-rectification. We release two public datasets (https://***/online-stereo-recification/), a synthetic and experimental wide baseline dataset, to foster further research.

Keywords: Autonomous Driving; Camera Pose Estimation; Computer Vision; Rectification Datasets; Stereo Rectification; Wide Baseline Stereo

A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Peirone, Simone Alberto; Pistilli, Francesca; Alliegro, Antonio; Averta, Giuseppe
Affiliations: Politecnico Torino, Turin, Italy; Ist Italiano Tecnol, Genoa, Italy

ISBN: (print) 9798350353006

Human comprehension of a video stream is naturally broad: in a few instants, we are able to understand what is happening, the relevance and relationship of objects, and forecast what will follow in the near future, everything all at once. We believe that, to effectively transfer such a holistic perception to intelligent machines, an important role is played by learning to correlate concepts and to abstract knowledge coming from different tasks, to synergistically exploit them when learning novel skills. To accomplish this, we look for a unified approach to video understanding which combines shared temporal modelling of human actions with minimal overhead, to support multiple downstream tasks and enable cooperation when learning novel skills. We then propose EgoPack, a solution that creates a collection of task perspectives that can be carried across downstream tasks and used as a potential source of additional insights, as a backpack of skills that a robot can carry around and use when needed. We demonstrate the effectiveness and efficiency of our approach on four Ego4D benchmarks, outperforming current state-of-the-art methods. Project webpage: ***/EgoPack.

Keywords: Egocentric Vision; Video Understanding

UnScene3D: Unsupervised 3D Instance Segmentation for Indoor Scenes

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Rozenberszki, David; Litany, Or; Dai, Angela
Affiliations: Tech Univ Munich, Munich, Germany; Technion, Haifa, Israel; NVIDIA, Santa Clara, CA USA

ISBN: (print) 9798350353006

3D instance segmentation is fundamental to geometric understanding of the world around us. Existing methods for instance segmentation of 3D scenes rely on supervision from expensive, manual 3D annotations. We propose UnScene3D, the first fully unsupervised 3D learning approach for class-agnostic 3D instance segmentation of indoor scans. UnScene3D first generates pseudo masks by leveraging self-supervised color and geometry features to find potential object regions. We operate on a basis of geometric oversegmentation, enabling efficient representation and learning on high-resolution 3D data. The coarse proposals are then refined through self-training our model on its predictions. Our approach improves over clustering-based alternatives to unsupervised 3D instance segmentation methods by more than 300% Average Precision score, demonstrating effective instance segmentation even in challenging, cluttered 3D scenes.

Keywords: 3D Computer Vision; 3D Instance Segmentation; Graph-cuts; Scene Understanding; Self-training; Unsupervised Instance Segmentation; Unsupervised Learning


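Inner Mongolia University Library, 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region; postal code: 010021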
