Search Results - Inner Mongolia University Library

Document type

12,844 conference papers
13 journal articles
2 books

Collection scope

12,859 electronic documents
0 print holdings

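Discipline classification

7,573 records: Engineering
- 6,863 records: Computer Science and Technology...
- 880 records: Mechanical Engineering
- 814 records: Software Engineering
- 435 records: Control Science and Engineering
- 360 records: Optical Engineering
- 306 records: Electrical Engineering
- 209 records: Instrument Science and Technology
- 124 records: Information and Communication Engineering
- 91 records: Bioengineering
- 62 records: Biomedical Engineering (degrees conferrable in...
- 39 records: Electronic Science and Technology (degrees...
- 34 records: Safety Science and Engineering
- 26 records: Chemical Engineering and Technology
- 21 records: Transportation Engineering
- 20 records: Architecture
- 18 records: Civil Engineering
2,957 records: Medicine
- 2,956 records: Clinical Medicine
- 15 records: Basic Medicine (degrees conferrable in medicine...
- 12 records: Pharmacy (degrees conferrable in medicine, science...
700 records: Science
- 359 records: Physics
- 225 records: Mathematics
- 175 records: Systems Science
- 95 records: Statistics (degrees conferrable in science,...
- 93 records: Biology
- 22 records: Chemistry
201 records: Art Studies
- 201 records: Design (degrees conferrable in art studies...
84 records: Management
- 59 records: Library, Information and Archives Manage...
- 25 records: Management Science and Engineering (degrees...
- 14 records: Business Administration
23 records: Law
- 21 records: Sociology
5 records: Agriculture
4 records: Education
2 records: Economics
1 record: Military Science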

Subject

6,464 records: computer vision
2,688 records: training
2,437 records: pattern recognit...
1,780 records: computational mo...
1,522 records: visualization
1,348 records: three-dimensiona...
1,091 records: computer archite...
1,063 records: semantics
997 records: benchmark testin...
976 records: codes
970 records: conferences
854 records: feature extracti...
830 records: cameras
771 records: task analysis
707 records: deep learning
646 records: image segmentati...
611 records: object detection
595 records: shape
554 records: transformers
538 records: neural networks

Institution

132 records: univ sci & techn...
122 records: carnegie mellon ...
120 records: tsinghua univ pe...
114 records: univ chinese aca...
113 records: chinese univ hon...
94 records: tsinghua univers...
91 records: zhejiang univ pe...
91 records: swiss fed inst t...
85 records: peng cheng lab p...
81 records: university of ch...
80 records: zhejiang univers...
77 records: shanghai ai lab ...
77 records: peng cheng labor...
75 records: university of sc...
69 records: shanghai jiao to...
68 records: shanghai jiao to...
67 records: alibaba grp peop...
67 records: stanford univ st...
66 records: univ hong kong p...
64 records: sensetime res pe...

Author

77 records: timofte radu
63 records: van gool luc
45 records: zhang lei
36 records: yang yi
36 records: luc van gool
34 records: tao dacheng
31 records: loy chen change
29 records: chen chen
28 records: sun jian
28 records: qi tian
25 records: li xin
24 records: liu yang
24 records: tian qi
24 records: ying shan
23 records: wang xinchao
23 records: zha zheng-jun
23 records: boxin shi
21 records: zhou jie
21 records: vasconcelos nuno
20 records: luo ping

Language

12,851 records: English
7 records: Other
1 record: Chinese

Search query: any field = "IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops"

12,859 records in total; showing records 161-170

Vision-and-Language Navigation via Causal Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Liuyi; He, Zongtao; Dang, Ronghao; Shen, Mengjiao; Liu, Chengju; Chen, Qijun

Affiliations: Tongji Univ Sch Elect & Informat Engn Shanghai Peoples R China

ISBN: (print) 9798350353006

In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments. This paper introduces the generalized cross-modal causal transformer (GOAT), a pioneering solution rooted in the paradigm of causal inference. By delving into both observable and unobservable confounders within vision, language, and history, we propose the back-door and front-door adjustment causal learning (BACL and FACL) modules to promote unbiased learning by comprehensively mitigating potential spurious correlations. Additionally, to capture global confounder features, we propose a cross-modal feature pooling (CFP) module supervised by contrastive learning, which is also shown to be effective in improving cross-modal representations during pre-training. Extensive experiments across multiple VLN datasets (R2R, REVERIE, RxR, and SOON) underscore the superiority of our proposed method over previous state-of-the-art approaches. Code is available at https://***/CrystalSixone/VLN-GOAT.

Keywords: causal learning; cross-modal fusion; embodied AI; vision-and-language; vision-and-language navigation

Hybrid Functional Maps for Crease-Aware Non-Isometric Shape Matching

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Bastian, Lennart; Xie, Yizheng; Navab, Nassir; Laehner, Zorah

Affiliations: Tech Univ Munich Munich Germany; Univ Siegen Siegen Germany; Univ Bonn Bonn Germany; Lamarr Inst Bonn Germany

ISBN: (print) 9798350353013; 9798350353006

Non-isometric shape correspondence remains a fundamental challenge in computer vision. Traditional methods using Laplace-Beltrami operator (LBO) eigenmodes face limitations in characterizing high-frequency extrinsic shape changes like bending and creases. We propose a novel approach of combining the non-orthogonal extrinsic basis of eigenfunctions of the elastic thin-shell hessian with the intrinsic ones of the LBO, creating a hybrid spectral space in which we construct functional maps. To this end, we present a theoretical framework to effectively integrate non-orthogonal basis functions into descriptor- and learning-based functional map methods. Our approach can be incorporated easily into existing functional map pipelines across varying applications and can handle complex deformations beyond isometries. We show extensive evaluations across various supervised and unsupervised settings and demonstrate significant improvements. Notably, our approach achieves up to 15% better mean geodesic error for non-isometric correspondence settings and up to 45% improvement in scenarios with topological noise. Code is available at: https://***/

Keywords: Computer Vision; Functional Maps; Non-isometric Shape Correspondence; Shape Matching; Topological Noise

De-Diffusion Makes Text a Strong Cross-Modal Interface

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wei, Chen; Liu, Chenxi; Qi, Siyuan; Zhang, Zhishuai; Yuille, Alan; Yu, Jiahui

Affiliations: Google DeepMind London England; Johns Hopkins Univ Baltimore MD 21218 USA

ISBN: (print) 9798350353006

We demonstrate text as a strong cross-modal interface. Rather than relying on deep embeddings to connect image and language as the interface representation, our approach represents an image as text, from which we enjoy the interpretability and flexibility inherent to natural language. We employ an autoencoder that uses a pre-trained text-to-image diffusion model for decoding. The encoder is trained to transform an input image into text, which is then fed into the fixed text-to-image diffusion decoder to reconstruct the original input, a process we term De-Diffusion. Experiments validate both the precision and comprehensiveness of De-Diffusion text representing images, such that it can be readily ingested by off-the-shelf text-to-image tools and LLMs for diverse multi-modal tasks. For example, a single De-Diffusion model can generalize to provide transferable prompts for different text-to-image tools, and also achieves a new state of the art on open-ended vision-language tasks by simply prompting large language models with few-shot examples. Project page: ***.

Keywords: Diffusion; Generative Model; Vision and Language

You Only Need Less Attention at Each Stage in Vision Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Shuoxi; Liu, Hanpeng; Lin, Stephen; He, Kun

Affiliations: Huazhong Univ Sci & Technol Wuhan Peoples R China; Microsoft Res Asia Beijing Peoples R China

ISBN: (print) 9798350353013; 9798350353006

The advent of Vision Transformers (ViTs) marks a substantial paradigm shift in the realm of computer vision. ViTs capture the global information of images through self-attention modules, which perform dot product computations among patchified image tokens. While self-attention modules empower ViTs to capture long-range dependencies, the computational complexity grows quadratically with the number of tokens, which is a major hindrance to the practical application of ViTs. Moreover, the self-attention mechanism in deep ViTs is also susceptible to the attention saturation issue. Accordingly, we argue against the necessity of computing the attention scores in every layer, and we propose the Less-Attention Vision Transformer (LaViT), which computes only a few attention operations at each stage and calculates the subsequent feature alignments in other layers via attention transformations that leverage the previously calculated attention scores. This novel approach can mitigate two primary issues plaguing traditional self-attention modules: the heavy computational burden and attention saturation. Our proposed architecture offers superior efficiency and ease of implementation, merely requiring matrix multiplications that are highly optimized in contemporary deep learning frameworks. Moreover, our architecture demonstrates exceptional performance across various vision tasks including classification, detection and segmentation.

Keywords: computer vision; efficient training; vision transformer

A Theory of Joint Light and Heat Transport for Lambertian Scenes

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ramanagopal, Mani; Narayanan, Sriram; Sankaranarayanan, Aswin C.; Narasimhan, Srinivasa G.

Affiliations: Carnegie Mellon Univ Pittsburgh PA 15213 USA

ISBN: (print) 9798350353006

We present a novel theory that establishes the relationship between light transport in visible and thermal infrared, and heat transport in solids. We show that heat generated due to light absorption can be estimated by modeling heat transport using a thermal camera. For situations where heat conduction is negligible, we analytically solve the heat transport equation to derive a simple expression relating the change in thermal image intensity to the absorbed light intensity and heat capacity of the material. Next, we prove that intrinsic image decomposition for Lambertian scenes becomes a well-posed problem if one has access to the absorbed light. Our theory generalizes to arbitrary shapes and unstructured illumination. Our theory is based on applying energy conservation principle at each pixel independently. We validate our theory using real-world experiments on diffuse objects made of different materials that exhibit both direct and global components (inter-reflections) of light transport under unknown complex lighting.

Keywords: heat transport; intrinsic image decomposition; light transport; physics-based vision

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Cui, Jiequan; Zhu, Beier; Wen, Xin; Qi, Xiaojuan; Yu, Bei; Zhang, Hanwang

Affiliations: Nanyang Technol Univ Singapore Singapore; Univ Hong Kong Hong Kong Peoples R China; Chinese Univ Hong Kong Hong Kong Peoples R China

ISBN: (print) 9798350353006

In this paper, we present an empirical study on image recognition unfairness, i.e., extreme class accuracy disparity on balanced data like ImageNet. We demonstrate that classes are not equal and that unfairness is prevalent for image classification models across various datasets, network architectures, and model capacities. Moreover, several intriguing properties of fairness are identified. First, the unfairness lies in problematic representation rather than classifier bias, which distinguishes it from long-tailed recognition. Second, with the proposed concept of Model Prediction Bias, we investigate the origins of problematic representation during training optimization. Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize. This means that more other classes will be confused with harder classes. The False Positives (FPs) will then dominate the learning in optimization, leading to their poor accuracy. Further, we conclude that data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.

Keywords: Fairness; Long-tailed Recognition; Representation Learning; Vision-Language Models

DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhou, Xiaoyu; Lin, Zhiwei; Shan, Xiaojun; Wang, Yongtao; Sun, Deqing; Yang, Ming-Hsuan

Affiliations: Peking Univ Wangxuan Inst Comp Technol Beijing Peoples R China; Google Res Mountain View CA USA; Univ Calif Merced Merced CA USA

ISBN: (print) 9798350353006

We present DrivingGaussian, an efficient and effective framework for surrounding dynamic autonomous driving scenes. For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene with incremental static 3D Gaussians. We then leverage a composite dynamic Gaussian graph to handle multiple moving objects, individually reconstructing each object and restoring their accurate positions and occlusion relationships within the scene. We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes with greater details and maintain panoramic consistency. DrivingGaussian outperforms existing methods in dynamic driving scene reconstruction and enables photorealistic surround-view synthesis with high-fidelity and multi-camera consistency. Our project page is at: https://***/VDIGPKU/DrivingGaussian.

Keywords: 3D Reconstruction; 3D Vision; Autonomous Driving

Cinematic Behavior Transfer via NeRF-based Differentiable Filming

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jiang, Xuekun; Rao, Anyi; Wang, Jingbo; Lin, Dahua; Dai, Bo

Affiliations: Shanghai AI Lab Shanghai Peoples R China; Stanford Univ Stanford CA 94305 USA; Chinese Univ Hong Kong Hong Kong Peoples R China

ISBN: (print) 9798350353013; 9798350353006

In the evolving landscape of digital media and video production, the precise manipulation and reproduction of visual elements like camera movements and character actions are highly desired. Existing SLAM methods face limitations in dynamic scenes and human pose estimation often focuses on 2D projections, neglecting 3D statuses. To address these issues, we first introduce a reverse filming behavior estimation technique. It optimizes camera trajectories by leveraging NeRF as a differentiable renderer and refining SMPL tracks. We then introduce a cinematic transfer pipeline that is able to transfer various shot types to a new 2D video or a 3D virtual environment. The incorporation of 3D engine workflow enables superior rendering and control abilities, which also achieves a higher rating in the user study.

Keywords: Three dimensional computer graphics

Spectral Transfer Guided Active Domain Adaptation For Thermal Imagery

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Ustun, Berkcan; Kaya, Ahmet Kagan; Ayerden, Ezgi Cakir; Altinel, Fazil

Affiliations: Aselsan Inc Res Ctr Yenimahalle Turkiye; Middle East Tech Univ Dept Elect & Elect Engn Ankara Turkiye

ISBN: (print) 9798350302493

The exploitation of visible spectrum datasets has led deep networks to show remarkable success. However, real-world tasks include low-lighting conditions, which create performance bottlenecks for models trained on large-scale RGB image datasets. Thermal IR cameras are more robust against such conditions. Therefore, the usage of thermal imagery in real-world applications can be useful. Unsupervised domain adaptation (UDA) allows transferring information from a source domain to a fully unlabeled target domain. Despite substantial improvements in UDA, the performance gap between UDA and its supervised learning counterpart remains significant. By picking a small number of target samples to annotate and using them in training, active domain adaptation tries to mitigate this gap with minimum annotation expense. We propose an active domain adaptation method in order to examine the efficiency of combining the visible spectrum and thermal imagery modalities. When the domain gap is considerably large, as in the visible-to-thermal task, we may conclude that the methods without explicit domain alignment cannot achieve their full potential. To this end, we propose a spectral transfer guided active domain adaptation method to select the most informative unlabeled target samples while aligning source and target domains. We used the large-scale visible spectrum dataset MS-COCO as the source domain and the thermal dataset FLIR ADAS as the target domain to present the results of our method. Extensive experimental evaluation demonstrates that our proposed method outperforms the state-of-the-art active domain adaptation methods. The code and models are publicly available.

Keywords: computer vision

On Moving Object Segmentation from Monocular Video with Transformers

IEEE/CVF International Conference on Computer Vision (ICCV)

Authors: Homeyer, Christian; Schnoerr, Christoph

Affiliations: Robert Bosch GmbH Corp Res Comp Vis Lab Hildesheim Heidelberg Germany; Heidelberg Univ Image & Pattern Anal Grp Heidelberg Germany

ISBN: (print) 9798350307443

Moving object detection and segmentation from a single moving camera is a challenging task, requiring an understanding of recognition, motion and 3D geometry. Combining both recognition and reconstruction boils down to a fusion problem, where appearance and motion features need to be combined for classification and segmentation. In this paper, we present a novel fusion architecture for monocular motion segmentation - M3Former, which leverages the strong performance of transformers for segmentation and multi-modal fusion. As reconstructing motion from monocular video is ill-posed, we systematically analyze different 2D and 3D motion representations for this problem and their importance for segmentation performance. Finally, we analyze the effect of training data and show that diverse datasets are required to achieve SotA performance on Kitti and Davis. Code will be released upon publication.

Keywords: 3D dynamic video monocular motion estimation multi modal fusion segmentation transformer

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
