Search Results - Inner Mongolia University Library

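Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = institution (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "excluding". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016

(In these examples, 图书馆学 is "library science", 情报学 is "information science", 范并思 is an author name, and 计算机应用与软件 is a journal title.)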
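The field codes and operator rules above can be sketched as a small query builder. This is only an illustration: the helper names `field` and `combine` are hypothetical and are not part of the library's search system; only the syntax (field codes, uppercase operators, one space on each side) comes from the help text.

```python
# Hypothetical helpers for composing expert-search expressions.
# Only the syntax (field codes, uppercase operators, single spaces)
# is taken from the help text; the function names are illustrative.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def field(code: str, value: str) -> str:
    """Return a single CODE=value term after validating the field code."""
    if code not in FIELD_CODES:
        raise ValueError(f"unknown field code: {code!r}")
    return f"{code}={value}"


def combine(operator: str, *terms: str, group: bool = False) -> str:
    """Join terms with an uppercase operator, one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    joined = f" {operator} ".join(terms)
    return f"({joined})" if group else joined


# Rebuild the first search example from the help text.
subjects = combine("OR", field("K", "图书馆学"), field("K", "情报学"), group=True)
query = combine("AND", subjects, field("A", "范并思"), field("Y", "1982-2016"))
print(query)
```

Validating the operator against a fixed uppercase set mirrors the rule that lowercase operators are not accepted.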

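Refine search results

Document type

11,745 records: conference papers
8 records: journal articles

Collection

11,753 records: electronic documents
0 titles: print holdings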

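Discipline classification

8,136 records: Engineering
- 7,671 records: Computer Science and Technology...
- 804 records: Mechanical Engineering
- 577 records: Software Engineering
- 376 records: Electrical Engineering
- 249 records: Control Science and Engineering
- 208 records: Optical Engineering
- 85 records: Bioengineering
- 83 records: Information and Communication Engineering
- 29 records: Biomedical Engineering (may confer...
- 23 records: Electronic Science and Technology (may...
- 21 records: Chemical Engineering and Technology
- 15 records: Transportation Engineering
- 14 records: Safety Science and Engineering
- 10 records: Cyberspace Security
- 8 records: Instrument Science and Technology
- 6 records: Materials Science and Engineering (may...
- 6 records: Power Engineering and Engineering Thermo...
3,191 records: Medicine
- 3,187 records: Clinical Medicine
- 11 records: Basic Medicine (may confer medicine...
- 7 records: Public Health and Preventive Medi...
478 records: Science
- 213 records: Physics
- 203 records: Systems Science
- 88 records: Biology
- 52 records: Mathematics
- 29 records: Statistics (may confer science, ...
- 21 records: Chemistry
55 records: Management
- 29 records: Library, Information and Archives Manage...
- 28 records: Management Science and Engineering (may...
- 12 records: Business Administration
17 records: Law
- 15 records: Sociology
6 records: Agriculture
4 records: Education
2 records: Economics
1 record: Military Science
1 record: Art Studies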

Topic

5,434 records: computer vision
2,516 records: training
2,087 records: pattern recognit...
1,621 records: computational mo...
1,435 records: visualization
1,306 records: three-dimensiona...
1,060 records: semantics
981 records: codes
968 records: benchmark testin...
898 records: computer archite...
884 records: deep learning
762 records: task analysis
681 records: feature extracti...
536 records: face recognition
527 records: conferences
515 records: transformers
515 records: neural networks
479 records: object detection
466 records: image segmentati...
454 records: cameras

Institution

168 records: univ sci & techn...
144 records: univ chinese aca...
144 records: tsinghua univ pe...
143 records: carnegie mellon ...
135 records: chinese univ hon...
112 records: peng cheng lab p...
108 records: zhejiang univ pe...
97 records: swiss fed inst t...
92 records: tsinghua univers...
92 records: sensetime res pe...
88 records: shanghai ai lab ...
85 records: zhejiang univers...
84 records: shanghai jiao to...
78 records: peng cheng labor...
77 records: university of sc...
77 records: alibaba grp peop...
76 records: univ hong kong p...
76 records: tech univ munich...
76 records: stanford univ st...
73 records: university of ch...

Author

76 records: timofte radu
64 records: van gool luc
50 records: zhang lei
44 records: yang yi
40 records: loy chen change
34 records: tao dacheng
32 records: liu yang
32 records: chen chen
30 records: zhou jie
30 records: tian qi
30 records: sun jian
28 records: zha zheng-jun
27 records: qi tian
26 records: li xin
26 records: vasconcelos nuno
26 records: ying shan
25 records: liu xiaoming
25 records: luc van gool
25 records: boxin shi
24 records: zheng wei-shi

Language

11,746 records: English
7 records: other

Search query: "Any field = 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023"

11,753 records in total; showing records 21-30

Learning to Name Classes for Vision and Language Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Parisot, Sarah; Yang, Yongxin; McDonagh, Steven

Affiliations: Huawei Noah's Ark Lab, Montreal, PQ, Canada

ISBN: (print) 9798350301298

Large-scale vision and language models can achieve impressive zero-shot recognition performance by mapping class-specific text queries to image content. Two distinct challenges remain, however: high sensitivity to the choice of handcrafted class names that define queries, and the difficulty of adapting to new, smaller datasets. To address these problems, we propose to leverage available data to learn, for each class, an optimal word embedding as a function of the visual content. By learning new word embeddings on an otherwise frozen model, we are able to retain zero-shot capabilities for new classes, easily adapt models to new datasets, and adjust potentially erroneous, non-descriptive or ambiguous class names. We show that our solution can easily be integrated into image classification and object detection pipelines, yields significant performance gains in multiple scenarios, and provides insights into model biases and labelling errors.

Keywords: vision, language, and reasoning

IDGI: A Framework to Eliminate Explanation Noise from Integrated Gradients

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yang, Ruo; Wang, Binghui; Bilgic, Mustafa

Affiliations: IIT, Dept Comp Sci, Chicago, IL 60616, USA

ISBN: (print) 9798350301298

Integrated Gradients (IG) and its variants are well-known techniques for interpreting the decisions of deep neural networks. While IG-based approaches attain state-of-the-art performance, they often integrate noise into their explanation saliency maps, which reduces their interpretability. To minimize the noise, we examine its source analytically and propose a new approach to reduce the explanation noise based on our analytical findings. We propose the Important Direction Gradient Integration (IDGI) framework, which can be easily incorporated into any IG-based method that uses Riemann integration for integrated gradient computation. Extensive experiments with three IG-based methods show that IDGI improves them drastically on numerous interpretability metrics. The source code for IDGI is available at https://***/yangruo1226/IDGI.

Keywords: Explainable computer vision
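For readers unfamiliar with the Riemann-sum computation the abstract refers to, here is a minimal sketch of plain Integrated Gradients. It is not the IDGI method itself. The toy function `f`, its hand-written gradient `grad_f`, the zero baseline, and the midpoint rule are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], Sequence[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> List[float]:
    """Approximate IG along the straight path baseline -> x with a midpoint Riemann sum."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            totals[i] += g
    # Scale the averaged gradient by the input difference, per feature.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def f(p: Sequence[float]) -> float:
    """Toy model (an assumption): f(p) = p0^2 + 3 * p0 * p1."""
    return p[0] ** 2 + 3.0 * p[0] * p[1]


def grad_f(p: Sequence[float]) -> List[float]:
    """Analytic gradient of the toy model."""
    return [2.0 * p[0] + 3.0 * p[1], 3.0 * p[0]]


x, baseline = [1.0, 2.0], [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
# Completeness check: attributions should sum to f(x) - f(baseline).
print(attributions, sum(attributions), f(x) - f(baseline))
```

The completeness check (attributions summing to the change in output) is a standard sanity test for IG implementations.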

Behavioral Analysis of Vision-and-Language Navigation Agents

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yang, Zijiao; Majumdar, Arjun; Lee, Stefan

Affiliations: Oregon State Univ, Corvallis, OR 97331, USA; Georgia Inst Technol, Atlanta, GA, USA

ISBN: (print) 9798350301298

To be successful, Vision-and-Language Navigation (VLN) agents must be able to ground instructions to actions based on their surroundings. In this work, we develop a methodology to study agent behavior on a skill-specific basis - examining how well existing agents ground instructions about stopping, turning, and moving towards specified objects or rooms. Our approach is based on generating skill-specific interventions and measuring changes in agent predictions. We present a detailed case study analyzing the behavior of a recent agent and then compare multiple agents in terms of skill-specific competency scores. This analysis suggests that biases from training have lasting effects on agent behavior and that existing models are able to ground simple referring expressions. Our comparisons between models show that skill-specific scores correlate with improvements in overall VLN task performance.

Keywords: vision, language, and reasoning

Humans as Light Bulbs: 3D Human Reconstruction from Thermal Reflection

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Liu, Ruoshi; Vondrick, Carl

Affiliations: Columbia Univ, New York, NY 10027, USA

ISBN: (print) 9798350301298

The relatively high temperature of the human body turns people into long-wave infrared light sources. Since this emitted light has a longer wavelength than visible light, many surfaces in typical scenes act as infrared mirrors with strong specular reflections. We exploit the thermal reflections of a person onto objects in order to locate their position and reconstruct their pose, even if they are not visible to a normal camera. We propose an analysis-by-synthesis framework that jointly models the objects, people, and their thermal reflections, combining generative models with differentiable rendering of reflections. Quantitative and qualitative experiments show our approach works in highly challenging cases, such as with curved mirrors or when the person is completely unseen by a normal camera.

Keywords: vision + graphics

Similarity Maps for Self-Training Weakly-Supervised Phrase Grounding

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Shaharabany, Tal; Wolf, Lior

Affiliations: Tel Aviv Univ, Tel Aviv, Israel

ISBN: (print) 9798350301298

A phrase grounding model receives an input image and a text phrase and outputs a suitable localization map. We present an effective way to refine a phrase grounding model by considering self-similarity maps extracted from the latent representation of the model's image encoder. Our main insights are that these maps resemble localization maps and that by combining such maps, one can obtain useful pseudo-labels for performing self-training. Our results surpass, by a large margin, the state of the art in weakly supervised phrase grounding. A similar gap in performance is obtained for a recently proposed downstream task called WWbL, in which only the image is input, without any text. Our code is available at https://***/talshaharabany/Similarity-Maps-forSelf-Training-Weakly-Supervised-Phrase-Grounding.

Keywords: vision, language, and reasoning

Neural Fourier Filter Bank

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wu, Zhijie; Jin, Yuhe; Yi, Kwang Moo

Affiliations: Univ British Columbia, Vancouver, BC, Canada

ISBN: (print) 9798350301298

We present a novel method to provide efficient and highly detailed reconstructions. Inspired by wavelets, we learn a neural field that decomposes the signal both spatially and frequency-wise. We follow the recent grid-based paradigm for spatial decomposition but, unlike existing work, encourage specific frequencies to be stored in each grid via Fourier feature encodings. We then apply a multi-layer perceptron with sine activations, taking these Fourier-encoded features in at appropriate layers so that higher-frequency components are accumulated on top of lower-frequency components sequentially, which we sum up to form the final output. We demonstrate that our method outperforms the state of the art regarding model compactness and convergence speed on multiple tasks: 2D image fitting, 3D shape reconstruction, and neural radiance fields. Our code is available at https://***/ubc-vision/NFFB.

Keywords: vision + graphics
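As a rough illustration of what a Fourier feature encoding looks like, the sketch below encodes a scalar with sine and cosine terms at doubling frequencies. This is the generic encoding commonly used with neural fields, not the grid-specific scheme of the paper. The function name `fourier_features`, the parameter `num_bands`, and the choice of frequencies `2^k * pi` are assumptions.

```python
import math
from typing import List


def fourier_features(x: float, num_bands: int) -> List[float]:
    """Encode scalar x as [sin(2^k*pi*x), cos(2^k*pi*x)] for k = 0..num_bands-1."""
    features: List[float] = []
    for k in range(num_bands):
        angle = (2.0 ** k) * math.pi * x  # frequency doubles with each band
        features.extend([math.sin(angle), math.cos(angle)])
    return features


encoded = fourier_features(0.25, num_bands=3)
print(len(encoded))  # two features (sin, cos) per band
```

Higher bands oscillate faster, which is why such encodings help a network represent fine detail.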

RGB no more: Minimally-decoded JPEG Vision Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Park, Jeongsoo; Johnson, Justin

Affiliations: Univ Michigan, Ann Arbor, MI 48109, USA

ISBN: (print) 9798350301298

Most neural networks for computer vision are designed to infer using RGB images. However, these RGB images are commonly encoded in JPEG before saving to disk; decoding them imposes an unavoidable overhead for RGB networks. Instead, our work focuses on training Vision Transformers (ViT) directly from the encoded features of JPEG. This way, we can avoid most of the decoding overhead, accelerating data loading. Existing works have studied this aspect, but they focus on CNNs. Due to how these encoded features are structured, CNNs require heavy modification to their architecture to accept such data. Here, we show that this is not the case for ViTs. In addition, we tackle data augmentation directly on these encoded features, which, to our knowledge, has not been explored in depth for training in this setting. With these two improvements - ViT and data augmentation - we show that our ViT-Ti model achieves up to 39.2% faster training and 17.9% faster inference with no accuracy loss compared to the RGB counterpart.

Keywords: vision, language, and reasoning

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Inoue, Naoto; Kikuchi, Kotaro; Simo-Serra, Edgar; Otani, Mayu; Yamaguchi, Kota

Affiliations: CyberAgent, Tokyo, Japan; Waseda Univ, Tokyo, Japan

ISBN: (print) 9798350301298

Controllable layout generation aims at synthesizing plausible arrangements of element bounding boxes with optional constraints, such as the type or position of a specific element. In this work, we try to solve a broad range of layout generation tasks in a single model that is based on discrete state-space diffusion models. Our model, named LayoutDM, naturally handles the structured layout data in the discrete representation and learns to progressively infer a noiseless layout from the initial input, where we model the layout corruption process by modality-wise discrete diffusion. For conditional generation, we propose to inject layout constraints in the form of masking or logit adjustment during inference. We show in the experiments that our LayoutDM successfully generates high-quality layouts and outperforms both task-specific and task-agnostic baselines on several layout tasks.

Keywords: vision + graphics

Plateau-reduced Differentiable Path Tracing

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Fischer, Michael; Ritschel, Tobias

Affiliations: UCL, London, England

ISBN: (print) 9798350301298

Current differentiable renderers provide light transport gradients with respect to arbitrary scene parameters. However, the mere existence of these gradients does not guarantee useful update steps in an optimization. Instead, inverse rendering might not converge due to inherent plateaus, i.e., regions of zero gradient, in the objective function. We propose to alleviate this by convolving the high-dimensional rendering function, which maps scene parameters to images, with an additional kernel that blurs the parameter space. We describe two Monte Carlo estimators to compute plateau-reduced gradients efficiently, i.e., with low variance, and show that these translate into net gains in optimization error and runtime performance. Our approach is a straightforward extension to both black-box and differentiable renderers and enables optimization of problems with intricate light transport, such as caustics or global illumination, that existing differentiable renderers do not converge on. Our code is at ***/mfischerucl/prdpt.

Keywords: vision + graphics
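The idea of blurring the parameter space to remove plateaus can be illustrated in one dimension. The sketch below is not one of the paper's two estimators: it uses a generic Gaussian-smoothing (score-function) estimator with antithetic samples. The step-shaped `step_loss`, the kernel width `sigma`, and all names are assumptions chosen so that the result can be checked against a closed form.

```python
import math
import random
from typing import Callable


def step_loss(theta: float) -> float:
    """Toy objective made of plateaus: its derivative is zero wherever it exists."""
    return 1.0 if theta < 0.0 else 0.0


def smoothed_gradient(
    loss: Callable[[float], float],
    theta: float,
    sigma: float = 1.0,
    samples: int = 20000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of d/dtheta E[loss(theta + eps)], eps ~ N(0, sigma^2).

    Antithetic pairs (+eps, -eps) are used to reduce variance.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        eps = rng.gauss(0.0, sigma)
        total += (loss(theta + eps) - loss(theta - eps)) * eps
    return total / (2.0 * sigma ** 2 * samples)


estimate = smoothed_gradient(step_loss, theta=0.5)
# Closed form for this toy case with sigma = 1: minus the standard normal pdf at theta.
exact = -math.exp(-0.5 * 0.5 ** 2) / math.sqrt(2.0 * math.pi)
print(estimate, exact)
```

At `theta = 0.5` the unsmoothed loss has zero gradient, while the smoothed estimate is clearly negative, so a gradient step would move `theta` in the positive direction, away from the step at zero where the loss is higher.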

Visual Programming: Compositional visual reasoning without training

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Gupta, Tanmay; Kembhavi, Aniruddha

Affiliations: PRIOR, Allen Inst AI, Seattle, WA 98103, USA

ISBN: (print) 9798350301298

We present VISPROG, a neuro-symbolic approach to solving complex and compositional visual tasks given natural language instructions. VISPROG avoids the need for any task-specific training. Instead, it uses the in-context learning ability of large language models to generate Python-like modular programs, which are then executed to get both the solution and a comprehensive and interpretable rationale. Each line of the generated program may invoke one of several off-the-shelf computer vision models, image processing subroutines, or Python functions to produce intermediate outputs that may be consumed by subsequent parts of the program. We demonstrate the flexibility of VISPROG on 4 diverse tasks - compositional visual question answering, zero-shot reasoning on image pairs, factual knowledge object tagging, and language-guided image editing. We believe neuro-symbolic approaches like VISPROG are an exciting avenue to easily and effectively expand the scope of AI systems to serve the long tail of complex tasks that people may wish to perform.

Keywords: vision, language, and reasoning

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
