ISBN (print): 9798350353006
The boundless space of neural networks that could be applied to a problem, each with different performance, means that a Deep Learning expert is still required to identify the best network. This goes against the hope of removing the need for experts. Neural Architecture Search (NAS) offers a solution to this by automatically identifying the best architecture. However, to date, NAS work has focused on a small set of datasets which we argue are not representative of real-world problems. We introduce eight new datasets created for a series of NAS Challenges: AddNIST, Language, MultNIST, CIFARTile, Gutenberg, Isabella, GeoClassing, and Chesseract. These datasets and challenges are developed to direct attention to issues in NAS development and to encourage authors to consider how their models will perform on datasets unknown to them at development time. We present experimentation using standard Deep Learning methods as well as the best results from challenge participants.
ISBN (print): 9798350353013; 9798350353006
The advent of vision Transformers (ViTs) marks a substantial paradigm shift in the realm of computer vision. ViTs capture the global information of images through self-attention modules, which perform dot product computations among patchified image tokens. While self-attention modules empower ViTs to capture long-range dependencies, the computational complexity grows quadratically with the number of tokens, which is a major hindrance to the practical application of ViTs. Moreover, the self-attention mechanism in deep ViTs is also susceptible to the attention saturation issue. Accordingly, we argue against the necessity of computing the attention scores in every layer, and we propose the Less-Attention vision Transformer (LaViT), which computes only a few attention operations at each stage and calculates the subsequent feature alignments in other layers via attention transformations that leverage the previously calculated attention scores. This novel approach can mitigate two primary issues plaguing traditional self-attention modules: the heavy computational burden and attention saturation. Our proposed architecture offers superior efficiency and ease of implementation, merely requiring matrix multiplications that are highly optimized in contemporary deep learning frameworks. Moreover, our architecture demonstrates exceptional performance across various vision tasks including classification, detection and segmentation.
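As a rough illustration of the attention-reuse idea described above, the following PyTorch sketch re-aligns an attention map cached from an earlier layer instead of recomputing query-key dot products; the module structure and the linear transform over token scores are our assumptions, not the LaViT implementation.

import torch
import torch.nn as nn

class AttentionReuseBlock(nn.Module):
    # Illustrative only: transforms attention scores cached from an earlier layer
    # and applies them to freshly projected values, avoiding a new Q.K^T computation.
    def __init__(self, dim, num_tokens):
        super().__init__()
        self.value = nn.Linear(dim, dim)                  # value projection
        self.realign = nn.Linear(num_tokens, num_tokens)  # cheap transform of old scores
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, cached_attn):
        # x: (B, N, dim); cached_attn: (B, N, N) softmaxed scores from a previous layer
        attn = torch.softmax(self.realign(cached_attn), dim=-1)
        return x + self.proj(attn @ self.value(x))        # residual update

A model built from such blocks would still compute full self-attention in a few designated layers and pass those score maps to the lighter blocks.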
ISBN (print): 9798350353006
Estimating the 6D object pose from a single RGB image often involves noise and indeterminacy due to challenges such as occlusions and cluttered backgrounds. Meanwhile, diffusion models have shown appealing performance in generating high-quality images from random noise with high indeterminacy through step-by-step denoising. Inspired by their denoising capability, we propose a novel diffusion-based framework (6D-Diff) to handle the noise and indeterminacy in object pose estimation for better performance. In our framework, to establish accurate 2D-3D correspondence, we formulate 2D keypoint detection as a reverse diffusion (denoising) process. To facilitate such a denoising process, we design a Mixture-of-Cauchy-based forward diffusion process and condition the reverse process on the object appearance features. Extensive experiments on the LM-O and YCB-V datasets demonstrate the effectiveness of our framework.
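To make the forward (noising) process concrete, here is a toy sketch of perturbing 2D keypoint coordinates with Mixture-of-Cauchy noise; the mixture parameters, the interpolation schedule, and the function names are illustrative assumptions rather than the 6D-Diff formulation.

import math
import torch

def cauchy_mixture_noise(shape, locs, scales, weights):
    # Sample element-wise from a mixture of Cauchy distributions via inverse-CDF sampling.
    comp = torch.multinomial(weights, math.prod(shape), replacement=True).view(shape)
    u = torch.rand(shape)
    return locs[comp] + scales[comp] * torch.tan(math.pi * (u - 0.5))

def forward_diffuse(keypoints, t, T, locs, scales, weights):
    # Toy noising step: blend clean 2D keypoints toward heavy-tailed mixture noise as t grows.
    alpha = 1.0 - t / T
    noise = cauchy_mixture_noise(keypoints.shape, locs, scales, weights)
    return alpha * keypoints + (1.0 - alpha) * noise

A reverse model would then be trained to denoise such perturbed keypoints back toward the clean 2D-3D correspondences, conditioned on object appearance features.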
ISBN (print): 9798350353006
We demonstrate text as a strong cross-modal interface. Rather than relying on deep embeddings to connect image and language as the interface representation, our approach represents an image as text, from which we enjoy the interpretability and flexibility inherent to natural language. We employ an autoencoder that uses a pre-trained text-to-image diffusion model for decoding. The encoder is trained to transform an input image into text, which is then fed into the fixed text-to-image diffusion decoder to reconstruct the original input - a process we term De-Diffusion. Experiments validate both the precision and comprehensiveness of De-Diffusion text representing images, such that it can be readily ingested by off-the-shelf text-to-image tools and LLMs for diverse multi-modal tasks. For example, a single De-Diffusion model can generalize to provide transferable prompts for different text-to-image tools, and also achieves a new state of the art on open-ended vision-language tasks by simply prompting large language models with few-shot examples. Project page: ***.
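A conceptual sketch of the training setup implied by the abstract follows; image_to_text_encoder and frozen_text_to_image_decoder are hypothetical placeholder modules, and the pixel-space reconstruction loss shown is an assumption (an actual diffusion decoder would be scored through its own denoising objective).

import torch.nn as nn
import torch.nn.functional as F

class TextBottleneckAutoencoder(nn.Module):
    # Illustrative image -> text -> image autoencoder: only the encoder is trained,
    # while the pre-trained text-to-image decoder stays frozen.
    def __init__(self, image_to_text_encoder, frozen_text_to_image_decoder):
        super().__init__()
        self.encoder = image_to_text_encoder
        self.decoder = frozen_text_to_image_decoder
        for p in self.decoder.parameters():
            p.requires_grad_(False)        # the decoder is never updated

    def forward(self, images):
        text = self.encoder(images)        # image -> (soft) text tokens
        recon = self.decoder(text)         # frozen decoder maps text back to pixels
        return F.mse_loss(recon, images)   # illustrative reconstruction objective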
ISBN (print): 9798350353006
Utilizing multi-view inputs to synthesize novel-view images, Neural Radiance Fields (NeRF) have emerged as a popular research topic in 3D vision. In this work, we introduce a Generalizable Semantic Neural Radiance Field (GSNeRF), which uniquely incorporates image semantics into the synthesis process so that both the novel-view image and the associated semantic maps can be produced for unseen scenes. Our GSNeRF is composed of two stages: Semantic GeoReasoning and Depth-Guided Visual Rendering. The former observes multi-view image inputs to extract semantic and geometry features from a scene. Guided by the resulting image geometry information, the latter performs both image and semantic rendering with improved performance. Our experiments not only confirm that GSNeRF performs favorably against prior works on both novel-view image synthesis and semantic segmentation, but also verify the effectiveness of our sampling strategy for visual rendering.
ISBN (print): 9798350353006
We present a novel theory that establishes the relationship between light transport in visible and thermal infrared, and heat transport in solids. We show that heat generated due to light absorption can be estimated by modeling heat transport using a thermal camera. For situations where heat conduction is negligible, we analytically solve the heat transport equation to derive a simple expression relating the change in thermal image intensity to the absorbed light intensity and the heat capacity of the material. Next, we prove that intrinsic image decomposition for Lambertian scenes becomes a well-posed problem if one has access to the absorbed light. Our theory generalizes to arbitrary shapes and unstructured illumination. Our theory is based on applying the energy conservation principle at each pixel independently. We validate our theory using real-world experiments on diffuse objects made of different materials that exhibit both direct and global components (inter-reflections) of light transport under unknown complex lighting.
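As a rough illustration of the per-pixel energy balance (our notation, not necessarily the paper's): when conduction is negligible, the absorbed optical power directly heats a surface element, giving

\rho c \,\frac{\partial T}{\partial t} \approx \alpha I
\quad\Longrightarrow\quad
\Delta T \approx \frac{\alpha I \,\Delta t}{\rho c},

where I is the incident light intensity, \alpha the absorptivity, \rho c the volumetric heat capacity, and \Delta T the temperature change observed in the thermal image over the interval \Delta t; the absorbed light \alpha I can then be read off from the thermal measurement.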
ISBN (print): 9798350353006
We present DrivingGaussian, an efficient and effective framework for representing surrounding dynamic autonomous driving scenes. For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene with incremental static 3D Gaussians. We then leverage a composite dynamic Gaussian graph to handle multiple moving objects, individually reconstructing each object and restoring their accurate positions and occlusion relationships within the scene. We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes with greater detail and maintain panoramic consistency. DrivingGaussian outperforms existing methods in dynamic driving scene reconstruction and enables photorealistic surround-view synthesis with high fidelity and multi-camera consistency. Our project page is at: https://***/VDIGPKU/DrivingGaussian.
ISBN (print): 9798350353006
Few-shot model compression aims to compress a large model into a more compact one with only a tiny training set (even without labels). Block-level pruning has recently emerged as a leading technique in achieving high accuracy and low latency in few-shot CNN compression. However, few-shot compression for vision Transformers (ViT) remains largely unexplored, which presents a new challenge. In particular, traditional CNN few-shot methods suffer from sparse compression: they can only produce a small number of compressed models at different model sizes. This paper proposes a novel framework for few-shot ViT compression named DC-ViT. Instead of dropping the entire block, DC-ViT selectively eliminates the attention module while retaining and reusing portions of the MLP module. DC-ViT enables dense compression, which outputs numerous compressed models that densely populate the range of model complexity. DC-ViT outperforms state-of-the-art few-shot compression methods by a significant margin of 10 percentage points, along with lower latency, in the compression of ViT and its variants.
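The following sketch shows the kind of compressed block the abstract describes, with the attention module removed and only a fraction of the MLP width retained; the keep_ratio parameter and module layout are our assumptions, not the DC-ViT code.

import torch.nn as nn

class AttentionFreeBlock(nn.Module):
    # Illustrative compressed Transformer block: no self-attention, reduced MLP width.
    def __init__(self, dim, mlp_ratio=4.0, keep_ratio=0.5):
        super().__init__()
        hidden = max(1, int(dim * mlp_ratio * keep_ratio))  # retained slice of the MLP
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        return x + self.mlp(self.norm(x))  # residual MLP only

Sweeping keep_ratio over a fine grid is one way such a scheme could yield the densely spaced family of model sizes that the paper calls dense compression.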
ISBN (print): 9798350353006
Although soft prompt tuning is effective in efficiently adapting vision-language (V&L) models for downstream tasks, it shows limitations in dealing with distribution shifts. We address this issue with Attribute-Guided Prompt Tuning (ArGue), making three key contributions. 1) In contrast to the conventional approach of directly appending soft prompts preceding class names, we align the model with primitive visual attributes generated by Large Language Models (LLMs). We posit that a model's ability to express high confidence in these attributes signifies its capacity to discern the correct class rationales. 2) We introduce attribute sampling to eliminate disadvantageous attributes, so that only semantically meaningful attributes are preserved. 3) We propose negative prompting, explicitly enumerating class-agnostic attributes to activate spurious correlations and encourage the model to generate highly orthogonal probability distributions with respect to these negative features. In experiments, our method significantly outperforms current state-of-the-art prompt tuning methods on both novel class prediction and out-of-distribution generalization tasks. The code is available at https://***/Liam-Tian/ArGue.
ISBN (print): 9798350353006
This work presents Adaptive Local-then-Global Merging (ALGM), a token reduction method for semantic segmentation networks that use plain vision Transformers. ALGM merges tokens in two stages: (1) In the first network layer, it merges similar tokens within a small local window and (2) halfway through the network, it merges similar tokens across the entire image. This is motivated by an analysis in which we found that, in those situations, tokens with a high cosine similarity can likely be merged without a drop in segmentation quality. With extensive experiments across multiple datasets and network configurations, we show that ALGM not only significantly improves the throughput by up to 100%, but can also enhance the mean IoU by up to +1.1, thereby achieving a better trade-off between segmentation quality and efficiency than existing methods. Moreover, our approach is adaptive during inference, meaning that the same model can be used for optimal efficiency or accuracy, depending on the application. Code is available at https://***/ALGM.
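A toy version of the first, local merging stage is sketched below for a single image: tokens inside each small window are averaged when they are mutually cosine-similar above a threshold. The window size, threshold, and function name are illustrative assumptions, not the ALGM implementation.

import torch
import torch.nn.functional as F

def local_window_merge(tokens, grid_hw, window=2, sim_thresh=0.95):
    # tokens: (H*W, C) patch tokens laid out on an H x W grid.
    H, W = grid_hw
    C = tokens.shape[1]
    grid = tokens.view(H, W, C)
    kept = []
    for i in range(0, H, window):
        for j in range(0, W, window):
            win = grid[i:i + window, j:j + window].reshape(-1, C)
            sims = F.cosine_similarity(win.unsqueeze(1), win.unsqueeze(0), dim=-1)
            if sims.min() > sim_thresh:
                kept.append(win.mean(dim=0, keepdim=True))  # merge the window into one token
            else:
                kept.append(win)                            # keep tokens separate
    return torch.cat(kept, dim=0)

The second, global stage would apply a similar similarity test across the reduced token set halfway through the network.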