ISBN (Print): 9781728132938
We present a detail-driven deep neural network for point set upsampling. A high-resolution point set is essential for point-based rendering and surface reconstruction. Inspired by the recent success of neural image super-resolution techniques, we progressively train a cascade of patch-based upsampling networks on different levels of detail end-to-end. We propose a series of architectural design contributions that lead to a substantial performance boost. The effect of each technical contribution is demonstrated in an ablation study. Qualitative and quantitative experiments show that our method significantly outperforms the state-of-the-art learning-based [58, 59] and optimization-based [23] approaches, both in terms of handling low-resolution inputs and revealing high-fidelity details. The data and code are at http://***/yifita/3pu.
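As a rough illustration of the progressive idea, the sketch below chains patch-based 2x upsampling stages so that an n-level cascade yields 2^n-fold upsampling. The offset-duplicating UpsampleUnit and all layer sizes are placeholder assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ProgressiveUpsampler(nn.Module):
    """Sketch of a cascade of patch-based upsampling stages, each doubling
    the point count, trained end-to-end on increasing levels of detail."""

    class UpsampleUnit(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.mlp = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, 6))

        def forward(self, pts):                     # pts: (B, N, 3)
            offsets = self.mlp(pts).view(pts.size(0), -1, 2, 3)
            # Each input point spawns two refined points (2x upsampling).
            return (pts.unsqueeze(2) + offsets).reshape(pts.size(0), -1, 3)

    def __init__(self, n_levels=3):
        super().__init__()
        self.levels = nn.ModuleList(self.UpsampleUnit() for _ in range(n_levels))

    def forward(self, pts):
        for level in self.levels:                   # 2^n_levels total upsampling
            pts = level(pts)
        return pts
```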
ISBN (Print): 9781728132938
Convolutions are the fundamental building blocks of CNNs. The fact that their weights are spatially shared is one of the main reasons for their widespread use, but it is also a major limitation, as it makes convolutions content-agnostic. We propose a pixel-adaptive convolution (PAC) operation, a simple yet effective modification of standard convolutions, in which the filter weights are multiplied by a spatially varying kernel that depends on learnable, local pixel features. PAC is a generalization of several popular filtering techniques and thus can be used for a wide range of use cases. Specifically, we demonstrate state-of-the-art performance when PAC is used for deep joint image upsampling. PAC also offers an effective alternative to the fully-connected CRF (Full-CRF), called PAC-CRF, which performs competitively with Full-CRF while being considerably faster. In addition, we demonstrate that PAC can be used as a drop-in replacement for convolution layers in pre-trained networks, resulting in consistent performance improvements.
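A minimal sketch of the PAC idea described above, where standard convolution weights are modulated per position by a kernel computed on guidance features. The fixed Gaussian adapting kernel and the unfold-based implementation are assumptions made for brevity; the paper also covers learned kernels and other variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelAdaptiveConv2d(nn.Module):
    """Sketch of a PAC-style layer: out_i = sum_j K(f_i, f_j) * W_j * x_j,
    i.e. a shared filter W modulated by a content-adaptive kernel K on
    guidance features f. Illustrative, not the authors' released code."""

    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_ch))

    def forward(self, x, f):
        b, c, h, w = x.shape
        pad = self.k // 2
        # Gather k*k neighbourhoods of both the input and the guidance features.
        x_u = F.unfold(x, self.k, padding=pad).view(b, c, self.k * self.k, h * w)
        f_u = F.unfold(f, self.k, padding=pad).view(b, f.size(1), self.k * self.k, h * w)
        f_c = f.view(b, f.size(1), 1, h * w)                  # centre feature f_i
        # Fixed Gaussian adapting kernel K(f_i, f_j).
        kernel = torch.exp(-0.5 * ((f_u - f_c) ** 2).sum(1, keepdim=True))
        x_mod = (x_u * kernel).reshape(b, c * self.k * self.k, h * w)
        w_flat = self.weight.view(self.weight.size(0), -1)    # (out_ch, in_ch*k*k)
        out = torch.einsum('oi,biq->boq', w_flat, x_mod)
        return out.view(b, -1, h, w) + self.bias.view(1, -1, 1, 1)
```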
ISBN (Print): 9798350301298
Implicit neural representation (INR) characterizes the attributes of a signal as a function of its coordinates, and has emerged as a powerful tool for solving inverse problems. However, the capacity of INR is limited by the spectral bias in network training. In this paper, we find that this frequency-related problem can be largely solved by re-arranging the coordinates of the input signal, for which we propose the disorder-invariant implicit neural representation (DINER), augmenting a traditional INR backbone with a hash-table. Given discrete signals that share the same histogram of attributes but different arrangement orders, the hash-table projects the coordinates into the same distribution, so that the mapped signal can be better modeled by the subsequent INR network, significantly alleviating the spectral bias. Experiments not only reveal the generalization of DINER across different INR backbones (MLP vs. SIREN) and various tasks (image/video representation, phase retrieval, and refractive index recovery) but also show its superiority over state-of-the-art algorithms in both quality and speed. Project page: https://***/DINER-website/
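A minimal sketch of the hash-table idea, assuming one learnable table entry per discrete coordinate that feeds a plain MLP backbone. Table width, depth, and sizes are arbitrary choices, not the paper's.

```python
import torch
import torch.nn as nn

class DINERSketch(nn.Module):
    """Sketch of DINER's core idea: a learnable hash-table replaces raw
    coordinates, so the INR backbone sees a trainable, re-arranged embedding
    instead of the original coordinate order."""

    def __init__(self, n_coords, feat_dim=2, hidden=64, out_dim=3):
        super().__init__()
        # One learnable entry per discrete coordinate of the signal.
        self.table = nn.Parameter(torch.empty(n_coords, feat_dim).uniform_(-1, 1))
        self.backbone = nn.Sequential(        # plain MLP backbone; SIREN also works
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, idx):
        # idx: (B,) integer pixel/voxel indices; gradients flow into the table.
        return self.backbone(self.table[idx])
```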
ISBN (Print): 9798350301298
Animating a virtual character based on a real performance of an actor is a challenging task that currently requires expensive motion-capture setups and additional effort by expert animators, rendering it accessible only to large production houses. The goal of our work is to democratize this task by developing a frugal alternative, termed "Transfer4D", that uses only commodity depth sensors and further reduces animators' effort by automating the rigging and animation-transfer process. Our approach can transfer motion from an incomplete, single-view depth video to a semantically similar target mesh, unlike prior works that make the stricter assumption of a noise-free and watertight source. To handle the sparse, incomplete nature of depth video inputs and the variations between source and target objects, we propose to use skeletons as an intermediary representation between motion capture and transfer. We propose a novel unsupervised skeleton-extraction pipeline for single-view depth sequences that incorporates additional geometric information, resulting in superior motion reconstruction and transfer compared to contemporary methods and making our approach generic. We use non-rigid reconstruction to track motion from the depth sequence, then rig the source object using skinning decomposition. Finally, the rig is embedded into the target object for motion retargeting.
ISBN (Print): 9781728132938
Recurrent neural networks (RNNs) are widely used for sequential data processing. Recent state-of-the-art video deblurring methods bank on convolutional recurrent neural network architectures to exploit the temporal relationship between neighboring frames. In this work, we aim to improve the accuracy of recurrent models by adapting the hidden states transferred from past frames to the frame being processed, so that the relations between video frames can be better used. We iteratively update the hidden state by reusing the RNN cell parameters before predicting an output deblurred frame. Since we use existing parameters to update the hidden state, our method improves accuracy without additional modules. As the architecture remains the same regardless of the iteration number, fewer-iteration models can be considered as a partial computational path of the models with more iterations. To take advantage of this property, we employ a stochastic method to better optimize our iterative models. At training time, we randomly choose the iteration number on the fly and apply a regularization loss that favors less computation unless there are considerable reconstruction gains. We show that our method exhibits state-of-the-art video deblurring performance while operating at real-time speed.
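The intra-frame iteration can be sketched as reusing one cell several times on the hidden state before decoding; the convolutional cell below and its sizes are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

class IterativeDeblurCell(nn.Module):
    """Sketch: the SAME cell parameters refine the hidden state several
    times before the deblurred frame is decoded, so extra iterations add
    computation but no new parameters."""

    def __init__(self, ch=32):
        super().__init__()
        self.encode = nn.Conv2d(3, ch, 3, padding=1)
        self.cell = nn.Conv2d(2 * ch, ch, 3, padding=1)   # fuses features + hidden state
        self.decode = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, frame, hidden, n_iter=3):
        feat = torch.relu(self.encode(frame))
        for _ in range(n_iter):                           # reuse the same cell weights
            hidden = torch.relu(self.cell(torch.cat([feat, hidden], dim=1)))
        return self.decode(hidden), hidden

# At training time, n_iter can be sampled randomly per step, paired with a
# regularization term that discourages extra iterations unless they yield a
# considerable reconstruction gain.
```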
ISBN (Print): 9798350301298
Detecting pedestrians accurately in urban scenes is significant for realistic applications like autonomous driving or video surveillance. However, confusing human-like objects often lead to wrong detections, and small-scale or heavily occluded pedestrians are easily missed due to their unusual appearances. To address these challenges, object regions alone are inadequate; how to fully utilize more explicit and semantic contexts thus becomes a key problem. Meanwhile, previous context-aware pedestrian detectors either learn only latent contexts from visual clues or need laborious annotations to obtain explicit and semantic contexts. Therefore, we propose in this paper a novel approach via vision-language semantic self-supervision for context-aware Pedestrian Detection (VLPD) that models explicit semantic contexts without any extra annotations. Firstly, we propose a self-supervised Vision-Language Semantic (VLS) segmentation method, which learns both fully-supervised pedestrian detection and contextual segmentation via explicit labels of semantic classes self-generated by vision-language models. Furthermore, a self-supervised Prototypical Semantic Contrastive (PSC) learning method is proposed to better discriminate pedestrians from other classes, based on the more explicit and semantic contexts obtained from VLS. Extensive experiments on popular benchmarks show that our proposed VLPD achieves superior performance over the previous state of the art, particularly under challenging circumstances like small scale and heavy occlusion. Code is available at https://***/lmy98129/VLPD.
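A hedged sketch of a prototypical contrastive objective in the spirit of PSC, pulling features toward their class prototype and away from the others; the exact loss in the paper may differ.

```python
import torch
import torch.nn.functional as F

def prototypical_contrastive_loss(feats, labels, prototypes, tau=0.1):
    """Sketch of a prototypical semantic contrastive loss.

    feats:      (N, D) L2-normalized pixel/region features
    labels:     (N,) class ids
    prototypes: (C, D) L2-normalized class prototypes
    """
    logits = feats @ prototypes.t() / tau   # cosine similarity / temperature
    return F.cross_entropy(logits, labels)  # attract own prototype, repel the rest
```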
ISBN (Print): 9781728132938
We present a novel deep neural network architecture for end-to-end scene flow estimation that directly operates on large-scale 3D point clouds. Inspired by Bilateral Convolutional Layers (BCL), we propose novel DownBCL, UpBCL, and CorrBCL operations that restore structural information from unstructured point clouds, and fuse information from two consecutive point clouds. Operating on discrete and sparse permutohedral lattice points, our architectural design is parsimonious in computational cost. Our model can efficiently process a pair of point cloud frames at once with a maximum of 86K points per frame. Our approach achieves state-of-the-art performance on the FlyingThings3D and KITTI Scene Flow 2015 datasets. Moreover, trained on synthetic data, our approach shows great generalization ability on real-world data and on different point densities without fine-tuning.
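The splat-convolve-slice pattern behind bilateral convolutional layers can be sketched on a toy integer grid; the paper's operations run on a permutohedral lattice instead, so this is only a structural illustration.

```python
import torch

def splat_conv_slice(points, feats, cell=0.5):
    """Sketch of the splat-convolve-slice pattern on a simple integer grid
    (the paper uses a permutohedral lattice).

    points: (N, 3) point positions; feats: (N, D) per-point features."""
    keys = torch.floor(points / cell).long()                  # (N, 3) lattice coords
    uniq, inv = torch.unique(keys, dim=0, return_inverse=True)
    # Splat: average the features of points falling into the same cell.
    acc = torch.zeros(uniq.size(0), feats.size(1)).index_add_(0, inv, feats)
    cnt = torch.zeros(uniq.size(0)).index_add_(0, inv, torch.ones(points.size(0)))
    lattice_feats = acc / cnt.clamp(min=1).unsqueeze(1)
    # (A convolution over occupied lattice cells would go here.)
    # Slice: read the processed lattice features back out to the points.
    return lattice_feats[inv]
```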
ISBN (Print): 9798350353006
Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling where grounded objects are captured by bounding boxes as sequences of location tokens. This paradigm lacks pixel-level representations that are important for fine-grained visual understanding and diagnosis. In this work, we introduce GROUNDHOG, an MLLM developed by grounding Large Language Models to holistic segmentation. GROUNDHOG incorporates a masked feature extractor and converts extracted features into visual entity tokens for the MLLM backbone, which then connects groundable phrases to unified grounding masks by retrieving and merging the entity masks. To train GROUNDHOG, we carefully curated M3G2, a grounded visual instruction tuning dataset with Multi-Modal Multi-Grained Grounding, by harvesting a collection of segmentation-grounded datasets with rich annotations. Our experimental results show that GROUNDHOG achieves superior performance on various language grounding tasks without task-specific fine-tuning, and significantly reduces object hallucination. GROUNDHOG also demonstrates better grounding towards complex forms of visual input and provides easy-to-understand diagnosis in failure cases.
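The retrieve-and-merge step can be sketched as a score-weighted union of the top-k entity masks for a phrase; the top-k softmax weighting below is an assumption for illustration, not the paper's exact merging rule.

```python
import torch

def merge_entity_masks(entity_masks, scores, top_k=3):
    """Sketch of retrieve-and-merge grounding: keep the top-k candidate
    entities for a phrase and take a score-weighted union of their masks.

    entity_masks: (E, H, W) candidate masks; scores: (E,) phrase-entity scores."""
    w, idx = scores.topk(min(top_k, scores.numel()))
    w = torch.softmax(w, dim=0)
    merged = (w.view(-1, 1, 1) * entity_masks[idx]).sum(0)
    return merged.clamp(0, 1)   # soft grounding mask for the phrase
```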
ISBN (Print): 9781728132938
Referring expression grounding aims at locating certain objects or persons in an image with a referring expression, where the key challenge is to comprehend and align various types of information from the visual and textual domains, such as visual attributes, location, and interactions with surrounding regions. Although the attention mechanism has been successfully applied for cross-modal alignment, previous attention models focus on only the most dominant features of both modalities and neglect the fact that there could be multiple comprehensive textual-visual correspondences between images and referring expressions. To tackle this issue, we design a novel cross-modal attention-guided erasing approach, in which we discard the most dominant information from either the textual or the visual domain to generate difficult training samples online, driving the model to discover complementary textual-visual correspondences. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance on three referring expression grounding datasets.
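On the textual side, the erasing idea can be sketched as zeroing out the most-attended word to create a harder training sample; `erase_dominant_word` is a hypothetical helper, and the analogous visual-side erasure would mask the most-attended region.

```python
import torch

def erase_dominant_word(word_feats, attn, erase_value=0.0):
    """Sketch of attention-guided erasing on the textual side: zero out the
    most-attended word embedding so the model must rely on complementary cues.

    word_feats: (B, T, D) word embeddings; attn: (B, T) attention weights."""
    top = attn.argmax(dim=1)                                   # most dominant word
    erased = word_feats.clone()
    erased[torch.arange(word_feats.size(0)), top] = erase_value
    return erased   # used as an additional, harder training sample
```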
ISBN (Print): 9781728132938
Scene graph generation has received growing attention with the advancements in image understanding tasks such as object detection and attribute and relationship prediction. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes the development of a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and an image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from an external knowledge base to refine object and phrase features, improving generalizability in scene graph generation. To address the bias of noisy object annotations, we introduce an auxiliary image reconstruction path to regularize the scene graph generation network. Extensive experiments show that our framework generates better scene graphs, achieving state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome.
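One plausible way to sketch the knowledge-based feature refinement is to attend over retrieved fact embeddings and fold the context back into the object features with a GRU; this attention-then-GRU scheme is an assumption for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class KnowledgeRefineSketch(nn.Module):
    """Sketch of refining object features with retrieved commonsense
    knowledge embeddings for scene graph generation."""

    def __init__(self, dim=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, obj_feats, kb_embeds):
        # obj_feats: (B, N, D) region features; kb_embeds: (B, K, D) retrieved facts.
        ctx, _ = self.attn(obj_feats, kb_embeds, kb_embeds)   # attend over facts
        flat = obj_feats.reshape(-1, obj_feats.size(-1))
        refined = self.gru(ctx.reshape(-1, ctx.size(-1)), flat)
        return refined.view_as(obj_feats)
```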