Search Results - Inner Mongolia University Library

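Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be upper case, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016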
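
The field codes and operator rules above can be sketched as a small query builder. This is only an illustration and not part of the library's system: the helper names `term`, `combine`, and `group` are hypothetical, and the sketch does nothing more than assemble an expression string in the documented syntax.

```python
# Hypothetical helpers (not a library API) for assembling a search expression.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def combine(operator: str, *parts: str) -> str:
    """Join parts with an upper-case operator, one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(parts)


def group(expression: str) -> str:
    """Wrap a sub-expression in parentheses."""
    return f"({expression})"


# Rebuild the first search example from the help text.
example_one = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(example_one)
```

Validating the field code and operator up front mirrors the two rules stated in the help text: only the listed codes are recognized, and operators must be upper case with a space on each side.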

Search condition: "Any field = IEEE Conference on Computer Vision and Pattern Recognition Workshops"

23,218 records in total; records 1141-1150 are shown below.

Generic-to-Specific Distillation of Masked Autoencoders

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Huang, Wei; Peng, Zhiliang; Dong, Li; Wei, Furu; Jiao, Jianbin; Ye, Qixiang

Affiliations: Univ Chinese Acad Sci, Beijing, Peoples R China; Microsoft Research, Redmond, WA, USA

ISBN: (print) 9798350301298

Large vision Transformers (ViTs) driven by self-supervised pre-training mechanisms have achieved unprecedented progress. Lightweight ViT models, limited by their model capacity, however, benefit little from those pre-training mechanisms. Knowledge distillation defines a paradigm to transfer representations from large (teacher) models to small (student) ones. However, conventional single-stage distillation easily gets stuck on task-specific transfer, failing to retain the task-agnostic knowledge crucial for model generalization. In this study, we propose generic-to-specific distillation (G2SD) to tap the potential of small ViT models under the supervision of large models pre-trained by masked autoencoders. In generic distillation, the decoder of the small model is encouraged to align feature predictions with hidden representations of the large model, so that task-agnostic knowledge can be transferred. In specific distillation, predictions of the small model are constrained to be consistent with those of the large model, to transfer task-specific features which guarantee task performance. With G2SD, the vanilla ViT-Small model achieves 98.7%, 98.1% and 99.3% of the performance of its teacher (ViT-Base) for image classification, object detection, and semantic segmentation, respectively, setting a solid baseline for two-stage vision distillation. Code will be available at https://***/pengzhiliang/G2SD.
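
The specific-distillation step, in which the student's predictions are constrained to agree with the teacher's, can be illustrated with a generic soft-label distillation loss. This is a minimal sketch under stated assumptions, not the G2SD implementation: the temperature `tau`, the function names, and the toy logits are hypothetical, and the KL-divergence form is the standard knowledge-distillation loss rather than something the abstract specifies.

```python
import math


def softmax(logits, tau=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / tau for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, tau=2.0):
    """KL(teacher || student) on softened predictions, scaled by tau**2."""
    p_teacher = softmax(teacher_logits, tau)
    p_student = softmax(student_logits, tau)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student))
    return tau * tau * kl


teacher = [4.0, 1.0, 0.5]      # toy teacher logits (assumed values)
aligned = [3.9, 1.1, 0.4]      # student that nearly matches the teacher
misaligned = [0.5, 4.0, 1.0]   # student that disagrees with the teacher

print(distillation_loss(aligned, teacher))
print(distillation_loss(misaligned, teacher))
```

The loss is zero when the two prediction vectors coincide and grows as the student's distribution drifts away from the teacher's, which is the consistency constraint the abstract describes.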

Keywords: Efficient and scalable vision

BiFormer: Vision Transformer with Bi-Level Routing Attention

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhu, Lei; Wang, Xinjiang; Ke, Zhanghan; Zhang, Wayne; Lau, Rynson

Affiliations: City Univ Hong Kong, Hong Kong, Peoples R China; SenseTime Res, Hong Kong, Peoples R China

ISBN: (print) 9798350301298

As the core building block of vision transformers, attention is a powerful tool to capture long-range dependency. However, such power comes at a cost: it incurs a huge computation burden and heavy memory footprint as pair-wise token interaction across all spatial locations is computed. A series of works attempt to alleviate this problem by introducing handcrafted and content-agnostic sparsity into attention, such as restricting the attention operation to be inside local windows, axial stripes, or dilated windows. In contrast to these approaches, we propose a novel dynamic sparse attention via bi-level routing to enable a more flexible allocation of computations with content awareness. Specifically, for a query, irrelevant key-value pairs are first filtered out at a coarse region level, and then fine-grained token-to-token attention is applied in the union of remaining candidate regions (i.e., routed regions). We provide a simple yet effective implementation of the proposed bi-level routing attention, which utilizes the sparsity to save both computation and memory while involving only GPU-friendly dense matrix multiplications. Built with the proposed bi-level routing attention, a new general vision transformer, named BiFormer, is then presented. As BiFormer attends to a small subset of relevant tokens in a query adaptive manner without distraction from other irrelevant ones, it enjoys both good performance and high computational efficiency, especially in dense prediction tasks. Empirical results across several computer vision tasks such as image classification, object detection, and semantic segmentation verify the effectiveness of our design. Code is available at https://***/rayleizhu/BiFormer.
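
The two routing levels described above (coarse region filtering, then token-to-token attention inside the routed regions) can be sketched on a toy 1-D token sequence. This is an illustrative sketch only, not the BiFormer code: the function name, the use of mean-pooled queries and keys as region descriptors, and the explicit Python loops are assumptions made for readability, whereas the abstract describes an implementation built on dense matrix multiplications.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bi_level_routing_attention(queries, keys, values, region_size, top_k):
    """Pick top_k key regions per query region, then attend within them."""
    n = len(queries)
    assert n % region_size == 0, "token count must divide into regions"
    regions = [range(s, s + region_size) for s in range(0, n, region_size)]

    # Coarse level: describe each region by its mean query and mean key.
    region_q = [mean_vector([queries[i] for i in r]) for r in regions]
    region_k = [mean_vector([keys[i] for i in r]) for r in regions]

    outputs = [None] * n
    routed = []
    for qi, region in enumerate(regions):
        affinity = [dot(region_q[qi], rk) for rk in region_k]
        # Keep the indices of the top_k most related key regions.
        keep = sorted(sorted(range(len(regions)), key=lambda j: -affinity[j])[:top_k])
        routed.append(keep)
        candidates = [t for j in keep for t in regions[j]]

        # Fine level: softmax attention restricted to the routed tokens.
        for i in region:
            scale = math.sqrt(len(queries[i]))
            weights = softmax([dot(queries[i], keys[t]) / scale for t in candidates])
            outputs[i] = [sum(w * values[t][d] for w, t in zip(weights, candidates))
                          for d in range(len(values[0]))]
    return outputs, routed


# Four toy tokens forming two regions with orthogonal content.
tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
outputs, routed = bi_level_routing_attention(tokens, tokens, values,
                                             region_size=2, top_k=1)
print(routed)
print(outputs)
```

With `top_k=1`, each query region routes only to the region whose content it matches, so every token attends to `region_size` tokens instead of all of them.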

Keywords: Deep learning architectures and techniques

DisWOT: Student Architecture Search for Distillation WithOut Training

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Dong, Peijie; Li, Lujun; Wei, Zimian

Affiliations: Natl Univ Def Technol, Changsha, Peoples R China; Chinese Acad Sci, Beijing 100864, Peoples R China

ISBN: (print) 9798350301298

Knowledge distillation (KD) is an effective training strategy to improve lightweight student models under the guidance of cumbersome teachers. However, the large architecture difference across teacher-student pairs limits the distillation gains. In contrast to previous adaptive distillation methods that reduce the teacher-student gap, we explore a novel training-free framework to search for the best student architectures for a given teacher. Our work first empirically shows that the optimal model under vanilla training cannot be the winner in distillation. Secondly, we find that the similarity of feature semantics and sample relations between random-initialized teacher-student networks has good correlations with final distillation performance. Thus, we efficiently measure similarity matrices conditioned on the semantic activation maps to select the optimal student via an evolutionary algorithm without any training. In this way, our student architecture search for Distillation WithOut Training (DisWOT) significantly improves the performance of the model in the distillation stage with at least 180x training acceleration. Additionally, we extend similarity metrics in DisWOT as new distillers and KD-based zero-proxies. Our experiments on CIFAR, ImageNet and NAS-Bench-201 demonstrate that our technique achieves state-of-the-art results on different search spaces. Our project and code are available at https://***/DisWOT-CVPR2023/.
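
The idea of scoring untrained students by how closely their sample relations match the teacher's can be sketched as follows. This is a hedged illustration, not the DisWOT metric: the cosine-similarity relation matrix, the Frobenius distance, the function names, and the toy features are all assumptions, and a plain `min` stands in for the evolutionary search mentioned in the abstract.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def relation_matrix(features):
    """Pairwise cosine similarity between the samples of one batch."""
    unit = [normalize(f) for f in features]
    return [[sum(a * b for a, b in zip(u, v)) for v in unit] for u in unit]


def relation_distance(teacher_features, student_features):
    """Frobenius distance between teacher and student relation matrices."""
    rel_t = relation_matrix(teacher_features)
    rel_s = relation_matrix(student_features)
    return math.sqrt(sum((a - b) ** 2
                         for row_t, row_s in zip(rel_t, rel_s)
                         for a, b in zip(row_t, row_s)))


def pick_student(teacher_features, candidates):
    """Return the candidate whose sample relations are closest to the teacher's."""
    return min(candidates,
               key=lambda name: relation_distance(teacher_features, candidates[name]))


# Toy features for a batch of three samples (assumed values). Feature widths
# may differ between networks because only batch-by-batch relations are compared.
teacher_features = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
candidates = {
    "student_a": [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]],
    "student_b": [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
}
print(pick_student(teacher_features, candidates))
```

No gradient step is taken anywhere in the sketch, which is the sense in which the selection is training-free.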

Keywords: Efficient and scalable vision

TINC: Tree-structured Implicit Neural Compression

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yang, Runzhao

Affiliations: Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

ISBN: (print) 9798350301298

Implicit neural representation (INR) can describe the target scenes with high fidelity using a small number of parameters, and is emerging as a promising data compression technique. However, limited spectrum coverage is intrinsic to INR, and it is non-trivial to remove redundancy in diverse complex data effectively. Preliminary studies can only exploit either global or local correlation in the target data and are thus of limited performance. In this paper, we propose a Tree-structured Implicit Neural Compression (TINC) to conduct compact representation for local regions and extract the shared features of these local representations in a hierarchical manner. Specifically, we use Multi-Layer Perceptrons (MLPs) to fit the partitioned local regions, and these MLPs are organized in a tree structure to share parameters according to the spatial distance. The parameter sharing scheme not only ensures the continuity between adjacent regions, but also jointly removes the local and non-local redundancy. Extensive experiments show that TINC improves the compression fidelity of INR, and has shown impressive compression capabilities over commercial tools and other deep-learning-based methods. Besides, the approach is highly flexible and can be tailored for different data and parameter settings. The source code can be found at https://***/RichealYoung/TINC.

Keywords: cell microscopy Medical and biological vision

ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Tang, Zhe Jun; Cham, Tat-Jen; Zhao, Haiyu

Affiliations: Nanyang Technol Univ, S Lab, Singapore, Singapore; Nanyang Technol Univ, Singapore, Singapore; SenseTime Res, Singapore, Singapore

ISBN: (print) 9798350301298

Neural Radiance Field (NeRF) is a popular method for representing 3D scenes by optimising a continuous volumetric scene function. Its great success, which lies in applying volumetric rendering (VR), is also its Achilles' heel in producing view-dependent effects. As a consequence, glossy and transparent surfaces often appear murky. A remedy to reduce these artefacts is to constrain this VR equation by excluding volumes with back-facing normals. While this approach has some success in rendering glossy surfaces, translucent objects are still poorly represented. In this paper, we present an alternative to the physics-based VR approach by introducing a self-attention-based framework on volumes along a ray. In addition, inspired by modern game engines which utilise Light Probes to store local lighting passing through the scene, we incorporate Learnable Embeddings to capture view-dependent effects within the scene. Our method, which we call ABLE-NeRF, significantly reduces 'blurry' glossy surfaces in rendering and produces realistic translucent surfaces which are lacking in prior art. On the Blender dataset, ABLE-NeRF achieves SOTA results and surpasses Ref-NeRF in all 3 image quality metrics: PSNR, SSIM, and LPIPS.
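
For context, the physics-based volumetric rendering that the abstract contrasts with its attention-based approach can be written as a short compositing loop. This sketch shows the standard NeRF quadrature as background, not the ABLE-NeRF method; the function name and the toy densities, colours, and step sizes are assumptions.

```python
import math


def render_ray(densities, colors, deltas):
    """Composite samples along one ray with the standard NeRF quadrature.

    weight_i = T_i * (1 - exp(-sigma_i * delta_i))
    T_i      = exp(-sum over j < i of sigma_j * delta_j)
    """
    transmittance = 1.0          # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this sample
        weight = transmittance * alpha
        weights.append(weight)
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, weights


# Empty space, then a nearly opaque red sample, then a blue sample behind it.
pixel, weights = render_ray(
    densities=[0.0, 1000.0, 5.0],
    colors=[[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    deltas=[0.1, 0.1, 0.1],
)
print(pixel)
print(weights)
```

Because each weight depends only on density accumulated along the ray, the opaque red sample hides everything behind it; the weights themselves carry no view-dependent term, which is the limitation the abstract points to.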

Keywords: vision + graphics

Weakly-supervised Single-view Image Relighting

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yi, Renjiao; Zhu, Chenyang; Xu, Kai

Affiliations: Natl Univ Def Technol, Changsha, Hunan, Peoples R China

ISBN: (print) 9798350301298

We present a learning-based approach to relight a single image of Lambertian and low-frequency specular objects. Our method enables inserting objects from photographs into new scenes and relighting them under the new environment lighting, which is essential for AR applications. To relight the object, we solve both inverse rendering and re-rendering. To resolve the ill-posed inverse rendering, we propose a weakly-supervised method by a low-rank constraint. To facilitate the weakly-supervised training, we contribute Relit, a large-scale (750K images) dataset of videos with aligned objects under changing illuminations. For re-rendering, we propose a differentiable specular rendering layer to render low-frequency non-Lambertian materials under various illuminations of spherical harmonics. The whole pipeline is end-to-end and efficient, allowing for a mobile app implementation of AR object insertion. Extensive evaluations demonstrate that our method achieves state-of-the-art performance. Project page: https://***/relighting/.

Keywords: Physics-based vision and shape-from-X

LANA: A Language-Capable Navigator for Instruction Following and Generation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Xiaohan; Wang, Wenguan; Shao, Jiayi; Yang, Yi

Affiliations: Zhejiang Univ, CCAI, Hangzhou, Peoples R China

ISBN: (print) 9798350301298

Recently, visual-language navigation (VLN) - entailing robot agents to follow navigation instructions - has shown great advances. However, existing literature puts most emphasis on interpreting instructions into actions, only delivering "dumb" wayfinding agents. In this article, we devise LANA, a language-capable navigation agent which is able to not only execute human-written navigation commands, but also provide route descriptions to humans. This is achieved by simultaneously learning instruction following and generation with a single model. More specifically, two encoders, respectively for route and language encoding, are built and shared by two decoders, respectively for action prediction and instruction generation, so as to exploit cross-task knowledge and capture task-specific characteristics. Throughout pretraining and fine-tuning, both instruction following and generation are set as optimization objectives. We empirically verify that, compared with recent advanced task-specific solutions, LANA attains better performance on both instruction following and route description, with nearly half the complexity. In addition, endowed with language generation capability, LANA can explain its behaviours to humans and assist human wayfinding. This work is expected to foster future efforts towards building more trustworthy and socially intelligent navigation robots.

Keywords: Embodied vision: Active agents simulation

Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wei, Cong; Duke, Brendan; Jiang, Ruowei; Aarabi, Parham; Taylor, Graham W.; Shkurti, Florian

Affiliations: Univ Toronto, Toronto, ON, Canada; Univ Guelph, Guelph, ON N1G 2W1, Canada; Modiface Inc, Toronto, ON, Canada; Vector Inst, Toronto, ON, Canada

ISBN: (print) 9798350301298

Vision Transformers (ViT) have shown competitive advantages in terms of performance compared to convolutional neural networks (CNNs), though they often come with high computational costs. To this end, previous methods explore different attention patterns by limiting a fixed number of spatially nearby tokens to accelerate the ViT's multi-head self-attention (MHSA) operations. However, such structured attention patterns limit the token-to-token connections to their spatial relevance, which disregards learned semantic connections from a full attention mask. In this work, we propose an approach to learn instance-dependent attention patterns, by devising a lightweight connectivity predictor module that estimates the connectivity score of each pair of tokens. Intuitively, two tokens have high connectivity scores if the features are considered relevant either spatially or semantically. As each token only attends to a small number of other tokens, the binarized connectivity masks are often very sparse by nature and therefore provide the opportunity to reduce network FLOPs via sparse computations. Equipped with the learned unstructured attention pattern, sparse attention ViT (Sparsifiner) produces a superior Pareto frontier between FLOPs and top-1 accuracy on ImageNet compared to token sparsity. Our method reduces the FLOPs of MHSA by 48% to 69% while the accuracy drop is within 0.4%. We also show that combining attention and token sparsity reduces ViT FLOPs by over 60%.
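
The step from connectivity scores to sparse attention can be sketched as a binarized mask followed by attention evaluated only at the kept positions. This is a minimal sketch under stated assumptions, not the Sparsifiner code: the connectivity scores are taken as given rather than produced by a learned predictor, top-k binarization per row is an assumed rule, and the function names are hypothetical.

```python
import math


def top_k_mask(scores, k):
    """Binarize each row of a connectivity-score matrix, keeping its k largest entries."""
    mask = []
    for row in scores:
        keep = set(sorted(range(len(row)), key=lambda j: -row[j])[:k])
        mask.append([j in keep for j in range(len(row))])
    return mask


def sparse_attention(queries, keys, values, mask):
    """Softmax attention computed only where mask[i][j] is True."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for query, row_mask in zip(queries, mask):
        kept = [j for j, on in enumerate(row_mask) if on]
        logits = [sum(a * b for a, b in zip(query, keys[j])) / scale for j in kept]
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        out = [0.0] * len(values[0])
        for e, j in zip(exps, kept):
            for d, v in enumerate(values[j]):
                out[d] += (e / total) * v
        outputs.append(out)
    return outputs


# Toy connectivity scores for three tokens (assumed values).
scores = [[0.9, 0.1, 0.0],
          [0.2, 0.8, 0.1],
          [0.0, 0.3, 0.7]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

mask = top_k_mask(scores, k=1)
print(mask)
print(sparse_attention(tokens, tokens, values, mask))
```

With `k` kept entries per row, only `k` of the `n` query-key products are evaluated for each token, which is where the FLOP savings come from.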

Keywords: Efficient and scalable vision

Efficient On-device Training via Gradient Filtering

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yang, Yuedong; Li, Guihong; Marculescu, Radu

Affiliations: Univ Texas Austin, Austin, TX 78712, USA

ISBN: (print) 9798350301298

Despite its importance for federated learning, continuous learning and many other applications, on-device training remains an open problem for EdgeAI. The problem stems from the large number of operations (e.g., floating point multiplications and additions) and memory consumption required during training by the back-propagation algorithm. Consequently, in this paper, we propose a new gradient filtering approach which enables on-device CNN model training. More precisely, our approach creates a special structure with fewer unique elements in the gradient map, thus significantly reducing the computational complexity and memory consumption of back propagation during training. Extensive experiments on image classification and semantic segmentation with multiple CNN models (e.g., MobileNet, DeepLabV3, UPerNet) and devices (e.g., Raspberry Pi and Jetson Nano) demonstrate the effectiveness and wide applicability of our approach. For example, compared to SOTA, we achieve up to 19x speedup and 77.1% memory savings on ImageNet classification with only 0.1% accuracy loss. Finally, our method is easy to implement and deploy; over 20x speedup and 90% energy savings have been observed compared to highly optimized baselines in MKLDNN and CUDNN on NVIDIA Jetson Nano. Consequently, our approach opens up a new direction of research with a huge potential for on-device training.
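
One way to obtain a gradient map with fewer unique elements is to replace each small patch by its mean. The sketch below illustrates that idea only; the abstract does not state the exact filter, so patch averaging, the function names, and the toy 4x4 gradient map are assumptions.

```python
def filter_gradient(grad, patch):
    """Replace every patch x patch block of a 2-D gradient map by its mean."""
    height, width = len(grad), len(grad[0])
    assert height % patch == 0 and width % patch == 0, "map must tile evenly"
    filtered = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [grad[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            mean = sum(block) / len(block)
            for r in range(top, top + patch):
                for c in range(left, left + patch):
                    filtered[r][c] = mean
    return filtered


def unique_count(grid):
    """Number of distinct values in a 2-D grid."""
    return len({value for row in grid for value in row})


# Toy gradient map with 16 distinct values (assumed data).
grad = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0]]
filtered = filter_gradient(grad, patch=2)
print(unique_count(grad), unique_count(filtered))
for row in filtered:
    print(row)
```

After filtering, each patch holds a single value, so any product involving the patch can be computed once and reused, which is the kind of structure that lowers the cost of back propagation.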

Keywords: Efficient and scalable vision

Blur Interpolation Transformer for Real-World Motion from Blur

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhong, Zhihang; Cao, Mingdeng; Ji, Xiang; Zheng, Yinqiang; Sato, Imari

Affiliations: Univ Tokyo, Tokyo, Japan; Natl Inst Informat, Tokyo, Japan

ISBN: (print) 9798350301298

This paper studies the challenging problem of recovering motion from blur, also known as joint deblurring and interpolation or blur temporal super-resolution. The challenges are twofold: 1) the current methods still leave considerable room for improvement in terms of visual quality even on the synthetic dataset, and 2) poor generalization to real-world data. To this end, we propose a blur interpolation transformer (BiT) to effectively unravel the underlying temporal correlation encoded in blur. Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies to generate effective features for time-varying motion rendering. In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs. Experimental results show that BiT has a significant gain over the state-of-the-art methods on the public dataset Adobe240. Besides, the proposed real-world dataset effectively helps the model generalize well to real blurry scenarios. Code and data are available at https://***/zzh-tech/BiT.

Keywords: Low-level vision
