ISBN (print): 9798350301298
Vision-language models trained with contrastive learning on large-scale noisy data are becoming increasingly popular for zero-shot recognition problems. In this paper we improve the following three aspects of the contrastive pre-training pipeline: dataset noise, model initialization and the training objective. First, we propose a straightforward filtering strategy titled Complexity, Action, and Text-spotting (CAT) that significantly reduces dataset size, while achieving improved performance across zero-shot vision-language tasks. Next, we propose an approach titled Concept Distillation to leverage strong unimodal representations for contrastive training that does not increase training complexity while outperforming prior work. Finally, we modify the traditional contrastive alignment objective, and propose an importance-sampling approach to up-sample the importance of hard-negatives without adding additional complexity. On an extensive zero-shot benchmark of 29 tasks, our Distilled and Hard-negative Training (DiHT) approach improves on 20 tasks compared to the baseline. Furthermore, for few-shot linear probing, we propose a novel approach that bridges the gap between zero-shot and few-shot performance, substantially improving over prior work. Models are available at ***/facebookresearch/diht.
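A minimal sketch of the hard-negative idea (the specific weighting scheme and the values of tau and beta below are illustrative assumptions, not the exact DiHT objective): inside a standard CLIP-style InfoNCE loss, the off-diagonal (negative) similarities are re-weighted so that harder negatives contribute more, without any extra forward passes.

import torch
import torch.nn.functional as F

def weighted_infonce(logits, beta):
    # logits: (N, N) similarity matrix with positives on the diagonal.
    n = logits.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=logits.device)
    with torch.no_grad():
        # Up-sample hard negatives: higher-similarity negatives get larger weights,
        # normalized so the average negative weight stays at 1.
        w = torch.softmax(beta * logits.masked_fill(eye, float("-inf")), dim=1)
        w = ((n - 1) * w).masked_fill(eye, 1.0)
    targets = torch.arange(n, device=logits.device)
    return F.cross_entropy(logits + w.log(), targets)

def hard_negative_contrastive_loss(img_emb, txt_emb, tau=0.07, beta=0.5):
    # img_emb, txt_emb: (N, D) L2-normalized embeddings of matched image/text pairs.
    sim = img_emb @ txt_emb.t() / tau
    return 0.5 * (weighted_infonce(sim, beta) + weighted_infonce(sim.t(), beta))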
ISBN (print): 9798350301298
As black-box models increasingly power high-stakes applications, a variety of data-driven explanation methods have been introduced. Meanwhile, machine learning models are constantly challenged by distributional shifts. A question naturally arises: Are data-driven explanations robust against out-of-distribution data? Our empirical results show that even when the model predicts correctly, it might still yield unreliable explanations under distributional shifts. How can we develop robust explanations against out-of-distribution data? To address this problem, we propose an end-to-end model-agnostic learning framework, Distributionally Robust Explanations (DRE). The key idea is, inspired by self-supervised learning, to fully utilize the inter-distribution information to provide supervisory signals for the learning of explanations without human annotation. Can robust explanations benefit the model's generalization capability? We conduct extensive experiments on a wide range of tasks and data types, including classification and regression on image and scientific tabular data. Our results demonstrate that the proposed method significantly improves the model's performance in terms of explanation and prediction robustness against distributional shifts.
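One plausible way to turn inter-distribution information into a supervisory signal for explanations is sketched below, under the assumptions that the explanation is an input-gradient saliency map and that a distribution-shifted view of each sample is available; this is an illustration in the spirit of DRE, not the paper's exact objective.

import torch
import torch.nn.functional as F

def input_gradient_explanation(model, x):
    # Saliency: gradient of the top predicted class score w.r.t. the input.
    x = x.clone().requires_grad_(True)
    score = model(x).max(dim=1).values.sum()
    grad, = torch.autograd.grad(score, x, create_graph=True)
    return grad

def explanation_consistency_loss(model, x, x_shifted):
    # Penalize disagreement between explanations of a sample and of its
    # distribution-shifted view, so explanations do not hinge on
    # distribution-specific cues. Differentiable w.r.t. model parameters.
    e = input_gradient_explanation(model, x).flatten(1)
    e_s = input_gradient_explanation(model, x_shifted).flatten(1)
    return 1.0 - F.cosine_similarity(e, e_s, dim=1).mean()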
ISBN (print): 9798350301298
Image resampling is a basic technique that is widely employed in daily applications. Existing deep neural networks (DNNs) have made impressive progress in resampling performance. Yet these methods are still not the perfect substitute for interpolation, due to the issues of efficiency and continuous resampling. In this work, we propose a novel method of Learning Resampling Function (termed LeRF), which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption of interpolation methods. Specifically, LeRF assigns spatially-varying steerable resampling functions to input image pixels and learns to predict the hyper-parameters that determine the orientations of these resampling functions with a neural network. To achieve highly efficient inference, we adopt look-up tables (LUTs) to accelerate the inference of the learned neural network. Furthermore, we design a directional ensemble strategy and edge-sensitive indexing patterns to better capture local structures. Extensive experiments show that our method runs as fast as interpolation, generalizes well to arbitrary transformations, and outperforms interpolation significantly, e.g., up to 3dB PSNR gain over bicubic for x2 upsampling on Manga109.
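To make the idea of a spatially-varying steerable resampling function concrete, here is a small sketch using an oriented anisotropic Gaussian kernel; the Gaussian form, the parameter triplet, and the window radius are assumptions for illustration (LeRF predicts such hyper-parameters with a network and accelerates inference with LUTs).

import numpy as np

def steerable_weights(dx, dy, sigma_u, sigma_v, theta):
    # Oriented anisotropic Gaussian resampling kernel; dx, dy are offsets from the
    # fractional target position to nearby source pixels, and (sigma_u, sigma_v, theta)
    # are the per-pixel hyper-parameters that a network would predict.
    c, s = np.cos(theta), np.sin(theta)
    u = c * dx + s * dy               # rotate offsets into the steered frame
    v = -s * dx + c * dy
    w = np.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))
    return w / w.sum()

def resample_point(img, x, y, params, radius=2):
    # Resample a grayscale image at the continuous position (x, y);
    # border handling is omitted for brevity.
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    xs = np.arange(x0 - radius + 1, x0 + radius + 1)
    ys = np.arange(y0 - radius + 1, y0 + radius + 1)
    gx, gy = np.meshgrid(xs, ys)
    w = steerable_weights(gx - x, gy - y, *params)
    return float((w * img[gy, gx]).sum())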
ISBN (print): 9798350301298
Vision Transformers have shown promising performance in image restoration; they usually conduct window- or channel-based attention to avoid intensive computations. Although promising performance has been achieved, these designs go against the biggest success factor of Transformers to a certain extent by capturing local instead of global dependency among pixels. In this paper, we propose a novel efficient image restoration Transformer that first captures the superpixel-wise global dependency, and then transfers it into each pixel. Such a coarse-to-fine paradigm is implemented through two neural blocks, i.e., the condensed attention neural block (CA) and the dual adaptive neural block (DA). In brief, CA employs feature aggregation, attention computation, and feature recovery to efficiently capture the global dependency at the superpixel level. To embrace the pixel-wise global dependency, DA takes a novel dual-way structure to adaptively encapsulate the globality from superpixels into pixels. Thanks to the two neural blocks, our method achieves comparable performance while taking only about 6% of the FLOPs of SwinIR.
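The coarse-to-fine pattern can be sketched as follows, assuming a precomputed superpixel assignment per pixel and an off-the-shelf self-attention module; this illustrates the aggregate-attend-recover idea rather than the exact CA/DA blocks.

import torch
import torch.nn as nn
import torch.nn.functional as F

def condensed_attention(x, assign, attn):
    # x: (B, N, C) pixel features; assign: (B, N) long tensor of superpixel indices;
    # attn: any self-attention module over (B, S, C) superpixel tokens.
    B, N, C = x.shape
    S = int(assign.max().item()) + 1
    one_hot = F.one_hot(assign, S).to(x.dtype)                # (B, N, S)
    counts = one_hot.sum(dim=1).clamp(min=1).unsqueeze(-1)    # (B, S, 1)
    tokens = one_hot.transpose(1, 2) @ x / counts             # feature aggregation
    tokens = attn(tokens)                                     # global dependency among superpixels
    return x + one_hot @ tokens                               # feature recovery back to pixels

x = torch.randn(2, 1024, 64)
assign = torch.randint(0, 16, (2, 1024))
mha = nn.MultiheadAttention(64, 4, batch_first=True)
y = condensed_attention(x, assign, lambda t: mha(t, t, t)[0])  # (2, 1024, 64)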
ISBN (print): 9798350301298
Vision transformers (ViTs) have achieved impressive results on various computer vision tasks in the last several years. In this work, we study the capability of frozen ViTs, pretrained only on visual data, to generalize to audio-visual data without finetuning any of their original parameters. To do so, we propose a latent audio-visual hybrid (LAVISH) adapter that adapts pretrained ViTs to audio-visual tasks by injecting a small number of trainable parameters into every layer of a frozen ViT. To efficiently fuse visual and audio cues, our LAVISH adapter uses a small set of latent tokens, which form an attention bottleneck, thus eliminating the quadratic cost of standard cross-attention. Compared to existing modality-specific audio-visual methods, our approach achieves competitive or even better performance on various audio-visual tasks while using fewer tunable parameters and without relying on costly audio pretraining or external audio encoders. Our code is available at https://***/project_page/LAVISH/
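A minimal sketch of an attention bottleneck built from a small set of latent tokens (the module layout, token count, and residual connection are assumptions, not the released LAVISH code): the latents first gather cues from all audio and visual tokens, and the visual tokens then read from the latents, so no dense cross-attention between the two modalities is ever computed.

import torch
import torch.nn as nn

class LatentBottleneckFusion(nn.Module):
    def __init__(self, dim, num_latents=8, num_heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim) * 0.02)
        self.collect = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.distribute = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, vis_tokens, aud_tokens):
        # vis_tokens: (B, Nv, C); aud_tokens: (B, Na, C)
        B = vis_tokens.size(0)
        lat = self.latents.unsqueeze(0).expand(B, -1, -1)
        both = torch.cat([vis_tokens, aud_tokens], dim=1)      # (B, Nv+Na, C)
        lat, _ = self.collect(lat, both, both)                 # latents gather cross-modal cues
        fused, _ = self.distribute(vis_tokens, lat, lat)       # tokens read from the bottleneck
        return vis_tokens + fused                              # cost O((Nv+Na)*L) instead of O(Nv*Na)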
ISBN (print): 9798350301298
We study the problem of human action recognition using motion capture (MoCap) sequences. Unlike existing techniques that take multiple manual steps to derive standardized skeleton representations as model input, we propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences. The model uses a hierarchical transformer with intra-frame offset attention and inter-frame self-attention. The attention mechanism allows the model to freely attend between any two vertex patches to learn non-local relationships in the spatial-temporal domain. Masked vertex modeling and future frame prediction are used as two self-supervised tasks to fully activate the bi-directional and auto-regressive attention in our hierarchical transformer. The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models on common MoCap benchmarks. Code is available at https://github.com/zgzxy001/STMT.
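Masked vertex modeling can be sketched generically as below, where the mask ratio, the zero-token corruption, and the regression target are assumptions rather than the STMT specifics; the point is that reconstructing masked vertex-patch tokens forces bi-directional attention over the whole sequence.

import torch
import torch.nn.functional as F

def masked_vertex_modeling_loss(tokens, encoder, head, mask_ratio=0.4):
    # tokens: (B, T, C) vertex-patch embeddings of a mesh sequence.
    # encoder: a bidirectional transformer over (B, T, C); head: maps back to C dims.
    B, T, C = tokens.shape
    mask = torch.rand(B, T, device=tokens.device) < mask_ratio
    corrupted = tokens.masked_fill(mask.unsqueeze(-1), 0.0)   # drop the masked patches
    pred = head(encoder(corrupted))
    return F.mse_loss(pred[mask], tokens[mask])               # reconstruct only masked positions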
ISBN (print): 9798350365474
Detecting an ingestion environment is an important aspect of monitoring dietary intake. It provides insightful information for dietary assessment. However, it is a challenging problem where human-based reviewing can be tedious, and algorithm-based review suffers from data imbalance and perceptual aliasing problems. To address these issues, we propose a neural network-based method with a two-stage training framework that tactfully combines fine-tuning and transfer learning techniques. Our method is evaluated on a newly collected dataset called "UA Free Living Study", which uses an egocentric wearable camera (the AIM-2 sensor) to simulate food consumption in free-living conditions. The proposed training framework is applied to common neural network backbones, combined with approaches from the general imbalanced classification field. Experimental results on the collected dataset show that our proposed method for automatic ingestion environment recognition successfully addresses the challenging data imbalance problem in the dataset and achieves a promising overall classification accuracy of 96.63%.
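A compact sketch of what such a two-stage recipe can look like (the backbone, learning rates, and inverse-frequency class weighting are assumptions, not the paper's exact configuration): first transfer learning with a frozen pretrained backbone and a new classification head, then full fine-tuning with a class-balanced loss to counter the imbalance.

import torch
import torch.nn as nn
import torchvision

num_classes = 9                     # hypothetical number of ingestion-environment classes
model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Stage 1: transfer learning -- freeze the pretrained backbone, train only the new head.
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True
stage1_opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# Stage 2: fine-tuning -- unfreeze everything at a small learning rate and counter the
# class imbalance with inverse-frequency weights in the loss.
class_counts = torch.tensor([500., 120., 60., 40., 30., 25., 20., 15., 10.])  # placeholder counts
criterion = nn.CrossEntropyLoss(weight=class_counts.sum() / (num_classes * class_counts))
for p in model.parameters():
    p.requires_grad = True
stage2_opt = torch.optim.Adam(model.parameters(), lr=1e-5)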
ISBN (print): 9798350301298
For any video codec, the coding efficiency highly relies on whether the current signal to be encoded can find relevant contexts among the previously reconstructed signals. Traditional codecs have verified that more contexts bring substantial coding gain, but at a considerable time cost. However, for the emerging neural video codec (NVC), its contexts are still limited, leading to a low compression ratio. To boost NVC, this paper proposes increasing the context diversity in both the temporal and spatial dimensions. First, we guide the model to learn hierarchical quality patterns across frames, which enriches the long-term yet high-quality temporal contexts. Furthermore, to tap the potential of the optical flow-based coding framework, we introduce a group-based offset diversity where cross-group interaction is proposed for better context mining. In addition, this paper also adopts a quadtree-based partition to increase spatial context diversity when encoding the latent representation in parallel. Experiments show that our codec obtains 23.5% bitrate savings over the previous SOTA NVC. Better yet, our codec surpasses ECM, the still-under-development next-generation traditional codec, in both RGB and YUV420 color spaces in terms of PSNR. The code is available at https://***/microsoft/DCVC.
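The hierarchical quality pattern can be illustrated with a small helper that assigns a larger rate-distortion weight to frames at shallow levels of a dyadic GOP, so that deeper frames can lean on high-quality long-term contexts; the dyadic structure and decay factor here are illustrative assumptions, not the trained schedule of the paper.

import math

def frame_level(idx, gop_size=8):
    # Depth of a frame in a dyadic hierarchical GOP: multiples of gop_size sit at level 0,
    # the mid-GOP frame at level 1, quarter positions at level 2, and so on.
    r = idx % gop_size
    if r == 0:
        return 0
    trailing_zeros = (r & -r).bit_length() - 1
    return int(math.log2(gop_size)) - trailing_zeros

def quality_weight(idx, gop_size=8, decay=0.7):
    # Larger weight (higher target quality) for shallower frames.
    return decay ** frame_level(idx, gop_size)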
ISBN (print): 9798350301298
The straight-through estimator (STE), which enables gradient flow over non-differentiable functions via approximation, has been favored in studies related to quantization-aware training (QAT). However, STE incurs unstable convergence during QAT, resulting in notable quality degradation at low precision. Recently, pseudo-quantization training has been proposed as an alternative approach that updates the learnable parameters using pseudo-quantization noise instead of STE. In this study, we propose a novel noise proxy-based integrated pseudo-quantization (NIPQ) that enables unified support of pseudo-quantization for both activations and weights by integrating the idea of truncation into the pseudo-quantization framework. NIPQ updates all of the quantization parameters (e.g., bit-width and truncation boundary) as well as the network parameters via gradient descent without STE instability. According to our extensive experiments, NIPQ outperforms existing quantization algorithms in various vision and language applications by a large margin.
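The pseudo-quantization idea with a learnable truncation boundary can be sketched as follows (the uniform one-step noise is the generic pseudo-quantization-noise formulation; NIPQ's exact parameterization may differ): both the weights and the boundary alpha receive ordinary gradients, so no straight-through estimator is involved.

import torch

def pseudo_quantize(w, alpha, bits=4):
    # Training-time proxy: truncate to the learnable boundary alpha (a scalar tensor,
    # e.g. nn.Parameter), then add uniform noise of one quantization step instead of rounding.
    step = alpha / (2 ** (bits - 1) - 1)
    w_trunc = torch.minimum(torch.maximum(w, -alpha), alpha)
    noise = (torch.rand_like(w) - 0.5) * step.detach()   # one LSB of uniform noise
    return w_trunc + noise

def quantize(w, alpha, bits=4):
    # Inference-time counterpart: the noise is replaced by true rounding.
    step = alpha / (2 ** (bits - 1) - 1)
    w_trunc = torch.minimum(torch.maximum(w, -alpha), alpha)
    return torch.round(w_trunc / step) * step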
ISBN (print): 9798350365474
Lateral flow tests (LFTs) enable rapid, low-cost testing for health conditions including Covid, pregnancy, HIV, and malaria. Automated readers of LFT results can yield many benefits, including empowering blind people to independently learn about their health and accelerating data entry for large-scale monitoring (e.g., for pandemics such as Covid) by using only a single photograph per LFT test. Accordingly, we explore the abilities of modern foundation vision-language models (VLMs) in interpreting such tests. To enable this analysis, we first create a new labeled dataset with hierarchical segmentations of each LFT test and its nested test result window. We call this dataset LFT-Grounding. Next, we benchmark eight modern VLMs in zero-shot settings for analyzing these images. We demonstrate that current VLMs frequently fail to correctly identify the type of LFT test, interpret the test results, locate the nested result window of the LFT tests, and recognize LFT tests when they are partially obfuscated. To facilitate community-wide progress towards automated LFT reading, we publicly release our dataset at https://***/lft_grounding_foundation_models/
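Localization of the nested result window is typically scored by intersection-over-union against the ground-truth box; the sketch below uses the common 0.5 threshold and (x1, y1, x2, y2) box format as assumptions, not necessarily the LFT-Grounding evaluation protocol.

def box_iou(box_a, box_b):
    # Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2).
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def grounding_hit(pred_box, gt_box, thresh=0.5):
    # Count a predicted result-window localization as correct when IoU >= threshold.
    return box_iou(pred_box, gt_box) >= thresh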