Search Results - Inner Mongolia University Library

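Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = subject classification number, U = all fields, Y = year (publication year, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016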
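
The field codes and operators above compose mechanically, so a query string can be assembled in code. The following is a minimal Python sketch under stated assumptions: the helper names `term`, `any_of`, and `all_of` are hypothetical and are not part of the library's system; only the field codes and the uppercase, space-delimited operators are taken from the help text. It rebuilds the first search example.

```python
# Field codes as listed in the help text above.
FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}


def term(field: str, value: str) -> str:
    """Build a single FIELD=value clause (hypothetical helper)."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}={value}"


def any_of(*clauses: str) -> str:
    """Join clauses with uppercase OR, one space on each side, in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with uppercase AND, one space on each side."""
    return " AND ".join(clauses)


# Rebuild the first search example from the help text.
query = all_of(
    any_of(term("K", "图书馆学"), term("K", "情报学")),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(query)
```

Rejecting unknown field codes early is a design choice of this sketch; how the search system itself handles an unrecognized code is not stated in the help text.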

Refine Results

Document type

6,639 conference papers
34 journal articles
5 books

Collection

6,677 electronic documents
1 print holding

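Subject classification

3,950 papers: Engineering
- 3,725 papers: Computer Science and Technology...
- 1,476 papers: Software Engineering
- 807 papers: Optical Engineering
- 323 papers: Information and Communication Engineering
- 240 papers: Control Science and Engineering
- 206 papers: Mechanical Engineering
- 169 papers: Electrical Engineering
- 85 papers: Biomedical Engineering (may be conferred...
- 73 papers: Electronic Science and Technology (may...
- 70 papers: Bioengineering
- 65 papers: Instrument Science and Technology
- 38 papers: Architecture
- 36 papers: Civil Engineering
- 34 papers: Mechanics (may be conferred in Engineering, ...
- 32 papers: Aeronautical and Astronautical Science and Tech...
- 29 papers: Safety Science and Engineering
- 23 papers: Chemical Engineering and Technology
- 21 papers: Materials Science and Engineering (may...
1,498 papers: Science
- 969 papers: Physics
- 929 papers: Mathematics
- 369 papers: Statistics (may be conferred in Science, ...
- 136 papers: Biology
- 40 papers: Systems Science
- 26 papers: Chemistry
210 papers: Medicine
- 210 papers: Clinical Medicine
- 23 papers: Basic Medicine (may be conferred in Medicine...
165 papers: Management
- 123 papers: Library, Information and Archives Manage...
- 44 papers: Management Science and Engineering (may...
- 29 papers: Business Administration
21 papers: Law
- 21 papers: Sociology
10 papers: Agriculture
9 papers: Education
6 papers: Economics
2 papers: Military Science
1 paper: Art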

Topic

2,364 papers: computer vision
848 papers: pattern recognit...
663 papers: cameras
634 papers: computer science
592 papers: face recognition
558 papers: layout
541 papers: conferences
527 papers: image segmentati...
514 papers: shape
454 papers: object recogniti...
453 papers: robustness
394 papers: humans
339 papers: feature extracti...
324 papers: training
305 papers: object detection
263 papers: image recognitio...
260 papers: application soft...
249 papers: lighting
248 papers: computational mo...
238 papers: image reconstruc...

Institution

44 papers: microsoft resear...
27 papers: department of co...
21 papers: swiss fed inst t...
21 papers: school of comput...
21 papers: carnegie mellon ...
20 papers: department of co...
19 papers: swiss fed inst t...
18 papers: department of co...
17 papers: department of in...
17 papers: the robotics ins...
17 papers: institute of com...
16 papers: univ sci & techn...
16 papers: robotics institu...
15 papers: tsinghua univ pe...
14 papers: department of el...
14 papers: center for autom...
14 papers: school of comput...
14 papers: school of comput...
13 papers: univ maryland co...
13 papers: microsoft resear...

Author

39 papers: timofte radu
28 papers: s.k. nayar
25 papers: huang thomas s.
24 papers: xiaoou tang
22 papers: t. kanade
20 papers: chellappa rama
20 papers: t.s. huang
19 papers: van gool luc
19 papers: nayar shree k.
19 papers: t. darrell
17 papers: a.k. jain
17 papers: a. zisserman
17 papers: heung-yeung shum
17 papers: jain anil k.
17 papers: zisserman andrew
16 papers: g. healey
16 papers: torralba antonio
16 papers: l. van gool
15 papers: ying wu
15 papers: m. shah

Language

6,668 papers: English
8 papers: Chinese
2 papers: Other

Search condition: "Any field = 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2003"

6,678 records in total; showing records 141-150.

Uncovering the Hidden Cost of Model Compression

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Misra, Diganta; Chaudhary, Muawiz; Goyal, Agam; Runwal, Bharat; Chen, Pin Yu

Affiliations: Carnegie Mellon Univ, Pittsburgh, PA 15213, USA; Landskape AI, Leiden, Netherlands; Mila Quebec AI Inst, Montreal, PQ, Canada; Concordia Univ, Montreal, PQ, Canada; Univ Wisconsin Madison, Madison, WI, USA; IBM Res, Yorktown Hts, NY, USA

ISBN: (print) 9798350365474

In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model compression has become indispensable in alleviating the computational burdens associated with training and deploying over-parameterized neural networks. A primary objective in model compression is to develop sparse and/or quantized models capable of matching or even surpassing the performance of their over-parameterized, full-precision counterparts. Although previous studies have explored the effects of model compression on transfer learning, its impact on visual prompting-based transfer remains unclear. This study aims to bridge this gap, shedding light on the fact that model compression detrimentally impacts the performance of visual prompting-based transfer, particularly evident in scenarios with low data volume. Furthermore, our findings underscore the adverse influence of sparsity on the calibration of downstream visual-prompted models. However, intriguingly, we also illustrate that such negative effects on calibration are not present when models are compressed via quantization. This empirical investigation underscores the need for a nuanced understanding beyond mere accuracy in sparse and quantized settings, thereby paving the way for further exploration in Visual Prompting techniques tailored for sparse and quantized models.

Keywords: Compression; Quantization; Sparsity; vision prompting

LOFI: LOng-tailed FIne-Grained Network for Food Recognition

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Rodriguez-de-Vera, Jesus M.; Estepa, Imanol G.; Bolanos, Marc; Nagarajan, Bhalaji; Radeva, Petia

Affiliations: Univ Barcelona, Barcelona, Spain; AIGecko Technol SL, Barcelona, Spain

ISBN: (print) 9798350365474

Food recognition plays a crucial role in several healthcare applications. Nevertheless, it presents significant computer vision challenges such as long-tailed and fine-grained distributions that hinder its progress. In this work, we propose LOFI, a Long-tailed Fine-grained Network aimed specifically at tackling these food recognition challenges by improving the feature learning capabilities of food recognition models. Specifically, we improve the vanilla R-CNN architecture by tailoring it for food recognition. First, we design an efficient multi-task framework for fine-grained food recognition, which exploits the lexical similarity of dishes during training to improve the discriminative ability of the network. Second, we include a Graph Confidence Propagation module based on graph neural networks to aggregate the information of overlapping detections and refine the final prediction of the network. Extensive analysis and ablations of different components of LOFI highlight that it successfully addresses the targeted problems and leads to noticeable gains in performance. Remarkably, the proposed method achieves competitive results and outperforms the current state-of-the-art methods in three public food benchmarks: UECFood-256, AiCrowd Food Challenge 2022, and UECFood-100 segmented.

Keywords: Fine-grained; Food visual recognition; Graph Neural Networks; Instance segmentation; Long-tailed; Object detection

COOD: Combined out-of-distribution detection using multiple measures for anomaly & novel class detection in large-scale hierarchical classification

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Hogeweg, Laurens E.; Gangireddy, Rajesh; Brunink, Django; Kalkman, Vincent J.; Cornelissen, Ludo; Kamminga, Jacob W.

Affiliations: Intel, Eindhoven, Netherlands; Nat Biodivers Ctr, Leiden, Netherlands; Univ Twente, Enschede, Netherlands

ISBN: (print) 9798350365474

High-performing out-of-distribution (OOD) detection, both anomaly and novel class, is an important prerequisite for the practical use of classification models. In this paper we focus on the species recognition task in images, concerned with large databases, a large number of fine-grained hierarchical classes, severe class imbalance, and varying image quality. We propose a framework for combining individual OOD measures into one combined OOD (COOD) measure using a supervised model. The individual measures are several existing state-of-the-art measures and several novel OOD measures developed with novel class detection and hierarchical class structure in mind. COOD was extensively evaluated on three large-scale (500k+ images) biodiversity datasets in the context of anomaly and novel class detection. We show that COOD outperforms individual, including state-of-the-art, OOD measures by a large margin in terms of TPR@1%FPR in the majority of experiments, e.g., improving the detection of ImageNet images (OOD) from 54.3% to 83.3% for the iNaturalist 2018 dataset. SHAP (feature contribution) analysis shows that different individual OOD measures are essential for various tasks, indicating that multiple OOD measures and combinations are needed to generalize. Additionally, we show that explicitly considering ID images that are incorrectly classified for the original (species) recognition task is important for constructing high-performing OOD detection methods and for practical applicability. The framework can easily be extended or adapted to other tasks and media modalities.

Keywords: anomaly; image recognition; novel class

HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Xu, Jonathan; Elmustafa, Amna; Weldegebriel, Liya; Negash, Emnet; Lee, Richard; Meng, Chenlin; Ermon, Stefano; Lobell, David

Affiliations: Stanford Univ, Stanford, CA 94305, USA; Univ Waterloo, Waterloo, ON, Canada; Univ Ghent, Ghent, Belgium; Mekelle Univ, Mekele, Ethiopia

ISBN: (print) 9798350365474

Small farms contribute to a large share of the productive land in developing countries. In regions such as sub-Saharan Africa, where 80% of farms are small (under 2 ha in size), the task of mapping smallholder cropland is an important part of tracking sustainability measures such as crop productivity. However, the visually diverse and nuanced appearance of small farms has limited the effectiveness of traditional approaches to cropland mapping. Here we introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems throughout the world. We present HarvestNet, a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023, collected using expert knowledge and satellite images, totaling 7k hand-labeled images and 2k ground-collected labels. We also benchmark a set of baselines, including SOTA models in remote sensing, with our best models having around 80% classification performance on hand-labeled data and 90% and 98% accuracy on ground truth data for Tigray and Amhara, respectively. We also perform a visual comparison with a widely used pre-existing coverage map and show that our model detects an extra 56,621 hectares of cropland in Tigray. We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food-insecure regions. The dataset can be accessed through https://***/s/45a7b45556b90a9a11d2, while the code for the dataset and benchmarks is publicly available at https://***/jonxuxu/harvestpiles.

Keywords: agriculture; computer vision; dataset; harvest piles; machine learning; Remote sensing; sustainability

Internal Diverse Image Completion

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Alkobi, Noa; Shaham, Tamar Rott; Michaeli, Tomer

Affiliations: Technion, Haifa, Israel; MIT; Technion, Haifa, Israel

ISBN: (print) 9798350302493

Image completion is widely used in photo restoration and editing applications, e.g. for object removal. Recently, there has been a surge of research on generating diverse completions for missing regions. However, existing methods require large training sets from a specific domain of interest, and often fail on general-content images. In this paper, we propose a diverse completion method that does not require a training set and can thus treat arbitrary images from any domain. Our internal diverse completion (IDC) approach draws inspiration from recent single-image generative models that are trained on multiple scales of a single image, adapting them to the extreme setting in which only a small portion of the image is available for training. We illustrate the strength of IDC on several datasets, using both user studies and quantitative comparisons.

Keywords: computer vision

IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Hoque, Oishee Bintey; Swarup, Samarth; Adiga, Abhijin; Nouwakpo, Sayjro Kossi; Marathe, Madhav

Affiliations: Univ Virginia, Dept Comp Sci, Charlottesville, VA 22903, USA; Univ Virginia, Biocomplex Inst, Charlottesville, VA 22903, USA; ARS USDA, Kimberly, ID, USA

ISBN: (print) 9798350365474

Irrigation mapping plays a crucial role in effective water management, essential for preserving both water quality and quantity, and is key to mitigating the global issue of water scarcity. The complexity of agricultural fields, adorned with diverse irrigation practices, especially when multiple systems coexist in close quarters, poses a unique challenge. This complexity is further compounded by the nature of Landsat's remote sensing data, where each pixel is rich with densely packed information, complicating the task of accurate irrigation mapping. In this study, we introduce an innovative approach that employs a progressive training method, which strategically increases patch sizes throughout the training process, utilizing datasets from Landsat 5 and 7, labeled with the WRLU dataset for precise labeling. This initial focus allows the model to capture detailed features, progressively shifting to broader, more general features as the patch size enlarges. Remarkably, our method enhances the performance of existing state-of-the-art models by approximately 20%. Furthermore, our analysis delves into the significance of incorporating various spectral bands into the model, assessing their impact on performance. The findings reveal that additional bands are instrumental in enabling the model to discern finer details more effectively. This work sets a new standard for leveraging remote sensing imagery in irrigation mapping.

Keywords: computer vision; Irrigation Mapping; Patch Scaling; Remote Sensing; Segmentation; Transfer learning

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Perera, Shehan; Navard, Pouyan; Yilmaz, Alper

Affiliations: Ohio State Univ, Photogrammetr Comp Vis Lab, Columbus, OH 43210, USA

ISBN: (print) 9798350365474

The adoption of Vision Transformer (ViT)-based architectures represents a significant advancement in 3D Medical Image (MI) segmentation, surpassing traditional Convolutional Neural Network (CNN) models by enhancing global contextual understanding. While this paradigm shift has significantly enhanced 3D segmentation performance, state-of-the-art architectures require extremely large and complex architectures with large-scale computing resources for training and deployment. Furthermore, in the context of limited datasets, often encountered in medical imaging, larger models can present hurdles in both model generalization and convergence. In response to these challenges and to demonstrate that lightweight models are a valuable area of research in 3D medical imaging, we present SegFormer3D, a hierarchical Transformer that calculates attention across multiscale volumetric features. Additionally, SegFormer3D avoids complex decoders and uses an all-MLP decoder to aggregate local and global attention features to produce highly accurate segmentation masks. The proposed memory-efficient Transformer preserves the performance characteristics of a significantly larger model in a compact design. SegFormer3D democratizes deep learning for 3D medical image segmentation by offering a model with 33x fewer parameters and a 13x reduction in GFLOPS compared to the current state-of-the-art (SOTA). We benchmark SegFormer3D against the current SOTA models on three widely used datasets: Synapse, BraTS, and ACDC, achieving competitive results. Code: https://***/OSUPCVLab/***

Keywords: 3D Medical Image Segmentation; ACDC; Attention; BraTS; Deep Learning; Efficient Attention; Segmentation; Synapse; Transformers; Vision Transformers

Co-designing a Sub-millisecond Latency Event-based Eye Tracking System with Submanifold Sparse CNN

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Baoheng; Gao, Yizhao; Li, Jingyuan; So, Hayden Kwok-Hay

Affiliations: Univ Hong Kong, Hong Kong, Peoples R China

ISBN: (print) 9798350365474

Eye-tracking technology is integral to numerous consumer electronics applications, particularly in the realm of virtual and augmented reality (VR/AR). These applications demand solutions that excel in three crucial aspects: low latency, low power consumption, and precision. Yet, achieving optimal performance across all these fronts presents a formidable challenge, necessitating a balance between sophisticated algorithms and efficient backend hardware implementations. In this study, we tackle this challenge through a synergistic software/hardware co-design of the system with an event camera. Leveraging the inherent sparsity of event-based input data, we integrate a novel sparse FPGA dataflow accelerator customized for submanifold sparse convolution neural networks (SCNN). The SCNN implemented on the accelerator can efficiently extract the embedding feature vector from each representation of event slices by only processing the non-zero activations. Subsequently, these vectors undergo further processing by a gated recurrent unit (GRU) and a fully connected layer on the host CPU to generate the eye centers. Deployment and evaluation of our system reveal outstanding performance metrics. On the Event-based Eye-Tracking-AIS2024 dataset, our system achieves 81% p5 accuracy, 99.5% p10 accuracy, and 3.71 Mean Euclidean Distance with 0.7 ms latency while only consuming 2.29 mJ per inference. Notably, our solution opens up opportunities for future eye-tracking systems. Code is available at https://***/CASRHKU/ESDA/tree/eye_tracking.

Keywords: dynamic vision sensor; event camera; event-based vision; eye tracking; FPGA; hardware-software codesign; sparse processing

Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Cantrill, Sam; Ahmedt-Aristizabal, David; Petersson, Lars; Suominen, Hanna; Armin, Mohammad Ali

Affiliations: Australian Natl Univ, Canberra, ACT, Australia; Commonwealth & Sci Ind Res Org, Data61, Canberra, ACT, Australia; Univ Turku, Turku, Finland

ISBN: (print) 9798350365474

Camera-based remote photoplethysmography (rPPG) enables contactless measurement of important physiological signals such as pulse rate (PR). However, dynamic and unconstrained subject motion introduces significant variability into the facial appearance in video, confounding the ability of video-based methods to accurately extract the rPPG signal. In this study, we leverage the 3D facial surface to construct a novel orientation-conditioned facial texture video representation which improves the motion robustness of existing video-based facial rPPG estimation methods. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD over our baseline using the PhysNet model trained on PURE, highlighting the efficacy and generalization benefits of our designed video representation. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios in cross-dataset testing on MMPD, even in the presence of dynamic and unconstrained subject motion. This emphasizes the benefits of disentangling motion through modeling the 3D facial surface for motion-robust facial rPPG estimation. We validate the efficacy of our design decisions and the impact of different video processing steps through an ablation study. Our findings illustrate the potential strengths of exploiting the 3D facial surface as a general strategy for addressing dynamic and unconstrained subject motion in videos. The code is available at https://***/orientation-uv-rppg/.

Keywords: computer vision; motion; rppg

Multi-Modal Fusion of Event and RGB for Monocular Depth Estimation Using a Unified Transformer-based Architecture

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Devulapally, Anusha; Khan, Md Fahim Faysal; Advani, Siddharth; Narayanan, Vijaykrishnan

Affiliations: Penn State Univ, University Pk, PA 16802, USA; Samsung Elect Amer, Ridgefield Pk, NJ, USA

ISBN: (print) 9798350365474

In the field of robotics and autonomous navigation, accurate pixel-level depth estimation has gained significant importance. Event cameras, or dynamic vision sensors, capture asynchronous changes in brightness at the pixel level, offering benefits such as high temporal resolution, no motion blur, and a wide dynamic range. However, unlike traditional cameras that measure absolute intensity, event cameras lack the ability to provide scene context. Efficiently combining the advantages of both asynchronous events and synchronous RGB images to enhance depth estimation remains a challenge. In our study, we introduce a unified transformer that combines both event and RGB modalities to achieve precise depth prediction. In contrast to individual transformers for input modalities, a unified transformer model captures inter-modal dependencies and uses self-attention to enhance event-RGB contextual interactions. This approach exceeds the performance of recurrent neural network (RNN) methods used in state-of-the-art models. To encode the temporal information from events, convLSTMs are used before the transformer to improve depth estimation. Our proposed architecture outperforms the existing approaches in terms of absolute mean depth error, achieving state-of-the-art results in most cases. Additionally, the improvement over existing approaches is also seen in other metrics such as RMSE, absolute relative difference, and depth thresholds. The source code is available at: https://***/anusha-devulapally/ER-F2D.

Keywords: Event Cameras; Monocular Depth Estimation; Multi-Modal Fusion; Vision Transformer
