检索结果-内蒙古大学图书馆

您好，读者！请登录

内蒙古大学图书馆

首页
概况
党建
资源
服务
科研支持
- 论文收录引用证明
- 科技查新
知识产权
档案馆
帮助

咨询与建议

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

您的常用邮箱：*

您的手机号码：*

问题描述：

当前已输入0个字，您还可以输入200个字

全部搜索
期刊论文
图书
学位论文
标准
纸本馆藏
外文资源发现
数据库导航
超星发现

高级检索

时间限定

出版年份：

文献类型

图书期刊文献学位论文多媒体

馆藏选择

电子馆藏纸本馆藏

核心期刊

全部期刊 SCI 收录期刊 SSCI 收录期刊 EI 收录期刊 CSCD 收录期刊 CSSCI 收录期刊

语言

中文英文

文献类型

期刊文献图书学位论文标准纸本馆藏

帮助

文字说明：

T=题名（书名、题名），A=作者（责任者），K=主题词，P=出版物名称，PU=出版社名称，O=机构（作者单位、学位授予单位、专利申请人），L=中图分类号，C=学科分类号，U=全部字段，Y=年（出版发行年、学位年度、标准发布年）

检索规则说明：

AND代表“并且”；OR代表“或者”；NOT代表“不包含”；(注意必须大写,运算符两边需空一格)

检索范例：

范例一：(K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
范例二：P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016

分类表

所选分类

>> <<

限定检索结果

文献类型

4,477 篇 会议
9 篇 期刊文献
5 册 图书

馆藏范围

4,491 篇 电子文献
0 种 纸本馆藏

日期分布

学科分类号

2,329 篇 工学
- 1,912 篇 计算机科学与技术...
- 541 篇 软件工程
- 417 篇 机械工程
- 327 篇 光学工程
- 269 篇 控制科学与工程
- 216 篇 仪器科学与技术
- 117 篇 信息与通信工程
- 99 篇 电气工程
- 79 篇 生物工程
- 50 篇 生物医学工程（可授...
- 34 篇 电子科学与技术（可...
- 25 篇 安全科学与工程
- 21 篇 化学工程与技术
- 16 篇 建筑学
- 15 篇 交通运输工程
- 14 篇 土木工程
489 篇 理学
- 327 篇 物理学
- 194 篇 数学
- 83 篇 生物学
- 79 篇 统计学（可授理学、...
- 23 篇 系统科学
- 18 篇 化学
206 篇 艺术学
- 206 篇 设计学（可授艺术学...
67 篇 管理学
- 48 篇 图书情报与档案管...
- 19 篇 管理科学与工程(可...
- 10 篇 工商管理
45 篇 医学
- 45 篇 临床医学
- 13 篇 基础医学(可授医学...
- 11 篇 药学(可授医学、理...
20 篇 法学
- 18 篇 社会学
7 篇 农学
4 篇 教育学
1 篇 经济学
1 篇 文学
1 篇 军事学

主题

1,834 篇 computer vision
890 篇 conferences
696 篇 pattern recognit...
656 篇 training
472 篇 cameras
381 篇 feature extracti...
375 篇 computational mo...
341 篇 visualization
314 篇 computer archite...
285 篇 image segmentati...
259 篇 face recognition
231 篇 object detection
230 篇 robustness
208 篇 shape
193 篇 three-dimensiona...
184 篇 humans
176 篇 neural networks
169 篇 semantics
166 篇 computer science
157 篇 benchmark testin...

机构

21 篇 swiss fed inst t...
19 篇 swiss fed inst t...
18 篇 university of sc...
17 篇 univ sci & techn...
17 篇 carnegie mellon ...
15 篇 institute for co...
14 篇 tsinghua univers...
13 篇 computer vision ...
13 篇 tsinghua univ pe...
13 篇 stanford univ st...
12 篇 harbin inst tech...
12 篇 mit cambridge ma...
12 篇 sun yat sen univ...
12 篇 carnegie mellon ...
11 篇 chinese univ hon...
11 篇 megvii technol p...
11 篇 chinese acad sci...
10 篇 comp vis ctr bar...
10 篇 univ modena & re...
10 篇 beihang univ peo...

作者

57 篇 timofte radu
20 篇 luc van gool
20 篇 radu timofte
17 篇 horst bischof
16 篇 van gool luc
15 篇 sergio escalera
12 篇 zhigang zhu
12 篇 li stan z.
12 篇 chen wei-ting
12 篇 bischof horst
12 篇 lei lei
11 篇 fan haoqiang
11 篇 sun jian
11 篇 marcos v. conde
11 篇 lei zhen
10 篇 escalera sergio
10 篇 cucchiara rita
10 篇 zhang lei
10 篇 angel d. sappa
10 篇 liu shuaicheng

语言

4,486 篇 英文
4 篇 中文
1 篇 其他

检索条件"任意字段=2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2013"

共 4491 条记录，以下是101-110 订阅

全选清除本页清除全部题录导出标记到"检索档案"

详细简洁

排序：

DSTCFuse: A Method based on Dual-cycled Cross-awareness of Structure Tensor for Semantic Segmentation via Infrared and Visible Image Fusion

DSTCFuse: A Method based on Dual-cycled Cross-awareness of S...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Li, Xuan Chen, Rongfu Wang, Jie Ma, Lei Cheng, Li Yuan, Haiwen Wuhan Inst Technol Sch Elect & Informat Engn Wuhan Peoples R China Hubei Key Lab Opt Informat & Pattern Recognit Wuhan Peoples R China

ISBN: (纸本)9798350365474

Multi-modality information fusion can compensate deficiencies of single modality and provide rich scene information for 2D semantic segmentation. However, the inconsistency in the feature space between different modalities may lead to poor presentation of objects and that would affect subsequent segmented effectiveness. The idea of modal transition can reduce the modal differences and avoid biased processing during the fusion process, but it is hard to perfectly retain the contents of the source images. To address these challenges, a fusion method based on dual-cycled cross-awareness of structure tensor is proposed. Firstly, we propose a dual-cycle modality transition network based on cross-awareness consistency to learn the differences in feature space from different modalities. Secondly, a set of globally structure-tensor preserving modules are designed to enhance the capabilities of the network in capturing complementary features and perceiving global modal consistency. Under the joint constraint of globally structure-tensor awareness loss and cross-awareness loss, our network achieves a robust mapping of feature space from visible to pseudo-infrared images without relying on Ground-Truth. Finally, the pseudo-infrared images that inherit the superior qualities of two modalities are fused with the original infrared images directly, which effectively reduces the complexity of fusion. Extensive comparative experiments show that our method outperforms state-of-the-art methods in qualitative and quantitative evaluation.

关键词： Cross-modality transition Global structure-tensor awareness Image fusion

来源：评论

学校读者我要写书评

暂无评论

Uncovering the Hidden Cost of Model Compression

Uncovering the Hidden Cost of Model Compression

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Misra, Diganta Chaudhary, Muawiz Goyal, Agam Runwal, Bharat Chen, Pin Yu Carnegie Mellon Univ Pittsburgh PA 15213 USA Landskape AI Leiden Netherlands Mila Quebec AI Inst Montreal PQ Canada Concordia Univ Montreal PQ Canada Univ Wisconsin Madison Madison WI USA IBM Res Yorktown Hts NY USA

ISBN: (纸本)9798350365474

In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model compression has become indispensable in alleviating the computational burdens associated with training and deploying over-parameterized neural networks. A primary objective in model compression is to develop sparse and/or quantized models capable of matching or even surpassing the performance of their over-parameterized, full-precision counterparts. Although previous studies have explored the effects of model compression on transfer learning, its impact on visual prompting-based transfer remains unclear. This study aims to bridge this gap, shedding light on the fact that model compression detrimentally impacts the performance of visual prompting-based transfer, particularly evident in scenarios with low data volume. Furthermore, our findings underscore the adverse influence of sparsity on the calibration of downstream visual-prompted models. However, intriguingly, we also illustrate that such negative effects on calibration are not present when models are compressed via quantization. This empirical investigation underscores the need for a nuanced understanding beyond mere accuracy in sparse and quantized settings, thereby paving the way for further exploration in Visual Prompting techniques tailored for sparse and quantized models.

关键词： Compression Quantization Sparsity vision prompting

来源：评论

学校读者我要写书评

暂无评论

Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation

Orientation-conditioned Facial Texture Mapping for Video-bas...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Cantrill, Sam Ahmedt-Aristizabal, David Petersson, Lars Suominen, Hanna Armin, Mohammad Ali Australian Natl Univ Canberra ACT Australia Commonwealth & Sci Ind Res Org Data61 Canberra ACT Australia Univ Turku Turku Finland

ISBN: (纸本)9798350365474

Camera-based remote photoplethysmography (rPPG) enables contactless measurement of important physiological signals such as pulse rate (PR). However, dynamic and unconstrained subject motion introduces significant variability into the facial appearance in video, confounding the ability of video-based methods to accurately extract the rPPG signal. In this study, we leverage the 3D facial surface to construct a novel orientation-conditioned facial texture video representation which improves the motion robustness of existing video-based facial rPPG estimation methods. Our proposed method achieves a significant 18.2% performance improvement in cross-dataset testing on MMPD over our baseline using the PhysNet model trained on PURE, highlighting the efficacy and generalization benefits of our designed video representation. We demonstrate significant performance improvements of up to 29.6% in all tested motion scenarios in cross-dataset testing on MMPD, even in the presence of dynamic and unconstrained subject motion. Emphasizing the benefits the benefits of disentangling motion through modeling the 3D facial surface for motion robust facial rPPG estimation. We validate the efficacy of our design decisions and the impact of different video processing steps through an ablation study. Our findings illustrate the potential strengths of exploiting the 3D facial surface as a general strategy for addressing dynamic and unconstrained subject motion in videos. The code is available at https://***/orientation-uv-rppg/.

关键词： computer vision motion rppg

来源：评论

学校读者我要写书评

暂无评论

HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing

HarvestNet: A Dataset for Detecting Smallholder Farming Acti...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Xu, Jonathan Elmustafa, Amna Weldegebriel, Liya Negash, Emnet Lee, Richard Meng, Chenlin Ermon, Stefano Lobell, David Stanford Univ Stanford CA 94305 USA Univ Waterloo Waterloo ON Canada Univ Ghent Ghent Belgium Mekelle Univ Mekele Ethiopia

ISBN: (纸本)9798350365474

Small farms contribute to a large share of the productive land in developing countries. In regions such as subSaharan Africa, where 80% of farms are small (under 2 ha in size), the task of mapping smallholder cropland is an important part of tracking sustainability measures such as crop productivity. However, the visually diverse and nuanced appearance of small farms has limited the effectiveness of traditional approaches to cropland mapping. Here we introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems throughout the world. We present HarvestNet, a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023, collected using expert knowledge and satellite images, totaling 7k hand-labeled images and 2k ground-collected labels. We also benchmark a set of baselines, including SOTA models in remote sensing, with our best models having around 80% classification performance on hand labelled data and 90% and 98% accuracy on ground truth data for Tigray and Amhara, respectively. We also perform a visual comparison with a widely used pre-existing coverage map and show that our model detects an extra 56,621 hectares of cropland in Tigray. We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food insecure regions. The dataset can be accessed through https://***/s/45a7b45556b90a9a11d2, while the code for the dataset and benchmarks is publicly available at https://***/jonxuxu/harvestpiles.

关键词： agriculture computer vision dataset harvest piles machine learning Remote sensing sustainability

来源：评论

学校读者我要写书评

暂无评论

ViTKD: Feature-based Knowledge Distillation for vision Transformers

ViTKD: Feature-based Knowledge Distillation for Vision Trans...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Yang, Zhendong Li, Zhe Zeng, Ailing Li, Zexian Yu, Chun Yu, Liming Tsinghua Shenzhen Int Grad Sch Shenzhen Peoples R China Int Digital Econ Acad IDEA Shenzhen Peoples R China Chinese Acad Sci Inst Automat Beijing Peoples R China Beihang Univ Beijing Peoples R China IDEA Shenzhen Peoples R China

ISBN: (纸本)9798350365474

Knowledge Distillation (KD) has been extensively studied as a means to enhance the performance of smaller models in Convolutional Neural Networks (CNNs). Recently, the vision Transformer (ViT) has demonstrated remarkable success in various computer vision tasks, leading to an increased demand for KD in ViT. However, while logit-based KD has been applied to ViT, other feature-based KD methods for CNNs cannot be directly implemented due to the significant structure gap. In this paper, we conduct an analysis of the properties of different feature layers in ViT to identify a method for feature-based ViT distillation. Our findings reveal that both shallow and deep layers in ViT are equally important for distillation and require distinct distillation strategies. Based on these guidelines, we propose our feature-based method ViTKD, which mimics the shallow layers and generates the deep layer in the teacher. ViTKD leads to consistent and significant improvements in the students. On ImageNet-1K, we achieve performance boosts of 1.64% for DeiT-Tiny, 1.40% for DeiT-Small, and 1.70% for DeiT-Base. Downstream tasks also demonstrate the superiority of ViTKD. Additionally, ViTKD and logit-based KD are complementary and can be applied together directly, further enhancing the student's performance. Specifically, DeiT-T, S, and B achieve accuracies of 77.78%, 83.59%, and 85.41%, respectively, using this combined approach. Code is available at https:// github. com/ yzdv/cls_KD.

关键词： Convolutional neural networks

来源：评论

学校读者我要写书评

暂无评论

Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression recognition

Multi-Task Multi-Modal Self-Supervised Learning for Facial E...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Halawa, Marah Blume, Florian Bideau, Pia Maier, Martin Rahman, Rasha Abdel Hellwich, Olaf Tech Univ Berlin Berlin Germany Univ Grenoble Alpes Grenoble INP CNRS INRIALJK Grenoble France Humboldt Univ Berlin Germany Res Cluster Excellence Sci Intelligence Berlin Germany

ISBN: (纸本)9798350365474

Human communication is multi-modal;e.g., face-to-face interaction involves auditory signals (speech) and visual signals (face movements and hand gestures). Hence, it is essential to exploit multiple modalities when designing machine learning-based facial expression recognition systems. In addition, given the ever-growing quantities of video data that capture human facial expressions, such systems should utilize raw unlabeled videos without requiring expensive annotations. Therefore, in this work, we employ a multitask multi-modal self-supervised learning method for facial expression recognition from in-the-wild video data. Our model combines three self-supervised objective functions: First, a multi-modal contrastive loss, that pulls diverse data modalities of the same video together in the representation space. Second, a multi-modal clustering loss that preserves the semantic structure of input data in the representation space. Finally, a multi-modal data reconstruction loss. We conduct a comprehensive study on this multimodal multi-task self-supervised learning method on three facial expression recognition benchmarks. To that end, we examine the performance of learning through different combinations of self-supervised tasks on the facial expression recognition downstream task. Our model ConCluGen outperforms several multi-modal self-supervised and fully supervised baselines on the CMU-MOSEI dataset. Our results generally show that multi-modal self-supervision tasks offer large performance gains for challenging tasks such as facial expression recognition, while also reducing the amount of manual annotations required. We release our pre-trained models as well as source code publicly (1) .

关键词： computer vision facial expression recognition multi-modal representation learning self-supervised

来源：评论

学校读者我要写书评

暂无评论

Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-trained vision Transformers

Calibrating Higher-Order Statistics for Few-Shot Class-Incre...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Goswami, Dipam Twardowski, Bartlomiej van de Weijer, Joost Univ Autonoma Barcelona Dept Comp Sci Barcelona Spain Comp Vis Ctr Barcelona Spain IDEAS NCBR Warsaw Poland

ISBN: (纸本)9798350365474

Few-shot class-incremental learning (FSCIL) aims to adapt the model to new classes from very few data (5 samples) without forgetting the previously learned classes. Recent works in many-shot CIL (MSCIL) (using all available training data) exploited pre-trained models to reduce forgetting and achieve better plasticity. In a similar fashion, we use ViT models pre-trained on large-scale datasets for few-shot settings, which face the critical issue of low plasticity. FSCIL methods start with a many-shot first task to learn a very good feature extractor and then move to the few-shot setting from the second task onwards. While the focus of most recent studies is on how to learn the many-shot first task so that the model generalizes to all future few-shot tasks, we explore in this work how to better model the few-shot data using pre-trained models, irrespective of how the first task is trained. Inspired by recent works in MSCIL, we explore how using higher-order feature statistics can influence the classification of few-shot classes. We identify the main challenge of obtaining a good covariance matrix from few-shot data and propose to calibrate the covariance matrix for new classes based on semantic similarity to the many-shot base classes. Using the calibrated feature statistics in combination with existing methods significantly improves few-shot continual classification on several FSCIL benchmarks. Code is available at https://***/dipamgoswami/FSCIL-Calibration.

关键词： Continual Learning Few-Shot Class-Incremental Learning

来源：评论

学校读者我要写书评

暂无评论

NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer

NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mou...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Manne, Sai Kumar Reddy Martin, Brendan Roy, Tyler Neilson, Ryan Peters, Rebecca Chillara, Meghana Lary, Christine W. Motyl, Katherine J. Wan, Michael Northeastern Univ Boston MA 02115 USA MaineHlth Inst Res Scarborough ME USA Univ Maine Orono ME 04469 USA Tufts Univ Sch Med Medford MA 02155 USA

ISBN: (纸本)9798350365474

Osteoclast cell image analysis plays a key role in osteoporosis research, but it typically involves extensive manual image processing and hand annotations by a trained expert. In the last few years, a handful of machine learning approaches for osteoclast image analysis have been developed, but none have addressed the full instance segmentation task required to produce the same output as that of the human expert led process. Furthermore, none of the prior, fully automated algorithms have publicly available code, pretrained models, or annotated datasets, inhibiting reproduction and extension of their work. We present a new dataset with similar to 2 x 10(5) expert annotated mouse osteoclast masks, together with a deep learning instance segmentation method which works for both in vitro mouse osteoclast cells on plastic tissue culture plates and human osteoclast cells on bone chips. To our knowledge, this is the first work to automate the full osteoclast instance segmentation task. Our method achieves a performance of 0.82 mAP(0.5) (mean average precision at intersection-over-union threshold of 0.5) in cross validation for mouse osteoclasts. We present a novel nuclei-aware osteoclast instance segmentation training strategy (NOISe) based on the unique biology of osteoclasts, to improve the model's generalizability and boost the mAP(0.5) from 0.60 to 0.82 on human osteoclasts. We publish our annotated mouse osteoclast image dataset, instance segmentation models, and code at ***/michaelwwan/noise to enable reproducibility and to provide a public tool to accelerate osteoporosis research(1).

关键词： computer vision instance segmentation medical imaging microscope image analysis osteoporosis

来源：评论

学校读者我要写书评

暂无评论

GAN-based vision Transformer for High-Quality Thermal Image Enhancement

GAN-based Vision Transformer for High-Quality Thermal Image ...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Marnissi, Mohamed Amine Fathallah, Abir Univ Sfax Ecole Natl Ingn Sfax Sfax 3038 Tunisia Inst Polytech Paris Samovar CNRS Telecom SudParis 9 Rue Charles Fourier F-91011 Evry France

ISBN: (纸本)9798350302493

Generative Adversarial Networks (GANs) have shown an outstanding ability to generate high-quality images with visual realism and similarity to real images. This paper presents a new architecture for thermal image enhancement. Precisely, the strengths of architecture-based vision transformers and generative adversarial networks are exploited. The thermal loss feature introduced in our approach is specifically used to produce high-quality images. Thermal image enhancement also relies on fine-tuning based on visible images, resulting in an overall improvement in image quality. A visual quality metric was used to evaluate the performance of the proposed architecture. Significant improvements were found over the original thermal images and other enhancement methods established on a subset of the KAIST dataset. The performance of the proposed enhancement architecture is also verified on the detection results by obtaining better performance with a considerable margin regarding different versions of the YOLO detector.

关键词： Image enhancement

来源：评论

学校读者我要写书评

暂无评论

The Casual Conversations v2 Dataset A diverse, large benchmark for measuring fairness and robustness in audio/vision/speech models

The Casual Conversations v2 Dataset A diverse, large benchma...

引用

ieee/CVF conference on computer vision and pattern recognition (CVPR)

作者： Porgali, Bilal Albiero, Vitor Ryda, Jordan Ferrer, Cristian Canton Hazirbas, Caner Meta AI Menlo Pk CA 94025 USA

ISBN: (纸本)9798350302493

This paper introduces a new large consent-driven dataset aimed at assisting in the evaluation of algorithmic bias and robustness of computer vision and audio speech models in regards to 11 attributes that are self-provided or labeled by trained annotators. The dataset includes 26,467 videos of 5,567 unique paid participants, with an average of almost 5 videos per person, recorded in Brazil, India, Indonesia, Mexico, Vietnam, Philippines, and the USA, representing diverse demographic characteristics. The participants agreed for their data to be used in assessing fairness of AI models and provided self-reported age, gender, language/dialect, disability status, physical adornments, physical attributes and geo-location information, while trained annotators labeled apparent skin tone using the Fitzpatrick Skin Type and Monk Skin Tone scales, and voice timbre. Annotators also labeled for different recording setups and per-second activity annotations.

关键词： Large dataset

来源：评论

学校读者我要写书评

暂无评论

没有更多数据了...

全选清除本页清除全部题录导出标记到“检索档案”

共450页 << < 7 8 9 10 11 12 13 14 15 16 > >>

检索报告对象比较合并检索0

隐藏清空

合并搜索

回到顶部

执行限定条件

内容：

评分：

请选择保存的检索档案：

请选择收藏分类：

订阅名称：

通借通还

温馨提示：

图书名称：

借书校区：

取书校区：

手机号码：

邮箱地址：

一卡通帐号：

电话和邮箱必须正确填写，我们会与您联系确认。

联系人：

所在院系：

联系邮箱：

联系电话：

内蒙古自治区呼和浩特市赛罕区大学西街235号邮编: 010021

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：