ISBN (Print): 9798350301298
Audiovisual automatic speech recognition (AV-ASR) aims to improve the robustness of a speech recognition system by incorporating visual information. Training fully supervised multimodal models for this task from scratch, however, is limited by the need for large labelled audiovisual datasets (in each downstream domain of interest). We present AVFormer, a simple method for augmenting audio-only models with visual information while simultaneously performing lightweight domain adaptation. We do this by (i) injecting visual embeddings into a frozen ASR model using lightweight trainable adaptors. We show that these can be trained on a small amount of weakly labelled video data with minimal additional training time and parameters. (ii) We also introduce a simple curriculum scheme during training which we show is crucial to enable the model to jointly process audio and visual information effectively; and finally (iii) we show that our model achieves state-of-the-art zero-shot results on three different AV-ASR benchmarks (How2, VisSpeech and Ego4D), while also crucially preserving decent performance on traditional audio-only speech recognition benchmarks (LibriSpeech). Qualitative results show that our model effectively leverages visual information for robust speech recognition.
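For intuition, the adapter idea can be sketched in a few lines of PyTorch. This is a minimal illustration under our own assumptions (the module name, bottleneck design, and dimensions are hypothetical, not AVFormer's released architecture): visual features are projected through a small trainable bottleneck and prepended to the frozen ASR encoder's audio tokens.

```python
import torch
import torch.nn as nn

class VisualAdapter(nn.Module):
    """Hypothetical lightweight adapter: projects visual features into
    the ASR embedding space and prepends them as extra tokens."""
    def __init__(self, visual_dim: int, asr_dim: int, bottleneck: int = 64):
        super().__init__()
        # A bottleneck keeps the trainable parameter count small.
        self.project = nn.Sequential(
            nn.Linear(visual_dim, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, asr_dim),
        )

    def forward(self, visual_feats, audio_tokens):
        # visual_feats: (B, N_v, visual_dim); audio_tokens: (B, N_a, asr_dim)
        visual_tokens = self.project(visual_feats)
        return torch.cat([visual_tokens, audio_tokens], dim=1)

# Only the adapter is trained; the ASR backbone stays frozen.
adapter = VisualAdapter(visual_dim=768, asr_dim=512)
fused = adapter(torch.randn(2, 4, 768), torch.randn(2, 100, 512))
print(fused.shape)  # torch.Size([2, 104, 512])
```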
ISBN (Print): 9798350370287
The proceedings contain 123 papers. The topics discussed include: the SARFish dataset and challenge; NORPPA: NOvel ringed seal re-identification by pelage pattern aggregation; multiple toddler tracking in indoor videos; challenges in video-based infant action recognition: a critical examination of the state of the art; KABR: in-situ dataset for Kenyan animal behavior recognition from drone videos; the hitchhiker's guide to endangered species pose estimation; efficient domain adaptation via generative prior for 3D infant pose estimation; dynamic gaussian splatting from markerless motion capture reconstruct infants movements; neural texture puppeteer: a framework for neural geometry and texture rendering of articulated shapes, enabling re-identification at interactive speed; and DigiDogs: single-view 3D pose estimation of dogs using synthetic training data.
ISBN (Print): 9781665487399
Due to the success of generative flows in modeling data distributions, they have been explored for inverse problems. Given a pre-trained generative flow, previous work proposed to minimize the 2-norm of the latent variables as a regularization term. The intuition behind it was to ensure high-likelihood latent variables that produce the closest restoration. However, high-likelihood latent variables may generate unrealistic samples, as we show in our experiments. We therefore propose a solver to directly produce high-likelihood reconstructions. We hypothesize that our approach could make generative flows a general-purpose solver for inverse problems. Furthermore, we propose 1x1 coupling functions to introduce permutations in a generative flow. These have the advantage that their inverse does not need to be calculated during generation. Finally, we evaluate our method for denoising, deblurring, inpainting, and colorization. We observe a compelling improvement of our method over prior works.
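To make the baseline being criticized concrete: given a pre-trained flow decoder f and a degradation operator A, prior work optimizes the latents z to minimize ||A f(z) - y||^2 + lambda * ||z||^2. The sketch below uses identity placeholders and hypothetical function names; it illustrates that objective, not the paper's proposed solver.

```python
import torch

def flow_inverse_objective(z, y, forward_op, flow_decode, lam=1e-3):
    """Data fidelity in image space plus a 2-norm penalty on the latents
    (the prior-work regularizer described above)."""
    x = flow_decode(z)                        # restoration generated from z
    fidelity = ((forward_op(x) - y) ** 2).sum()
    prior = lam * (z ** 2).sum()              # favors high-likelihood latents
    return fidelity + prior

# Optimize z by gradient descent (identity placeholders for A and f).
z = torch.randn(1, 3, 32, 32, requires_grad=True)
y = torch.randn(1, 3, 32, 32)
opt = torch.optim.Adam([z], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = flow_inverse_objective(z, y, forward_op=lambda x: x,
                                  flow_decode=lambda z: z)
    loss.backward()
    opt.step()
```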
ISBN (Print): 9798350301298
Creativity is an indispensable part of human cognition and also an inherent part of how we make sense of the world. Metaphorical abstraction is fundamental in communicating creative ideas through nuanced relationships between abstract concepts such as feelings. While computer vision benchmarks and approaches predominantly focus on understanding and generating literal interpretations of images, metaphorical comprehension of images remains relatively unexplored. Towards this goal, we introduce MetaCLUE, a set of vision tasks on visual metaphor. We also collect high-quality and rich metaphor annotations (abstract objects, concepts, and relationships, along with their corresponding object boxes), as no existing datasets facilitate the evaluation of these tasks. We perform a comprehensive analysis of state-of-the-art models in vision and language based on our annotations, highlighting strengths and weaknesses of current approaches in visual metaphor classification, localization, understanding (retrieval, question answering, captioning) and generation (text-to-image synthesis) tasks. We hope this work provides a concrete step towards developing AI systems with human-like creative capabilities. Project page: https://***
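As a rough illustration of how one of the understanding tasks (retrieval) might be scored, the snippet below computes a generic recall@k over image and metaphor-caption embeddings. This is our own sketch, not MetaCLUE's official protocol; the embeddings are assumed to come from any CLIP-style vision-language encoder.

```python
import torch
import torch.nn.functional as F

def recall_at_k(image_embs, text_embs, k=5):
    """For each metaphor caption, rank all images by cosine similarity
    and check whether its paired image (same index) is in the top-k."""
    image_embs = F.normalize(image_embs, dim=-1)
    text_embs = F.normalize(text_embs, dim=-1)
    sims = text_embs @ image_embs.T            # (N_text, N_image)
    topk = sims.topk(k, dim=-1).indices
    targets = torch.arange(len(text_embs)).unsqueeze(1)
    return (topk == targets).any(dim=1).float().mean().item()

# Random placeholder embeddings; real ones would come from a VLM encoder.
print(recall_at_k(torch.randn(100, 512), torch.randn(100, 512)))
```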
ISBN (Print): 9798350301298
Radiology report generation aims to automatically generate a clinically accurate and coherent paragraph from an X-ray image, which could relieve radiologists of the heavy burden of report writing. Although various image captioning methods have shown remarkable performance on natural images, generating accurate reports for medical images requires knowledge of multiple modalities, including vision, language, and medical terminology. We propose a Knowledge-injected U-Transformer (KiUT) to learn multi-level visual representations and adaptively distill the information with contextual and clinical knowledge for word prediction. In detail, a U-connection schema between the encoder and decoder is designed to model interactions between different modalities, and a symptom graph and an injected knowledge distiller are developed to assist report generation. Experimentally, we outperform state-of-the-art methods on two widely used benchmark datasets: IU-Xray and MIMIC-CXR. Further experimental results prove the advantages of our architecture and the complementary benefits of the injected knowledge.
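The U-connection schema can be pictured as U-Net-style skip connections across a transformer encoder-decoder: decoder layer i cross-attends to the output of its mirrored encoder layer rather than only to the final encoder output. The sketch below is our reading of that idea with hypothetical names and sizes, not the released KiUT code.

```python
import torch
import torch.nn as nn

class UConnectionDecoderLayer(nn.Module):
    """Decoder layer that cross-attends to a skip-connected
    (mirrored) encoder level instead of the last encoder output."""
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, tgt, mirrored_enc_out):
        tgt = tgt + self.self_attn(tgt, tgt, tgt)[0]
        # U-connection: attend to the mirrored encoder level's features.
        tgt = tgt + self.cross_attn(tgt, mirrored_enc_out, mirrored_enc_out)[0]
        return tgt + self.ffn(tgt)

layer = UConnectionDecoderLayer(dim=256)
out = layer(torch.randn(2, 20, 256), torch.randn(2, 49, 256))
print(out.shape)  # torch.Size([2, 20, 256])
```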
ISBN (Print): 9798350301298
Image recognition on expert domains is usually fine-grained and requires expert labeling, which is costly. This limits dataset sizes and the accuracy of learning systems. To address this challenge, we consider annotating expert data with crowdsourcing, denoted as PrOfeSsional lEvel cRowd (POSER) annotation. A new approach, based on semi-supervised learning (SSL) and denoted SSL with human filtering (SSL-HF), is proposed. It is a human-in-the-loop SSL method, where crowd workers act as filters of pseudo-labels, replacing the unreliable confidence thresholding used by state-of-the-art SSL methods. To enable annotation by non-experts, classes are specified implicitly, via positive and negative sets of examples, and augmented with deliberative explanations, which highlight regions of class ambiguity. In this way, SSL-HF leverages the strong low-shot learning and confidence estimation abilities of humans to create an intuitive but effective labeling experience. Experiments show that SSL-HF significantly outperforms various alternative approaches on several benchmarks.
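The core loop is easy to state in pseudocode: the model proposes pseudo-labels, and crowd workers accept or reject them in place of a confidence threshold. Below is a minimal sketch under our own assumptions; `model.fit`/`model.predict` and `ask_crowd(image, pseudo_label) -> bool` are hypothetical stand-ins for the training routine and annotation interface.

```python
def ssl_with_human_filtering(model, labeled, unlabeled, ask_crowd, rounds=3):
    """Human-in-the-loop SSL: crowd workers filter pseudo-labels,
    replacing automatic confidence thresholding."""
    for _ in range(rounds):
        model.fit(labeled)                      # train on trusted data
        remaining = []
        for image in unlabeled:
            pseudo_label = model.predict(image)
            if ask_crowd(image, pseudo_label):  # human accepts the label
                labeled.append((image, pseudo_label))
            else:
                remaining.append(image)         # stays unlabeled
        unlabeled = remaining
    return model
```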
ISBN (Print): 9798350307443
Image recognition has recently witnessed a paradigm shift, where vision-language models are now used to perform few-shot classification based on textual prompts. Among these, the CLIP model has shown remarkable capabilities for zero-shot transfer by matching an image and a custom textual prompt in its latent space. This has paved the way for several works that focus on engineering or learning textual contexts to maximize CLIP's classification capabilities. In this paper, we follow this trend by learning an ensemble of prompts for image classification. We show that learning diverse and possibly shorter contexts considerably and consistently improves results compared with relying on a single trainable prompt. In particular, we report better few-shot capabilities with no additional cost at inference time. We demonstrate the capabilities of our approach on 11 different benchmarks.
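The ensemble idea can be sketched as E independent sets of learnable context tokens, each prepended to the (frozen) class-name embeddings, with class logits averaged over the ensemble. Names, shapes, and the placeholder text encoder below are our assumptions, not the paper's implementation; since the E text features can be precomputed once, the ensemble adds no cost at inference time.

```python
import torch
import torch.nn as nn

class PromptEnsemble(nn.Module):
    """Ensemble of learnable prompt contexts; logits are averaged."""
    def __init__(self, n_prompts=4, ctx_len=8, dim=512, n_classes=10):
        super().__init__()
        # One set of learnable context tokens per ensemble member.
        self.ctx = nn.Parameter(torch.randn(n_prompts, ctx_len, dim) * 0.02)
        # Frozen class-name embeddings (random placeholders here; in
        # practice they come from CLIP's token embeddings).
        self.register_buffer("cls_emb", torch.randn(n_classes, dim))

    def forward(self, image_features, text_encoder):
        logits = []
        for ctx in self.ctx:
            # Prepend this member's context to every class-name embedding.
            prompts = torch.cat(
                [ctx.unsqueeze(0).expand(len(self.cls_emb), -1, -1),
                 self.cls_emb.unsqueeze(1)], dim=1)
            class_feats = text_encoder(prompts)      # (n_classes, dim)
            logits.append(image_features @ class_feats.T)
        return torch.stack(logits).mean(dim=0)       # ensemble average

pe = PromptEnsemble()
out = pe(torch.randn(2, 512), text_encoder=lambda p: p.mean(dim=1))
print(out.shape)  # torch.Size([2, 10])
```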
ISBN (Print): 9798350301298
Due to data privacy issues, accelerating networks with tiny training sets has become a critical need in practice. Previous methods mainly adopt filter-level pruning to accelerate networks with scarce training samples. In this paper, we reveal that dropping blocks is a fundamentally superior approach in this scenario. It enjoys a higher acceleration ratio and results in better latency-accuracy performance under the few-shot setting. To choose which blocks to drop, we propose a new concept, recoverability, to measure the difficulty of recovering the compressed network. Our recoverability is efficient and effective for choosing which blocks to drop. Finally, we propose an algorithm named PRACTISE to accelerate networks using only tiny sets of training images. PRACTISE outperforms previous methods by a significant margin. For a 22% latency reduction, PRACTISE surpasses previous methods by 7% on average on ImageNet-1k. It also generalizes well, working under data-free or out-of-domain data settings. Our code is at https://***/DoctorKey/Practise.
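As a rough proxy for the recoverability idea (not PRACTISE's exact criterion), one can drop each candidate block in turn and measure how much the network's output changes on a tiny batch: blocks whose removal perturbs the output least should be the easiest to recover from. The helper below assumes identity-friendly residual blocks addressable by attribute name.

```python
import copy
import torch
import torch.nn as nn

def block_drop_scores(model, block_names, tiny_batch):
    """Score each block by the output perturbation its removal causes
    (a simple stand-in for the paper's recoverability measure)."""
    model.eval()
    with torch.no_grad():
        reference = model(tiny_batch)
    scores = {}
    for name in block_names:
        pruned = copy.deepcopy(model)
        # Replace the block with an identity mapping; assumes the block
        # preserves feature shape (e.g. a ResNet residual block).
        setattr(pruned, name, nn.Identity())
        with torch.no_grad():
            out = pruned(tiny_batch)
        scores[name] = (out - reference).pow(2).mean().item()
    return scores  # lower score = cheaper to drop
```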
ISBN (Digital): 9781665487399
ISBN (Print): 9781665487399
Multi-class cell detection (cancer or non-cancer) from a whole slide image (WSI) is an important task for pathological diagnosis. Cancer and non-cancer cells often have a similar appearance, so it is difficult even for experts to classify a cell from a patch image of an individual cell. They usually identify the cell type not only from the appearance of a single cell but also from the context of the surrounding cells. To use such information, we propose a multi-class cell-detection method that introduces a modified self-attention to aggregate the surrounding image features of both classes. Experimental results demonstrate the effectiveness of the proposed method; it achieved the best performance compared with a baseline that simply uses standard self-attention.
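The idea of borrowing context from surrounding cells maps naturally onto self-attention over per-cell features. The plain-attention sketch below illustrates the mechanism only; the paper's modification of self-attention, and all names and sizes here, are not taken from the original implementation.

```python
import torch
import torch.nn as nn

class NeighborhoodAttention(nn.Module):
    """Classify each cell using features aggregated from all detected
    cells in the same region via (standard) self-attention."""
    def __init__(self, dim=256, heads=4, n_classes=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classify = nn.Linear(dim, n_classes)   # cancer vs non-cancer

    def forward(self, cell_feats):
        # cell_feats: (B, N_cells, dim) per-cell appearance features.
        ctx, _ = self.attn(cell_feats, cell_feats, cell_feats)
        return self.classify(cell_feats + ctx)      # residual + context

m = NeighborhoodAttention()
print(m(torch.randn(1, 50, 256)).shape)  # torch.Size([1, 50, 2])
```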
ISBN (Print): 9798350301298
Fine-tuning large-scale pre-trained vision models for downstream tasks is a standard technique for achieving state-of-the-art performance on computer vision benchmarks. However, fine-tuning the whole model with millions of parameters is inefficient, as it requires storing a same-sized new model copy for each task. In this work, we propose LoRand, a method for fine-tuning large-scale vision models with a better trade-off between task performance and the number of trainable parameters. LoRand generates tiny adapter structures with low-rank synthesis while keeping the original backbone parameters fixed, resulting in high parameter sharing. To demonstrate LoRand's effectiveness, we implement extensive experiments on object detection, semantic segmentation, and instance segmentation tasks. By training only a small percentage (1% to 3%) of the pre-trained backbone parameters, LoRand achieves performance comparable to standard fine-tuning on COCO and ADE20K and outperforms fine-tuning on the low-resource PASCAL VOC dataset.
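A minimal low-rank adapter in this spirit pairs a frozen layer with a trainable down-project/up-project bottleneck, so only O(dim * rank) parameters are updated per layer. This is a generic sketch (LoRand additionally synthesizes the adapter weights from low-rank factors; the module below is our simplification with hypothetical names).

```python
import torch
import torch.nn as nn

class LowRankAdapter(nn.Module):
    """Trainable low-rank bypass added beside a frozen layer."""
    def __init__(self, dim: int, rank: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # zero-init: adapter starts as a no-op

    def forward(self, x, frozen_layer):
        return frozen_layer(x) + self.up(self.down(x))

frozen = nn.Linear(512, 512).requires_grad_(False)  # backbone stays fixed
adapter = LowRankAdapter(512)
y = adapter(torch.randn(2, 512), frozen)
print(y.shape)  # torch.Size([2, 512])
```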