ISBN (print): 9798350353013; 9798350353006
We introduce One-shot Open Affordance Learning (OOAL), where a model is trained with just one example per base object category but is expected to identify novel objects and affordances. While vision-language models excel at recognizing novel objects and scenes, they often struggle to understand finer levels of granularity such as affordances. To handle this issue, we conduct a comprehensive analysis of existing foundation models to explore their inherent understanding of affordances and to assess the potential for data-limited affordance learning. We then propose a vision-language framework with simple and effective designs that boost the alignment between visual features and affordance text embeddings. Experiments on two affordance segmentation benchmarks show that the proposed method outperforms state-of-the-art models with less than 1% of the full training data, and exhibits reasonable generalization capability on unseen objects and affordances. Project page: https://***/ooal.
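The abstract leaves the alignment design at a high level; a minimal, hypothetical sketch of the core idea (scoring dense visual features from a frozen encoder against affordance text embeddings, with all names our own rather than the paper's) could look like this:

```python
import torch
import torch.nn.functional as F

def affordance_maps(pixel_feats, text_embeds, temperature=0.07):
    """Score per-pixel visual features against affordance text embeddings.

    pixel_feats: (B, C, H, W) dense features from a (frozen) vision encoder
    text_embeds: (K, C) one embedding per affordance phrase (e.g. "hold", "cut")
    returns:     (B, K, H, W) per-affordance probability maps
    """
    pixel_feats = F.normalize(pixel_feats, dim=1)   # unit-norm along the channel axis
    text_embeds = F.normalize(text_embeds, dim=1)
    # cosine similarity between every pixel feature and every affordance embedding
    logits = torch.einsum("bchw,kc->bkhw", pixel_feats, text_embeds) / temperature
    return logits.softmax(dim=1)

# toy example: 2 images, 512-d features, 3 candidate affordances
maps = affordance_maps(torch.randn(2, 512, 14, 14), torch.randn(3, 512))
print(maps.shape)  # torch.Size([2, 3, 14, 14])
```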
ISBN (print): 9798350365474
Standardized lossy video coding is at the core of almost all real-world video processing pipelines. Rate control is used to enable standard codecs to adapt to different network bandwidth conditions or storage constraints. However, standard video codecs (e.g., H.264) and their rate control modules aim to minimize video distortion w.r.t. human quality assessment. We demonstrate empirically that standard-coded videos vastly deteriorate the performance of deep vision models. To overcome the deterioration of vision performance, this paper presents the first end-to-end learnable deep video codec control that considers both bandwidth constraints and downstream deep vision performance, while adhering to existing standardization. We demonstrate that our approach better preserves downstream deep vision performance than traditional standard video coding.
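The abstract does not state the training objective; as a hedged illustration of the general idea only (not the paper's actual loss or codec-control mechanism), a rate-constrained downstream-task objective might combine the task loss with a penalty for exceeding the bandwidth budget:

```python
import torch

def codec_control_loss(task_loss, est_bitrate, bitrate_budget, lam=1.0):
    """Toy objective in the spirit of rate-constrained downstream-task optimization.

    task_loss:      downstream vision loss on the decoded video (scalar tensor)
    est_bitrate:    differentiable bitrate estimate of the coded video (kbit/s)
    bitrate_budget: available bandwidth (kbit/s)
    lam:            weight of the rate-constraint penalty
    """
    # penalize only when the estimated rate exceeds the budget (hinge-style constraint)
    rate_penalty = torch.clamp(est_bitrate - bitrate_budget, min=0.0)
    return task_loss + lam * rate_penalty

loss = codec_control_loss(torch.tensor(0.8), torch.tensor(1500.0), 1200.0, lam=0.01)
print(loss)  # tensor(3.8000)
```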
ISBN (print): 9798350353006
Vision-Language Models (VLMs) have demonstrated impressive performance on zero-shot classification, i.e., classification when provided merely with a list of class names. In this paper, we tackle the case of zero-shot classification in the presence of unlabeled data. We leverage the graph structure of the unlabeled data and introduce ZLaP, a method based on label propagation (LP) that utilizes geodesic distances for classification. We tailor LP to graphs containing both text and image features and further propose an efficient method for performing inductive inference based on a dual solution and a sparsification step. We perform extensive experiments to evaluate the effectiveness of our method on 14 common datasets and show that ZLaP outperforms the latest related works. Code: https://***/vladan-stojnic/ZLaP
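ZLaP's geodesic-distance formulation, dual inductive solution, and sparsification are not reproduced here; the sketch below is only vanilla label propagation on an affinity graph (names and defaults are ours), where text-embedding nodes could act as the labeled seeds:

```python
import numpy as np

def label_propagation(sim, seed_labels, num_classes, alpha=0.9, iters=50):
    """Vanilla label propagation on a similarity graph (a simplified stand-in for ZLaP).

    sim:         (N, N) symmetric non-negative affinity matrix (e.g. thresholded cosine sims)
    seed_labels: (N,) int array, -1 for unlabeled nodes, class index otherwise
    """
    # symmetrically normalize the graph: S = D^-1/2 W D^-1/2
    deg = sim.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    S = sim * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # one-hot seed labels; zero rows for unlabeled nodes
    Y0 = np.zeros((len(seed_labels), num_classes))
    labeled = seed_labels >= 0
    Y0[labeled, seed_labels[labeled]] = 1.0

    Y = Y0.copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1 - alpha) * Y0   # diffuse labels, keep seeds anchored
    return Y.argmax(axis=1)

# toy graph: 5 nodes, the first two labeled (e.g. text-embedding nodes for the class names)
rng = np.random.default_rng(0)
sim = np.abs(rng.standard_normal((5, 5)))
sim = (sim + sim.T) / 2
np.fill_diagonal(sim, 0)
print(label_propagation(sim, np.array([0, 1, -1, -1, -1]), num_classes=2))
```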
ISBN (print): 9798350365474
The potential for zero-shot generalization in vision-language (V-L) models such as CLIP has spurred their widespread adoption in addressing numerous downstream tasks. Previous methods have employed test-time prompt tuning to adapt the model to unseen domains, but they overlooked the issue of imbalanced class distributions. In this study, we explicitly address this problem by employing class-aware prototype alignment weighted by mean class probabilities obtained for the test sample and filtered augmented views. Additionally, we ensure that the class probabilities are as accurate as possible by performing prototype discrimination using contrastive learning. The combination of alignment and discriminative loss serves as a geometric regularizer, preventing the prompt representation from collapsing onto a single class and effectively bridging the distribution gap between the source and test domains. Our method, named PromptSync, synchronizes the prompts for each test sample on both the text and vision branches of the V-L model. In empirical evaluations on the domain generalization benchmark, our method outperforms previous best methods by 2.33% in overall performance, by 1% in base-to-novel generalization, and by 2.84% in cross-dataset transfer tasks.
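The exact PromptSync losses are not given in the abstract; a toy, hypothetical version of class-aware prototype alignment weighted by mean class probabilities over augmented views (all names and simplifications ours) could look like this:

```python
import torch
import torch.nn.functional as F

def prototype_alignment_loss(view_feats, view_probs, prototypes):
    """Toy class-aware prototype alignment, weighted by mean class probabilities.

    view_feats: (V, D) features of the test sample's filtered augmented views
    view_probs: (V, K) class probabilities for those views
    prototypes: (K, D) one prototype per class (e.g. text/source-domain prototypes)
    """
    weights = view_probs.mean(dim=0)                               # (K,) mean class probability
    protos = F.normalize(prototypes, dim=1)
    mean_feat = F.normalize(view_feats.mean(dim=0, keepdim=True), dim=1)  # (1, D)
    dists = 1.0 - mean_feat @ protos.t()                           # (1, K) cosine distances
    return (weights * dists.squeeze(0)).sum()

loss = prototype_alignment_loss(torch.randn(8, 512),
                                torch.rand(8, 10).softmax(-1),
                                torch.randn(10, 512))
print(loss)
```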
ISBN (digital): 9798350353006
ISBN (print): 9798350353013; 9798350353006
This paper presents an innovative framework designed to train an image deblurring algorithm tailored to a specific camera device. This algorithm works by transforming a blurry input image, which is challenging to deblur, into another blurry image that is more amenable to deblurring. The transformation process, from one blurry state to another, leverages unpaired data consisting of sharp and blurry images captured by the target camera device. Learning this blur-to-blur transformation is inherently simpler than direct blur-to-sharp conversion, as it primarily involves modifying blur patterns rather than the intricate task of reconstructing fine image details. The efficacy of the proposed approach has been demonstrated through comprehensive experiments on various benchmarks, where it significantly outperforms state-of-the-art methods both quantitatively and qualitatively. Our code and data are available at https://***/VinAIResearch/Blur2Blur
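The paper's actual networks and losses are not described in the abstract; the sketch below only illustrates, under our own assumptions, how an unpaired blur-to-blur translator could be trained adversarially against blurry images from the target camera while a reconstruction term preserves image content:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# tiny stand-in networks; the real method uses far larger generators/discriminators
G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 1, 3, stride=2, padding=1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

src_blurry = torch.randn(2, 3, 64, 64)   # hard-to-deblur images (unknown blur)
tgt_blurry = torch.randn(2, 3, 64, 64)   # unpaired blurry images from the target camera

# discriminator step: real target-domain blur vs. translated blur
fake = G(src_blurry).detach()
real_logits, fake_logits = D(tgt_blurry), D(fake)
d_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) + \
         F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# generator step: fool D while keeping the output close to the input content
fake = G(src_blurry)
fake_logits = D(fake)
g_loss = F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits)) + \
         10.0 * F.l1_loss(fake, src_blurry)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```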
ISBN (print): 9798350353006
Conventional image sensors digitize high-resolution images at fast frame rates, producing a large amount of data that needs to be transmitted off the sensor for further processing. This is challenging for perception systems operating on edge devices, because communication is power inefficient and induces latency. Fueled by innovations in stacked image sensor fabrication, emerging sensor-processors offer programmability and processing capabilities directly on the sensor. We exploit these capabilities by developing an efficient recurrent neural network architecture, PixelRNN, that encodes spatio-temporal features on the sensor using purely binary operations. PixelRNN reduces the amount of data to be transmitted off the sensor by factors up to 256 compared to the raw sensor data while offering competitive accuracy for hand gesture recognition and lip reading tasks. We experimentally validate PixelRNN using a prototype implementation on the SCAMP-5 sensor-processor platform.
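The real PixelRNN targets the SCAMP-5 sensor-processor and uses purely binary operations; the cell below is only a loose, hypothetical illustration of binarized recurrent feature encoding with a straight-through estimator, not the paper's architecture:

```python
import torch
import torch.nn as nn

class BinaryRNNCell(nn.Module):
    """Toy recurrent cell with binarized hidden activations."""

    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.conv_x = nn.Conv2d(in_ch, hid_ch, 3, padding=1, bias=False)
        self.conv_h = nn.Conv2d(hid_ch, hid_ch, 3, padding=1, bias=False)

    def forward(self, x, h):
        pre = self.conv_x(x) + self.conv_h(h)
        # straight-through sign binarization: forward values are in {-1, +1},
        # gradients flow as if through the identity
        return torch.sign(pre).detach() + pre - pre.detach()

cell = BinaryRNNCell(1, 16)
h = torch.zeros(1, 16, 32, 32)
for t in range(4):                       # process a short frame sequence
    h = cell(torch.randn(1, 1, 32, 32), h)
print(torch.unique(h))                   # binary hidden state
```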
ISBN (print): 9798350365474
Low-light image enhancement (LLIE) plays a significant role in edge vision applications (EVA). Despite its widespread practicability, existing LLIE methods are impractical due to their high computational costs. This study proposes a framework that learns optimized low-light image enhancement, tackling the limitations of existing enhancement methods and accelerating EVA. The proposed framework incorporates a lightweight, mobile-friendly deep network. We optimized the model to INT8 precision with a post-training quantization strategy and deployed it on an edge device. The LLIE model achieves over 199 frames per second (FPS) on a low-power edge board. Additionally, we evaluated the practicability of the optimized model for accelerating vision applications in an edge environment. The experimental results illustrate that our optimized method can significantly accelerate SOTA vision algorithms in challenging low-light conditions for numerous everyday vision tasks, including object detection and image registration.
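The deployment details depend on the specific edge toolchain used in the paper; the snippet below merely illustrates the basic arithmetic behind symmetric INT8 post-training quantization of a weight tensor (our own toy example, not the paper's pipeline):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor INT8 post-training quantization of a weight array.
    Returns the int8 weights plus the scale needed to dequantize them."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(64, 3, 3, 3).astype(np.float32)   # e.g. a conv kernel
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(q.dtype, f"mean abs quantization error: {err:.5f}")
```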
ISBN (print): 9798350353006
Training a linear classifier or lightweight model on top of pretrained vision model outputs, so-called 'frozen features', leads to impressive performance on a number of downstream few-shot tasks. Currently, frozen features are not modified during training. On the other hand, when networks are trained directly on images, data augmentation is a standard recipe that improves performance with no substantial overhead. In this paper, we conduct an extensive pilot study on few-shot image classification that explores applying data augmentations in the frozen feature space, dubbed 'frozen feature augmentation (FroFA)', covering twenty augmentations in total. Our study demonstrates that adopting a deceptively simple pointwise FroFA, such as brightness, can improve few-shot performance consistently across three network architectures, three large pre-training datasets, and eight transfer datasets.
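The paper studies twenty augmentations; as a hedged toy example (our own simplification, not the paper's exact brightness transform), a pointwise FroFA can be emulated by perturbing cached frozen features before fitting a linear probe:

```python
import torch
import torch.nn as nn

def brightness_frofa(feats, max_delta=0.2):
    """Toy pointwise 'brightness' augmentation in frozen feature space:
    add one random offset per sample (the paper's variants differ in detail)."""
    delta = (torch.rand(feats.size(0), 1, device=feats.device) * 2 - 1) * max_delta
    return feats + delta

# few-shot training of a linear probe on augmented frozen features (synthetic data here)
feats = torch.randn(40, 768)                 # cached features from a frozen backbone
labels = torch.randint(0, 8, (40,))
probe = nn.Linear(768, 8)
opt = torch.optim.AdamW(probe.parameters(), lr=1e-3)
for _ in range(100):
    logits = probe(brightness_frofa(feats))  # augment features instead of images
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
```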
ISBN (print): 9798350353006
Vision-Language (VL) models have gained significant research focus, enabling remarkable advances in multimodal reasoning. These architectures typically comprise a vision encoder, a Large Language Model (LLM), and a projection module that aligns visual features with the LLM's representation space. Despite their success, a critical limitation persists: the vision encoding process remains decoupled from user queries, which often take the form of image-related questions. Consequently, the resulting visual features may not be optimally attuned to the query-specific elements of the image. To address this, we introduce QA-ViT, a Question Aware Vision Transformer approach for multimodal reasoning, which embeds question awareness directly within the vision encoder. This integration yields dynamic visual features that focus on the image aspects relevant to the posed question. QA-ViT is model-agnostic and can be incorporated efficiently into any VL architecture. Extensive experiments demonstrate the effectiveness of applying our method to various multimodal architectures, leading to consistent improvement across diverse tasks and showcasing its potential for enhancing visual and scene-text understanding.
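QA-ViT's actual integration mechanism is not detailed in the abstract; a minimal sketch of the general idea, using our own hypothetical module (projecting question tokens into the visual token space and letting self-attention fuse them), might look like:

```python
import torch
import torch.nn as nn

class QuestionAwareBlock(nn.Module):
    """Toy question-aware encoder block: projected question tokens are appended
    to the visual tokens so self-attention can condition features on the query."""

    def __init__(self, vis_dim=768, txt_dim=512, n_heads=12):
        super().__init__()
        self.txt_proj = nn.Linear(txt_dim, vis_dim)        # map text tokens into the visual space
        self.block = nn.TransformerEncoderLayer(d_model=vis_dim, nhead=n_heads,
                                                batch_first=True)

    def forward(self, vis_tokens, question_tokens):
        q = self.txt_proj(question_tokens)                  # (B, Tq, vis_dim)
        fused = self.block(torch.cat([vis_tokens, q], dim=1))
        return fused[:, :vis_tokens.size(1)]                # keep only the visual positions

layer = QuestionAwareBlock()
out = layer(torch.randn(2, 197, 768), torch.randn(2, 12, 512))
print(out.shape)  # torch.Size([2, 197, 768])
```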
ISBN (print): 9798350353006
What does learning to model relationships between strings teach Large Language Models (LLMs) about the visual world? We systematically evaluate LLMs' abilities to generate and recognize an assortment of visual concepts of increasing complexity and then demonstrate how a preliminary visual representation learning system can be trained using models of text. As language models lack the ability to consume or output visual information as pixels, we use code to represent images in our study. Although LLM-generated images do not look like natural images, results on image generation and the ability of models to correct these generated images indicate that precise modeling of strings can teach language models about numerous aspects of the visual world. Furthermore, experiments on self-supervised visual representation learning, utilizing images generated with text models, highlight the potential to train vision models capable of making semantic assessments of natural images using just LLMs.
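Since the study represents images as code, a small self-contained example (ours, not taken from the paper) shows how a visual concept can exist purely as a drawing program that an LLM could in principle generate and a renderer could then rasterize for evaluation:

```python
import matplotlib
matplotlib.use("Agg")                      # render off-screen, no display needed
import matplotlib.pyplot as plt

# A "visual concept as code": the image exists only as this drawing program,
# i.e. the kind of textual representation a language model can read and write.
def draw_smiley(path="smiley.png"):
    fig, ax = plt.subplots(figsize=(2, 2))
    ax.add_patch(plt.Circle((0.5, 0.5), 0.4, fill=False, linewidth=2))   # face outline
    ax.add_patch(plt.Circle((0.35, 0.6), 0.05))                          # left eye
    ax.add_patch(plt.Circle((0.65, 0.6), 0.05))                          # right eye
    ax.plot([0.35, 0.5, 0.65], [0.38, 0.3, 0.38], linewidth=2)           # mouth
    ax.set_xlim(0, 1); ax.set_ylim(0, 1); ax.axis("off")
    fig.savefig(path, dpi=150)
    plt.close(fig)

draw_smiley()
```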