ISBN (Print): 9798350353006
Efficient transfer learning (ETL) is receiving increasing attention for adapting large pre-trained language-vision models to downstream tasks with a few labeled samples. While significant progress has been made, we reveal that state-of-the-art ETL approaches exhibit strong performance only in narrowly-defined experimental setups, and with a careful adjustment of hyperparameters based on a large corpus of labeled samples. In particular, we make two interesting and surprising empirical observations. First, to outperform a simple Linear Probing baseline, these methods require optimizing their hyperparameters on each target task. Second, they typically underperform, sometimes dramatically, standard zero-shot predictions in the presence of distributional drifts. Motivated by the unrealistic assumptions made in the existing literature, i.e., access to a large validation set and case-specific grid search for optimal hyperparameters, we propose a novel approach that meets the requirements of real-world scenarios. More concretely, we introduce a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method tailored to this context. We comprehensively evaluate CLAP on a broad span of datasets and scenarios, demonstrating that it consistently outperforms state-of-the-art approaches while being a much more efficient alternative. Code is available at https://***/jusiro/CLAP.
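To make the class-adaptive balancing idea concrete, the Python sketch below trains a linear probe on frozen features while penalizing each class prototype's deviation from its zero-shot text prototype, updating one multiplier per class in an Augmented-Lagrangian-style outer loop. It is a minimal illustration, not the released CLAP code: the constraint form, the squared-penalty term, and all hyperparameter values are assumptions.

```python
import torch
import torch.nn.functional as F

def class_adaptive_probe(feats, labels, text_protos, epochs=50, inner=10,
                         lr=1e-3, rho=1.0):
    """Class-wise penalized linear probe (conceptual sketch, not the official CLAP code).

    feats:       (N, D) frozen image features (L2-normalized)
    labels:      (N,)   integer class labels
    text_protos: (C, D) zero-shot text prototypes from the vision-language model
    """
    C, D = text_protos.shape
    W = text_protos.clone().requires_grad_(True)   # initialize the probe at the zero-shot prototypes
    lam = torch.ones(C)                            # one multiplier per class
    opt = torch.optim.SGD([W], lr=lr)

    for _ in range(epochs):
        for _ in range(inner):
            logits = feats @ W.t()
            ce = F.cross_entropy(logits, labels)
            # class-wise term: keep each learned prototype close to its text prototype
            g = ((W - text_protos) ** 2).sum(dim=1)            # (C,) per-class deviation
            penalty = (lam * g).sum() + 0.5 * rho * (g ** 2).sum()
            loss = ce + penalty
            opt.zero_grad(); loss.backward(); opt.step()
        # ALM-style multiplier update on the per-class deviations
        with torch.no_grad():
            g = ((W - text_protos) ** 2).sum(dim=1)
            lam = lam + rho * g
    return W.detach()
```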
ISBN (Print): 9798350353006
The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools. However, their reliance on predefined and trained object categories limits their applicability in open scenarios. Addressing this limitation, we introduce YOLO-World, an innovative approach that enhances YOLO with open-vocabulary detection capabilities through vision-language modeling and pre-training on large-scale datasets. Specifically, we propose a new Re-parameterizable Vision-Language Path Aggregation Network (RepVL-PAN) and a region-text contrastive loss to facilitate the interaction between visual and linguistic information. Our method excels in detecting a wide range of objects in a zero-shot manner with high efficiency. On the challenging LVIS dataset, YOLO-World achieves 35.4 AP at 52.0 FPS on a V100, which outperforms many state-of-the-art methods in terms of both accuracy and speed. Furthermore, the fine-tuned YOLO-World achieves remarkable performance on several downstream tasks, including object detection and open-vocabulary instance segmentation. Code and models are available at: https://***/AILab-CVC/YOLO-World.
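The region-text contrastive objective can be illustrated with a short sketch: normalized region embeddings are matched against normalized text embeddings of the vocabulary, and a cross-entropy loss pulls each region toward its paired phrase. The function below is a generic illustration under assumed shapes and an assumed temperature, not the RepVL-PAN implementation.

```python
import torch
import torch.nn.functional as F

def region_text_contrastive_loss(region_emb, text_emb, targets, tau=0.07):
    """Contrastive alignment between region and text embeddings (illustrative sketch).

    region_emb: (R, D) embeddings of predicted regions
    text_emb:   (C, D) embeddings of the vocabulary (class names/phrases)
    targets:    (R,)   index of the matched phrase for each region
    """
    region_emb = F.normalize(region_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = region_emb @ text_emb.t() / tau   # (R, C) region-to-text similarities
    return F.cross_entropy(logits, targets)
```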
ISBN (Print): 9798350353013; 9798350353006
Semantic image synthesis, i.e., generating images from user-provided semantic label maps, is an important conditional image generation task as it allows controlling both the content and the spatial layout of generated images. Although diffusion models have pushed the state of the art in generative image modeling, the iterative nature of their inference process makes them computationally demanding. Other approaches such as GANs are more efficient as they only need a single feed-forward pass for generation, but the image quality tends to suffer when modeling large and diverse datasets. In this work, we propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images by exploiting feature backbones pre-trained for tasks such as image classification. We also introduce a new generator architecture with better context modeling that uses cross-attention to inject noise into latent variables, leading to more diverse generated images. Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes, surpassing recent diffusion models while requiring two orders of magnitude less compute for inference.
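A minimal sketch of the core discriminator idea, i.e., reusing a feature backbone pre-trained for image classification: a frozen ResNet-50 (an assumed stand-in for the paper's backbones) provides features, and only a small per-pixel head is trained to output class-plus-fake logits in the OASIS style. The paper's actual multi-scale discriminator differs.

```python
import torch
import torch.nn as nn
import torchvision

class BackboneDiscriminator(nn.Module):
    """Discriminator on top of a frozen, pre-trained feature backbone (conceptual sketch)."""
    def __init__(self, num_classes):
        super().__init__()
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V2")
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # keep the spatial feature map
        for p in self.features.parameters():
            p.requires_grad = False                                     # backbone stays frozen
        # per-pixel head: num_classes "real" classes plus one "fake" class
        self.head = nn.Conv2d(2048, num_classes + 1, kernel_size=1)

    def forward(self, image):
        with torch.no_grad():
            f = self.features(image)
        return self.head(f)   # (B, num_classes+1, H/32, W/32) per-pixel logits
```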
ISBN (Print): 9798350353013; 9798350353006
Part-aware panoptic segmentation (PPS) requires (a) that each foreground object and background region in an image is segmented and classified, and (b) that all parts within foreground objects are segmented, classified and linked to their parent object. Existing methods approach PPS by separately conducting object-level and part-level segmentation. However, their part-level predictions are not linked to individual parent objects. Therefore, their learning objective is not aligned with the PPS task objective, which harms the PPS performance. To solve this, and make more accurate PPS predictions, we propose Task-Aligned Part-aware Panoptic Segmentation (TAPPS). This method uses a set of shared queries to jointly predict (a) object-level segments, and (b) the part-level segments within those same objects. As a result, TAPPS learns to predict part-level segments that are linked to individual parent objects, aligning the learning objective with the task objective, and allowing TAPPS to leverage joint object-part representations. With experiments, we show that TAPPS considerably outperforms methods that predict objects and parts separately, and achieves new state-of-the-art PPS results.
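The shared-query idea can be sketched as a head in which each query emits an object classification, an object mask embedding, and a fixed number of part mask embeddings, so every predicted part is tied to the query of its parent object by construction. Layer sizes, the part-slot count, and the dot-product mask decoding below are assumptions for illustration, not the TAPPS architecture.

```python
import torch
import torch.nn as nn

class JointObjectPartHead(nn.Module):
    """Shared queries predicting both object masks and the part masks inside them (sketch)."""
    def __init__(self, dim=256, num_classes=100, max_parts=10):
        super().__init__()
        self.cls_head = nn.Linear(dim, num_classes + 1)           # object class (+ "no object")
        self.obj_mask_embed = nn.Linear(dim, dim)                  # object-level mask embedding
        self.part_mask_embed = nn.Linear(dim, dim * max_parts)     # one mask embedding per part slot
        self.max_parts = max_parts
        self.dim = dim

    def forward(self, queries, pixel_feats):
        # queries: (B, Q, dim); pixel_feats: (B, dim, H, W)
        B, Q, _ = queries.shape
        cls_logits = self.cls_head(queries)
        obj_emb = self.obj_mask_embed(queries)                               # (B, Q, dim)
        obj_masks = torch.einsum("bqd,bdhw->bqhw", obj_emb, pixel_feats)     # object masks
        part_emb = self.part_mask_embed(queries).view(B, Q, self.max_parts, self.dim)
        part_masks = torch.einsum("bqpd,bdhw->bqphw", part_emb, pixel_feats)
        # part masks share the query of their parent object, so the object-part link is explicit
        return cls_logits, obj_masks, part_masks
```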
ISBN (Print): 9798350353006
Advances in camera-based physiological monitoring have enabled the robust, non-contact measurement of respiration and the cardiac pulse, which are known to be indicative of the sleep stage. This has led to research into camera-based sleep monitoring as a promising alternative to "gold-standard" polysomnography, which is cumbersome, expensive to administer, and hence unsuitable for longer-term clinical studies. In this paper, we introduce SleepVST, a transformer model which enables state-of-the-art performance in camera-based sleep stage classification (sleep staging). After pre-training on contact sensor data, SleepVST outperforms existing methods for cardio-respiratory sleep staging on the SHHS and MESA datasets, achieving total Cohen's kappa scores of 0.75 and 0.77 respectively. We then show that SleepVST can be successfully transferred to cardio-respiratory waveforms extracted from video, enabling fully contact-free sleep staging. Using a video dataset of 50 nights, we achieve a total accuracy of 78.8% and a Cohen's kappa of 0.71 in four-class video-based sleep staging, setting a new state-of-the-art in the domain.
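As a rough picture of the modeling setup, the sketch below treats each 30-second epoch's cardio-respiratory features as one token and runs a standard Transformer encoder over the night to emit per-epoch stage logits. Feature dimensions, depth, and the four-stage output are assumptions; SleepVST's actual tokenization and architecture differ.

```python
import torch
import torch.nn as nn

class WaveformSleepStager(nn.Module):
    """Transformer sleep stager over cardio-respiratory waveform epochs (illustrative sketch)."""
    def __init__(self, feat_dim=64, d_model=128, n_layers=4, n_stages=4):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)   # per-epoch waveform features -> tokens
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_stages)    # e.g. Wake / Light / Deep / REM

    def forward(self, x):
        # x: (B, T, feat_dim), one token per 30-second epoch over the night
        h = self.encoder(self.embed(x))
        return self.head(h)                         # (B, T, n_stages) per-epoch stage logits
```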
ISBN (Print): 9798350353006
Polarization is a fundamental property of light that encodes abundant information regarding surface shape, material, illumination and viewing geometry. The computer vision community has witnessed a blossom of polarization-based vision applications, such as reflection removal, shape-from-polarization (SfP), transparent object segmentation and color constancy, partially due to the emergence of single-chip mono/color polarization sensors that make polarization data acquisition easier than ever. However, is polarization-based vision vulnerable to adversarial attacks? If so, is it possible to realize these adversarial attacks in the physical world, without being perceived by human eyes? In this paper, we warn the community of the vulnerability of polarization-based vision, which can be more serious than that of RGB-based vision. By adapting a commercial LCD projector, we achieve locally controllable polarizing projection, which is successfully utilized to fool state-of-the-art polarization-based vision algorithms for glass segmentation and SfP. Compared with existing physical attacks on RGB-based vision, which always suffer from a trade-off between attack efficacy and visibility to the human eye, adversarial attacks based on polarizing projection are contact-free and visually imperceptible, since naked human eyes can rarely perceive the difference between maliciously manipulated polarized light and ordinary illumination. This poses unprecedented risks to polarization-based vision, which deserve due attention and countermeasures.
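For readers unfamiliar with what such algorithms consume, the snippet below computes the standard Stokes-derived quantities, the degree and angle of linear polarization, from the four sub-pixel images of a single-chip polarization sensor; these are the signals a polarizing projector can manipulate without visibly changing intensity. These are textbook formulas, not code from the paper.

```python
import numpy as np

def stokes_from_polarizer_images(i0, i45, i90, i135):
    """Standard Stokes / DoLP / AoLP computation from 0/45/90/135-degree polarizer images."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)          # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1 ** 2 + s2 ** 2) / np.maximum(s0, 1e-8)   # degree of linear polarization
    aolp = 0.5 * np.arctan2(s2, s1)                            # angle of linear polarization
    return s0, dolp, aolp
```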
ISBN (Print): 9798350353006
Scene graph generation (SGG) aims to parse a visual scene into an intermediate graph representation for downstream reasoning tasks. Despite recent advancements, existing methods struggle to generate scene graphs with novel visual relation concepts. To address this challenge, we introduce a new open-vocabulary SGG framework based on sequence generation. Our framework leverages vision-language pre-trained models (VLM) by incorporating an image-to-graph generation paradigm. Specifically, we generate scene graph sequences via image-to-text generation with a VLM and then construct scene graphs from these sequences. By doing so, we harness the strong capabilities of VLMs for open-vocabulary SGG and seamlessly integrate explicit relational modeling to enhance vision-language tasks. Experimental results demonstrate that our design not only achieves superior performance with an open vocabulary but also enhances downstream vision-language task performance through explicit relation modeling knowledge.
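A small sketch of the sequence-to-graph step: once the vision-language model decodes a scene-graph sequence as text, the triplets are parsed back into graph edges. The "(subject, predicate, object)" textual format used here is an assumption for illustration; the paper defines its own sequence format.

```python
import re

def parse_scene_graph_sequence(seq):
    """Turn a generated scene-graph sequence into (subject, predicate, object) triplets."""
    triplets = []
    for match in re.finditer(r"\(([^,()]+),([^,()]+),([^,()]+)\)", seq):
        subj, pred, obj = (part.strip() for part in match.groups())
        triplets.append((subj, pred, obj))
    return triplets

# Example: a sequence decoded by the vision-language model
print(parse_scene_graph_sequence("(man, riding, horse) (horse, on, beach)"))
```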
ISBN (Print): 9798350353013; 9798350353006
CLIP has demonstrated marked progress in visual recognition due to its powerful pre-training on large-scale image-text pairs. However, a critical challenge remains: how to transfer image-level knowledge to pixel-level understanding tasks such as semantic segmentation. In this paper, to address this challenge, we analyze the gap between the capability of the CLIP model and the requirements of the zero-shot semantic segmentation task. Based on our analysis and observations, we propose a novel method for zero-shot semantic segmentation, dubbed CLIP-RC (CLIP with Regional Clues), bringing two main insights. On the one hand, a region-level bridge is necessary to provide fine-grained semantics. On the other hand, overfitting should be mitigated during the training stage. Benefiting from the above discoveries, CLIP-RC achieves state-of-the-art performance on various zero-shot semantic segmentation benchmarks, including PASCAL VOC, PASCAL Context, and COCO-Stuff 164K. Code will be available at https://***/Jittor/JSeg.
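As background for the region-level bridge, a generic CLIP-based zero-shot segmentation baseline simply matches dense image features against class text embeddings pixel by pixel, as sketched below. This is the kind of image-to-pixel transfer the paper refines with regional clues; the shapes and temperature are assumptions, and the code is not CLIP-RC itself.

```python
import torch
import torch.nn.functional as F

def dense_zero_shot_segmentation(pixel_feats, text_emb, tau=0.07):
    """Zero-shot segmentation by matching dense visual features to class text embeddings.

    pixel_feats: (B, D, H, W) dense features from the CLIP image encoder
    text_emb:    (C, D) CLIP text embeddings of the class names
    """
    pixel_feats = F.normalize(pixel_feats, dim=1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = torch.einsum("bdhw,cd->bchw", pixel_feats, text_emb) / tau
    return logits.argmax(dim=1)    # (B, H, W) predicted class map
```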
ISBN (Print): 9798350353006
While remarkable progress has been made on supervised skeleton-based action recognition, the challenge of zero-shot recognition remains relatively unexplored. In this paper, we argue that relying solely on aligning label-level semantics and global skeleton features is insufficient to effectively transfer locally consistent visual knowledge from seen to unseen classes. To address this limitation, we introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales. PURLS introduces a new prompting module and a novel partitioning module to generate aligned textual and visual representations across different levels. The former leverages a pre-trained GPT-3 to infer refined descriptions of the global and local (body-part-based and temporal-interval-based) movements from the original action labels. The latter employs an adaptive sampling strategy to group visual features from all body joint movements that are semantically relevant to a given description. Our approach is evaluated on various skeleton/language backbones and three large-scale datasets, i.e., NTU-RGB+D 60, NTU-RGB+D 120, and a newly curated dataset Kinetics-skeleton 200. The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains. The source code can be accessed at https://***/azzh1/PURLS.
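The adaptive grouping of joint features by a textual description can be sketched as attention-style pooling: each joint is weighted by its similarity to the description embedding, yielding one part-aware feature per description. Shapes, the softmax temperature, and the function name below are assumptions, not the PURLS implementation.

```python
import torch
import torch.nn.functional as F

def description_guided_pooling(joint_feats, desc_emb, tau=0.1):
    """Pool skeleton-joint features by their relevance to a part-level text description (sketch).

    joint_feats: (B, J, D) features of the J body joints
    desc_emb:    (D,)      embedding of one part-level description
    """
    sim = joint_feats @ desc_emb / tau                          # (B, J) joint-description relevance
    weights = F.softmax(sim, dim=-1)
    return torch.einsum("bj,bjd->bd", weights, joint_feats)     # (B, D) part-aware feature
```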
ISBN (Print): 9798350353013; 9798350353006
The recent advent of pre-trained vision transformers has unveiled a promising property: their inherent capability to group semantically related visual concepts. In this paper, we explore harnessing this emergent feature to tackle few-shot semantic segmentation, a task focused on classifying pixels in a test image given only a few labeled examples. A critical hurdle in this endeavor is preventing overfitting to the limited classes seen while training the few-shot segmentation model. As our main discovery, we find that the concept of "relationship descriptors", initially conceived for enhancing the CLIP model for zero-shot semantic segmentation, offers a potential solution. We adapt and refine this concept to craft a relationship descriptor construction tailored for few-shot semantic segmentation, extending its application across multiple layers to enhance performance. Building upon this adaptation, we propose a few-shot semantic segmentation framework that is not only easy to implement and train but also effectively scales with the number of support examples and categories. Through rigorous experimentation across various datasets, including PASCAL-5^i and COCO-20^i, we demonstrate a clear advantage of our method in diverse few-shot semantic segmentation scenarios and across a range of pre-trained vision transformer models. The findings clearly show that our method significantly outperforms current state-of-the-art techniques, highlighting the effectiveness of harnessing the emerging capabilities of vision transformers for few-shot semantic segmentation. We release the code at https://***/ZiqinZhou66/***.
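For context on what the frozen-transformer features make possible, the sketch below shows a plain prototype baseline: masked-average-pool the support features of the target class and score query pixels by cosine similarity. This is not the relationship-descriptor method itself, just a minimal few-shot segmentation baseline under assumed tensor shapes.

```python
import torch
import torch.nn.functional as F

def prototype_few_shot_segmentation(support_feats, support_masks, query_feats):
    """Generic prototype baseline on frozen ViT features (not the relationship-descriptor method).

    support_feats: (S, D, H, W) features of the support images
    support_masks: (S, 1, H, W) binary masks of the target class
    query_feats:   (B, D, H, W) features of the query images
    """
    proto = (support_feats * support_masks).sum(dim=(0, 2, 3)) / support_masks.sum().clamp(min=1)
    proto = F.normalize(proto, dim=0)
    query_feats = F.normalize(query_feats, dim=1)
    return torch.einsum("d,bdhw->bhw", proto, query_feats)    # per-pixel similarity to the class
```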