Search Results - Inner Mongolia University Library


Search condition: "Any field = IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops"

12,859 records in total; records 251-260 are shown below.


CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Javed, Sajid; Mahmood, Arif; Ganapathi, Iyyakutti Iyappan; Dharejo, Fayaz Ali; Werghi, Naoufel; Bennamoun, Mohammed

Affiliations: Khalifa Univ Sci & Technol, Dept Comp Sci, Abu Dhabi, U Arab Emirates; Khalifa Univ Sci & Technol, C2PS, Abu Dhabi, U Arab Emirates; Informat Technol Univ Punjab, Lahore, Pakistan; Univ Western Australia, Perth, WA, Australia

ISBN: (print) 9798350353006

This paper proposes Comprehensive Pathology Language Image Pre-training (CPLIP), a new unsupervised technique designed to enhance the alignment of images and text in histopathology for tasks such as classification and segmentation. This methodology enriches vision-language models by leveraging extensive data without needing ground truth annotations. CPLIP involves constructing a pathology-specific dictionary, generating textual descriptions for images using language models, and retrieving relevant images for each text snippet via a pre-trained model. The model is then fine-tuned using a many-to-many contrastive learning method to align complex interrelated concepts across both modalities. Evaluated across multiple histopathology tasks, CPLIP shows notable improvements in zero-shot learning scenarios, outperforming existing methods in both interpretability and robustness and setting a higher benchmark for the application of vision-language models in the field. To encourage further research and replication, the code for CPLIP is available on GitHub at https://***/

Keywords: Cancer Detection; Computational Pathology; Contrastive Loss; Histopathology; Many-to-Many; Vision-Language Alignment; Vision Language Modeling; Whole Slide Image; Zero-shot Learning

Unlocking Pre-trained Image Backbones for Semantic Image Synthesis

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Berrada, Tariq; Verbeek, Jakob; Couprie, Camille; Alahari, Karteek

Affiliations: Meta FAIR, Menlo Pk, CA 94025, USA; Univ Grenoble Alpes, LJK, Grenoble INP, Inria, CNRS, Grenoble, France

ISBN: (print) 9798350353013; 9798350353006

Semantic image synthesis, i.e., generating images from user-provided semantic label maps, is an important conditional image generation task as it allows control over both the content and the spatial layout of generated images. Although diffusion models have pushed the state of the art in generative image modeling, the iterative nature of their inference process makes them computationally demanding. Other approaches such as GANs are more efficient as they only need a single feed-forward pass for generation, but the image quality tends to suffer when modeling large and diverse datasets. In this work, we propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images by exploiting feature backbones pre-trained for tasks such as image classification. We also introduce a new generator architecture with better context modeling and using cross-attention to inject noise into latent variables, leading to more diverse generated images. Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes, surpassing recent diffusion models while requiring two orders of magnitude less compute for inference.

Keywords: Computer Vision; GAN; Generative Modeling; Image-to-image; Semantic Synthesis

Situational Awareness Matters in 3D Vision Language Reasoning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Man, Yunze; Gui, Liang-Yan; Wang, Yu-Xiong

Affiliations: Univ Illinois, Urbana, IL 61801, USA

ISBN: (print) 9798350353006

Being able to carry out complicated vision language reasoning tasks in 3D space represents a significant milestone in developing household robots and human-centered embodied AI. In this work, we demonstrate that a critical and distinct challenge in 3D vision language reasoning is the situational awareness, which incorporates two key components: (1) The autonomous agent grounds its self-location based on a language prompt. (2) The agent answers open-ended questions from the perspective of its calculated position. To address this challenge, we introduce SIG3D, an end-to-end Situation-Grounded model for 3D vision language reasoning. We tokenize the 3D scene into sparse voxel representation, and propose a language-grounded situation estimator, followed by a situated question answering module. Experiments on the SQA3D and ScanQA datasets show that SIG3D outperforms state-of-the-art models in situational estimation and question answering by a large margin (e.g., an enhancement of over 30% on situation accuracy). Subsequent analysis corroborates our architectural design choices, explores the distinct functions of visual and textual tokens, and highlights the importance of situational awareness in the domain of 3D question-answering. Project page is available at https://***/situation3d.

Keywords: Vision-Language; Multi-modal; 3D Reasoning

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects under Occlusion

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Khiem Vuong; Reddy, N. Dinesh; Tamburo, Robert; Narasimhan, Srinivasa G.

Affiliations: Carnegie Mellon Univ, Pittsburgh, PA 15213, USA; Amazon, Seattle, WA, USA

ISBN: (print) 9798350353006

Current methods for 2D and 3D object understanding struggle with severe occlusions in busy urban environments, partly due to the lack of large-scale labeled groundtruth annotations for learning occlusion. In this work, we introduce a novel framework for automatically generating a large, realistic dataset of dynamic objects under occlusions using freely available time-lapse imagery. By leveraging off-the-shelf 2D (bounding box, segmentation, keypoint) and 3D (pose, shape) predictions as pseudo-groundtruth, unoccluded 3D objects are identified automatically and composited into the background in a clip-art style, ensuring realistic appearances and physically accurate occlusion configurations. The resulting clip-art image with pseudo-groundtruth enables efficient training of object reconstruction methods that are robust to occlusions. Our method demonstrates significant improvements in both 2D and 3D reconstruction, particularly in scenarios with heavily occluded objects like vehicles and people in urban scenes.

Keywords: 3D from single images; Computer vision

Deep Prototypical-Parts Ease Morphological Kidney Stone Identification and are Competitively Robust to Photometric Perturbations

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Flores-Araiza, Daniel; Lopez-Tiro, Francisco; El-Beze, Jonathan; Hubert, Jacques; Gonzalez, Miguel; Ruiz, Gilberto Ochoa; Daul, Christian

Affiliations: Tecnol Monterrey, Sch Engn, Mexico City, DF, Mexico; CHU Nancy, Serv Urol Brabois, Nancy, France; Univ Lorraine, CRAN UMR 7039, Nancy, France

ISBN: (print) 9798350302493

Identifying the type of kidney stones can allow urologists to determine their cause of formation, improving the prescription of appropriate treatments to diminish future relapses. Currently, the associated ex-vivo diagnosis (known as Morpho-constitutional Analysis, MCA) is time-consuming, expensive and requires a great deal of experience, as it requires a visual analysis component that is highly operator dependent. Recently, machine learning methods have been developed for in-vivo endoscopic stone recognition. Deep Learning (DL) based methods outperform non-DL methods in terms of accuracy but lack explainability. Despite this trade-off, when it comes to making high-stakes decisions, it is important to prioritize understandable Computer-Aided Diagnosis (CADx) that suggests a course of action based on reasonable evidence, rather than a model prescribing a course of action. In this proposal, we learn Prototypical Parts (PPs) per kidney stone subtype, which are used by the DL model to generate an output classification. Using PPs in the classification task enables case-based reasoning explanations for such output, thus making the model interpretable. In addition, we modify global visual characteristics to describe their relevance to the PPs and the sensitivity of our model's performance. With this, we provide explanations with additional information at the sample, class and model levels in contrast to previous works. Although our implementation's average accuracy is lower than state-of-the-art (SOTA) non-interpretable DL models by 1.5%, our models perform 2.8% better on perturbed images with a lower standard deviation, without adversarial training. Thus, learning PPs has the potential to create more robust DL models. Code at: https://***/DanielF29/Prototipical_Parts

Keywords: Computer aided diagnosis

LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Shah, Nisarg A.; Vibashan, V. S.; Patel, Vishal M.

Affiliations: Johns Hopkins Univ, Baltimore, MD 21218, USA

ISBN: (print) 9798350353006

Referring Image Segmentation (RIS) aims to segment objects from an image based on a language description. Recent advancements have introduced transformer-based methods that leverage cross-modal dependencies, significantly enhancing performance in referring segmentation tasks. These methods are designed such that each query predicts different masks. However, RIS inherently requires a single-mask prediction, leading to a phenomenon known as Query Collapse, where all queries yield the same mask prediction. This reduces the generalization capability of the RIS model for complex or novel scenarios. To address this issue, we propose a Multi-modal Query Feature Fusion technique, characterized by two innovative designs: (1) Gaussian enhanced Multi-Modal Fusion, a novel visual grounding mechanism that enhances overall representation by extracting rich local visual information and global visual-linguistic relationships, and (2) A Dynamic Query Module that produces a diverse set of queries through a scoring network where the network selectively focuses on queries for objects referred to in the language description. Moreover, we show that including an auxiliary loss to increase the distance between mask representations of different queries further enhances performance and mitigates query collapse. Extensive experiments conducted on four benchmark datasets validate the effectiveness of our framework.

Keywords: Image segmentation; Multimodal; Transformer; Vision-Language

Learning to Predict Activity Progress by Self-Supervised Video Alignment

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Donahue, Gerard; Elhamifar, Ehsan

Affiliations: Northwestern Univ, Boston, MA 02115, USA

ISBN: (print) 9798350353006

In this paper, we tackle the problem of self-supervised video alignment and activity progress prediction using in-the-wild videos. Our proposed self-supervised representation learning method carefully addresses different action orderings, redundant actions, and background frames to generate improved video representations compared to previous methods. Our model generalizes temporal cycle-consistency learning to allow for more flexibility in determining cycle-consistent neighbors. More specifically, to handle repeated actions, we propose a multi-neighbor cycle consistency and a multi-cycle-back regression loss by finding multiple soft nearest neighbors using a Gaussian Mixture Model. To handle background and redundant frames, we introduce a context-dependent drop function in our framework, discouraging the alignment of droppable frames. On the other hand, to learn from videos of multiple activities jointly, we propose a multi-head cross-task network, allowing us to embed a video and estimate progress without knowing its activity label. Experiments on multiple datasets show that our method outperforms the state-of-the-art for video alignment and progress prediction.

Keywords: computer vision; in the wild; procedural learning; progress; progress prediction; representation learning; self-supervised; self-supervised representation learning; unconstrained; unconstrained videos; Video Alignment; video understanding

SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Carter, Jonathan F.; Jorge, Joao; Gibson, Oliver; Tarassenko, Lionel

Affiliations: Univ Oxford, Inst Biomed Engn, Oxford, England; Oxehealth Ltd, Oxford, England

ISBN: (print) 9798350353006

Advances in camera-based physiological monitoring have enabled the robust, non-contact measurement of respiration and the cardiac pulse, which are known to be indicative of the sleep stage. This has led to research into camera-based sleep monitoring as a promising alternative to "gold-standard" polysomnography, which is cumbersome, expensive to administer, and hence unsuitable for longer-term clinical studies. In this paper, we introduce SleepVST, a transformer model which enables state-of-the-art performance in camera-based sleep stage classification (sleep staging). After pre-training on contact sensor data, SleepVST outperforms existing methods for cardio-respiratory sleep staging on the SHHS and MESA datasets, achieving total Cohen's kappa scores of 0.75 and 0.77 respectively. We then show that SleepVST can be successfully transferred to cardio-respiratory waveforms extracted from video, enabling fully contact-free sleep staging. Using a video dataset of 50 nights, we achieve a total accuracy of 78.8% and a Cohen's kappa of 0.71 in four-class video-based sleep staging, setting a new state-of-the-art in the domain.

Keywords: computer vision; remote monitoring; sleep staging; transformers

ZePT: Zero-Shot Pan-Tumor Segmentation via Query-Disentangling and Self-Prompting

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Jiang, Yankai; Huang, Zhongzhen; Zhang, Rongzhao; Zhang, Xiaofan; Zhang, Shaoting

Affiliations: Shanghai AI Lab, Shanghai, Peoples R China; Shanghai Jiao Tong Univ, Shanghai, Peoples R China; SenseTime Res, Shanghai, Peoples R China

ISBN: (print) 9798350353006

The long-tailed distribution problem in medical image analysis reflects a high prevalence of common conditions and a low prevalence of rare ones, which poses a significant challenge in developing a unified model capable of identifying rare or novel tumor categories not encountered during training. In this paper, we propose a new Zero-shot Pan-Tumor segmentation framework (ZePT) based on query-disentangling and self-prompting to segment unseen tumor categories beyond the training set. ZePT disentangles the object queries into two subsets and trains them in two stages. Initially, it learns a set of fundamental queries for organ segmentation through an object-aware feature grouping strategy, which gathers organ-level visual features. Subsequently, it refines the other set of advanced queries that focus on the auto-generated visual prompts for unseen tumor segmentation. Moreover, we introduce query-knowledge alignment at the feature level to enhance each query's discriminative representation and generalizability. Extensive experiments on various tumor segmentation tasks demonstrate the performance superiority of ZePT, which surpasses the previous counterparts and evidences the promising ability for zero-shot tumor segmentation in real-world settings.

Keywords: Medical Image Segmentation; Vision-Language Model

A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Silva-Rodríguez, Julio; Hajimiri, Sina; Ben Ayed, Ismail; Dolz, Jose

Affiliations: ETS Montreal, Montreal, PQ, Canada

ISBN: (print) 9798350353006

Efficient transfer learning (ETL) is receiving increasing attention as a way to adapt large pre-trained language-vision models to downstream tasks with a few labeled samples. While significant progress has been made, we reveal that state-of-the-art ETL approaches exhibit strong performance only in narrowly-defined experimental setups, and with a careful adjustment of hyperparameters based on a large corpus of labeled samples. In particular, we make two interesting and surprising empirical observations. First, to outperform a simple Linear Probing baseline, these methods need to optimize their hyperparameters on each target task. Second, they typically underperform standard zero-shot predictions, sometimes dramatically, in the presence of distributional drifts. Motivated by the unrealistic assumptions made in the existing literature, i.e., access to a large validation set and case-specific grid-search for optimal hyperparameters, we propose a novel approach that meets the requirements of real-world scenarios. More concretely, we introduce a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method tailored to this context. We comprehensively evaluate CLAP on a broad span of datasets and scenarios, demonstrating that it consistently outperforms SoTA approaches while being a much more efficient alternative. Code available at https://***/jusiro/CLAP.
