ISBN (Print): 9798350365474
Accurate motion capture is useful for sports motion analysis, but it comes with high acquisition costs. Monocular or few-camera multi-view pose estimation provides an accessible but less accurate alternative, especially for sports motion, because such models are trained on datasets of daily activities. In addition, multi-view estimation remains costly due to camera calibration. It is therefore desirable to develop an accurate and cost-effective motion capture system for daily training in sports. In this paper, we propose an accurate and convenient sports motion capture system based on unsupervised fine-tuning. The proposed system estimates 3D joint positions by multi-view estimation with automatic calibration using the human body. These results are then used as pseudo-labels to fine-tune a recent high-performance monocular 3D pose estimation model. Since fine-tuning improves the model's accuracy for sports motion, users can choose multi-view or monocular estimation depending on the situation. We evaluated the system on a running motion dataset and ASPset-510, and showed that fine-tuning improved the performance of monocular estimation to the level of multi-view estimation for running motion. Our proposed system can be useful for daily motion analysis in sports.
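As a rough illustration of the fine-tuning step described above, the sketch below supervises a monocular 3D pose model with pseudo-labels obtained from multi-view estimates; the model, data loader, and MPJPE loss shown here are generic placeholders rather than the authors' implementation.

```python
# Minimal sketch of pseudo-label fine-tuning, assuming a generic PyTorch
# monocular 3D pose model and precomputed multi-view pseudo-labels.
import torch

def mpjpe(pred, target):
    # Mean per-joint position error over a (B, J, 3) batch of 3D joints.
    return (pred - target).norm(dim=-1).mean()

def finetune_on_pseudo_labels(model, loader, epochs=10, lr=1e-5):
    # `loader` yields (image, pseudo_3d) pairs, where pseudo_3d comes from
    # auto-calibrated multi-view triangulation of the same frames.
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for image, pseudo_3d in loader:
            loss = mpjpe(model(image), pseudo_3d)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```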
ISBN (Print): 9798350365474
Achieving robust generalization across diverse data domains remains a significant challenge in computer vision. This challenge is especially important in safety-critical applications, where deep-neural-network-based systems must perform reliably under environmental conditions not seen during training. Our study investigates whether the generalization capabilities of Vision Foundation Models (VFMs) and Unsupervised Domain Adaptation (UDA) methods for semantic segmentation are complementary. Results show that combining VFMs with UDA has two main benefits: (a) it allows for better UDA performance while maintaining the out-of-distribution performance of VFMs, and (b) it makes certain time-consuming UDA components redundant, enabling significant inference speedups. Specifically, with equivalent model sizes, the resulting VFM-UDA method achieves an 8.4x speedup over the prior non-VFM state of the art, while also improving performance by +1.2 mIoU in the UDA setting and by +6.1 mIoU in out-of-distribution generalization. Moreover, when we use a VFM with 3.6x more parameters, the VFM-UDA approach maintains a 3.3x speedup while improving UDA performance by +3.1 mIoU and out-of-distribution performance by +10.3 mIoU. These results underscore the significant benefits of combining VFMs with UDA, setting new standards and baselines for Unsupervised Domain Adaptation in semantic segmentation. The implementation is available at https://***/tue-mps/vfmuda.
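For intuition, here is a minimal sketch of how a VFM encoder could be combined with a simple self-training UDA objective (confidence-thresholded pseudo-labels on the target domain); `vfm_encoder`, `seg_head`, and the thresholding scheme are assumptions, not the paper's exact method.

```python
# Hypothetical VFM + self-training UDA step; modules are placeholders.
import torch
import torch.nn.functional as F

def uda_step(vfm_encoder, seg_head, src_img, src_lbl, tgt_img, thr=0.9):
    # Supervised loss on labeled source images.
    src_logits = seg_head(vfm_encoder(src_img))
    loss_src = F.cross_entropy(src_logits, src_lbl, ignore_index=255)

    # Pseudo-labels on unlabeled target images, keeping only confident pixels.
    with torch.no_grad():
        tgt_prob = F.softmax(seg_head(vfm_encoder(tgt_img)), dim=1)
        conf, pseudo = tgt_prob.max(dim=1)
        pseudo[conf < thr] = 255  # ignore low-confidence pixels

    loss_tgt = F.cross_entropy(seg_head(vfm_encoder(tgt_img)), pseudo,
                               ignore_index=255)
    return loss_src + loss_tgt
```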
ISBN (Print): 9798350365474
Efficient Image Super-Resolution (SR) aims to accelerate SR network inference by minimizing computational complexity and network parameters while preserving performance. Existing state-of-the-art efficient image SR methods are based on convolutional neural networks, and few attempts have been made with Mamba, despite its long-range modeling capability and efficient computational complexity, which have shown impressive performance on high-level vision tasks. In this paper, we propose DVMSR, a novel lightweight image SR network that incorporates Vision Mamba and a distillation strategy. DVMSR consists of three modules: a feature extraction convolution, multiple stacked Residual State Space Blocks (RSSBs), and a reconstruction module. Specifically, the deep feature extraction module is composed of several RSSBs, each of which contains several Vision Mamba Modules (ViMMs) together with a residual connection. To improve efficiency while maintaining comparable performance, we apply a distillation strategy to the Vision Mamba network: the rich representation knowledge of the teacher network serves as additional supervision for the output of the lightweight student network. Extensive experiments demonstrate that our proposed DVMSR outperforms state-of-the-art efficient SR methods in terms of model parameters while maintaining comparable PSNR and SSIM performance. The source code is available at https://***/nathan66666/***
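A hedged sketch of the output-level distillation described above: the frozen teacher's super-resolved output serves as extra supervision for the lightweight student. `teacher`, `student`, and the loss weighting are illustrative placeholders.

```python
# Sketch of teacher-supervised training for a lightweight SR student.
import torch
import torch.nn.functional as F

def distill_loss(student, teacher, lr_img, hr_img, alpha=0.5):
    sr_student = student(lr_img)
    with torch.no_grad():
        sr_teacher = teacher(lr_img)
    loss_rec = F.l1_loss(sr_student, hr_img)     # fidelity to ground truth
    loss_kd = F.l1_loss(sr_student, sr_teacher)  # mimic the teacher's output
    return loss_rec + alpha * loss_kd
```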
ISBN (Print): 9798350365474
Instance-based semantic segmentation provides detailed per-pixel scene understanding crucial for both computer vision and robotics applications. However, state-of-the-art approaches such as Mask2Former are computationally expensive, and reducing this computational burden while maintaining high accuracy remains challenging. Knowledge distillation has been regarded as a potential way to compress neural networks, but to date limited work has explored how to distill information from the output queries of a model such as Mask2Former. In this paper, we match the output queries of the student and teacher models to enable a query-based knowledge distillation scheme. We independently match the teacher and the student to the ground truth and use this to define the teacher-to-student relationship for knowledge distillation. Using this approach, we show that knowledge distillation is possible even when the student model has fewer queries and its backbone is changed from a Transformer architecture to a convolutional neural network architecture. Experiments on two challenging agricultural datasets, sweet pepper (BUP20) and sugar beet (SB20), as well as Cityscapes, demonstrate the efficacy of our approach. Across the three datasets, the student models obtain an average absolute AP improvement of 1.8 and 1.9 points for the ResNet-50 and Swin-Tiny backbones, respectively. To the best of our knowledge, this is the first work to propose knowledge distillation schemes for instance semantic segmentation with transformer-based models.
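The following sketch illustrates one plausible reading of the query-matching scheme: teacher and student queries are each Hungarian-matched to the ground truth, and queries assigned to the same ground-truth instance are paired for distillation. The cost function and KL-based loss are simplified stand-ins, not Mask2Former's actual matching costs.

```python
# Simplified query-based distillation via shared ground-truth assignment.
import torch
from scipy.optimize import linear_sum_assignment

def match_to_gt(query_logits, gt_classes):
    # Cost: negative class probability of the ground-truth label.
    prob = query_logits.softmax(-1)   # (num_queries, num_classes)
    cost = -prob[:, gt_classes]       # (num_queries, num_gt)
    q_idx, gt_idx = linear_sum_assignment(cost.detach().cpu().numpy())
    return dict(zip(gt_idx.tolist(), q_idx.tolist()))  # gt index -> query index

def query_kd_loss(student_logits, teacher_logits, gt_classes, T=2.0):
    s_map = match_to_gt(student_logits, gt_classes)
    t_map = match_to_gt(teacher_logits, gt_classes)
    losses = []
    for g in s_map:
        s_q, t_q = student_logits[s_map[g]], teacher_logits[t_map[g]]
        losses.append(torch.nn.functional.kl_div(
            (s_q / T).log_softmax(-1), (t_q / T).softmax(-1),
            reduction="sum") * T * T)
    return torch.stack(losses).mean() if losses else student_logits.sum() * 0
```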
ISBN (Print): 9798350365474
Understanding emotions and expressions is a task of interest across multiple disciplines, especially for improving user experiences. Contrary to common perception, it has been shown that emotions are not discrete entities but instead exist along a continuum. People understand discrete emotions differently due to a variety of factors, including cultural background, individual experiences, and cognitive biases. Therefore, most approaches to expression understanding, particularly those relying on discrete categories, are inherently biased. In this paper, we present a comparative in-depth analysis of two common datasets (AffectNet and EMOTIC) equipped with the components of the circumplex model of affect. Further, we propose a model for the prediction of facial expressions tailored for lightweight applications. Using a small-scale MaxViT-based model architecture, we evaluate the impact of training with discrete expression category labels alongside continuous valence and arousal labels. We show that considering valence and arousal in addition to discrete category labels significantly improves expression inference. The proposed model outperforms the current state-of-the-art models on AffectNet, establishing it as the best-performing model for inferring valence and arousal, with a 7% lower RMSE. Training scripts and trained weights to reproduce our results can be found here: https://***/wagner-niklas/CAGE_expression_inference.
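A minimal sketch of the joint objective implied above, assuming a shared backbone with a discrete-expression head and a valence/arousal head; the loss weighting is an assumption.

```python
# Joint training signal: classification over discrete expressions plus
# regression on continuous valence/arousal.
import torch
import torch.nn.functional as F

def joint_expression_loss(logits, va_pred, label, va_true, w_va=1.0):
    # logits: (B, num_expressions); va_pred/va_true: (B, 2) in [-1, 1]
    loss_cls = F.cross_entropy(logits, label)
    loss_va = F.mse_loss(va_pred, va_true)
    return loss_cls + w_va * loss_va
```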
ISBN (Print): 9798350365474
In this report, we introduce the NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of the 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state of the art in terms of both accuracy and fairness. Through the challenge, image captioning models were tested on a new evaluation dataset that includes a large variety of visual concepts from many domains. No specific training data was provided for the challenge, so entries were required to adapt to new types of image descriptions that had not been seen during training. This report includes information on the newly proposed NICE dataset, the evaluation methods, the challenge results, and technical details of the top-ranking entries. We expect that the outcomes of the challenge will contribute to the improvement of AI models on various vision-language tasks.
ISBN (Print): 9798350365474
This paper summarizes the 3rd NTIRE challenge on stereo image super-resolution (SR), with a focus on new solutions and results. The task of this challenge is to super-resolve a low-resolution stereo image pair to a high-resolution one with a magnification factor of x4 under a limited computational budget. Compared with single-image SR, the major difficulty lies in how to exploit the additional information in the other viewpoint and how to maintain stereo consistency in the results. The challenge has two tracks: one on bicubic degradation and one on real degradations. In total, 108 and 70 participants successfully registered for the two tracks, respectively. In the test phase, 14 and 13 teams submitted valid results with PSNR (RGB) scores better than the baseline. This challenge establishes a new benchmark for stereo image SR.
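For reference, a sketch of how PSNR over RGB, the ranking metric mentioned above, is typically computed; the challenge's official script may differ in border cropping and data range.

```python
# Standard PSNR between a super-resolved image and its ground truth.
import numpy as np

def psnr_rgb(sr, hr, max_val=255.0):
    # sr, hr: arrays of shape (H, W, 3) in the same value range.
    mse = np.mean((sr.astype(np.float64) - hr.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```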
ISBN (Print): 9798350365474
Medical image captioning plays an important role in modern healthcare, improving clinical report generation and aiding radiologists in detecting abnormalities and reducing misdiagnosis. Complex biases in the visual and textual data make this task especially challenging. Recent advancements in transformer-based models have significantly improved the generation of radiology reports from medical images. However, these models require substantial computational resources for training and have been observed to produce unnatural language outputs when trained solely on raw image-text pairs. Our aim is to generate more detailed, image-specific reports and to explain the reasoning behind the generated text through image-text alignment. Given the high computational demands of end-to-end model training, we introduce a two-step training methodology with the Intelligent Visual Encoder for Bridging Modalities in Report Generation (InVERGe) model. This model incorporates a lightweight transformer known as the Cross-Modal Query Fusion Layer (CMQFL), which uses the output of a frozen encoder to identify the most relevant text-grounded image embeddings. This layer bridges the gap between the encoder and the decoder, significantly reducing the workload on the decoder and enhancing the alignment between vision and language. Our experimental results on the MIMIC-CXR, Indiana University chest X-ray, and CDD-CESM breast image datasets demonstrate the effectiveness of our approach. Code: https://***/labsroy007/InVERGe
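Below is a very rough sketch of a query-fusion layer in the spirit of the CMQFL: a small set of learnable queries cross-attends to frozen image features to produce compact embeddings for the language decoder. Dimensions and module choices are assumptions, not the paper's.

```python
# Hypothetical query-fusion layer bridging a frozen visual encoder and a
# text decoder via learnable queries and cross-attention.
import torch
import torch.nn as nn

class QueryFusionLayer(nn.Module):
    def __init__(self, dim=768, num_queries=32, num_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, image_feats):
        # image_feats: (B, N_patches, dim) from a frozen visual encoder.
        q = self.queries.expand(image_feats.size(0), -1, -1)
        attn_out, _ = self.cross_attn(q, image_feats, image_feats)
        q = self.norm1(q + attn_out)
        return self.norm2(q + self.ffn(q))  # (B, num_queries, dim) for the decoder
```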
ISBN (Print): 9798350365474
Action quality assessment (AQA) applies computer vision to quantitatively assess the performance or execution of a human action. Current AQA approaches are end-to-end neural models, which lack transparency and tend to be biased because they are trained on subjective human judgements as ground truth. To address these issues, we introduce a neuro-symbolic paradigm for AQA, which uses neural networks to abstract interpretable symbols from video data and makes quality assessments by applying rules to those symbols. We take diving as the case study. We found that domain experts prefer our system and find it more informative than purely neural approaches to AQA in diving. Our system also achieves state-of-the-art action recognition and temporal segmentation, and automatically generates a detailed report that breaks the dive down into its elements and provides objective scoring with visual evidence. As verified by a group of domain experts, this report may be used to assist judges in scoring, help train judges, and provide feedback to divers. Annotated training data and code: https://***/laurenok24/NSAQA
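As a toy illustration of the neuro-symbolic idea, the sketch below assumes neural modules (not shown) emit interpretable symbols, and a transparent rule base converts them into a score; the symbols and deductions are hypothetical, not the paper's actual diving rules.

```python
# Toy rule-based scoring over hypothetical symbols extracted by neural models.
from dataclasses import dataclass

@dataclass
class DiveSymbols:
    twists_completed: int
    twists_declared: int
    splash_area: float      # normalized, 0 = no splash
    entry_angle_deg: float  # deviation from vertical

def rule_based_score(sym: DiveSymbols) -> float:
    score = 10.0
    score -= 2.0 * abs(sym.twists_declared - sym.twists_completed)
    score -= 3.0 * min(sym.splash_area, 1.0)
    score -= 0.05 * sym.entry_angle_deg
    return max(score, 0.0)

print(rule_based_score(DiveSymbols(2, 2, 0.1, 5.0)))  # prints 9.45
```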
ISBN (Print): 9798350365474
Recent vision foundation models (VFMs) have demonstrated proficiency in various tasks but require supervised fine-tuning to perform semantic segmentation effectively. Benchmarking their performance is essential for selecting current models and guiding future model development for this task, yet the lack of a standardized benchmark complicates comparisons. Therefore, the primary objective of this paper is to study how VFMs should be benchmarked for semantic segmentation. To do so, various VFMs are fine-tuned under various settings, and the impact of individual settings on the performance ranking and training time is assessed. Based on the results, the recommendation is to fine-tune the ViT-B variants of VFMs with a 16x16 patch size and a linear decoder, as these settings are representative of using a larger model, a more advanced decoder, and a smaller patch size, while reducing training time by more than a factor of 13. Using multiple datasets for training and evaluation is also recommended, as the performance ranking varies across datasets and domain shifts. Linear probing, a common practice for some VFMs, is not recommended, as it is not representative of end-to-end fine-tuning. The benchmarking setup recommended in this paper enables a performance analysis of VFMs for semantic segmentation. The findings of such an analysis reveal that pretraining with promptable segmentation is not beneficial, whereas masked image modeling (MIM) with abstract representations is crucial, even more important than the type of supervision used. The code for efficiently fine-tuning VFMs for semantic segmentation can be accessed through the project page.
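A sketch of the recommended benchmarking head, assuming ViT-B patch tokens with a 16x16 patch size and a plain linear decoder upsampled to full resolution; the backbone call is a placeholder for whichever VFM is being fine-tuned.

```python
# Linear decoder over ViT patch tokens for semantic segmentation benchmarking.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearSegDecoder(nn.Module):
    def __init__(self, embed_dim=768, num_classes=19, patch=16):
        super().__init__()
        self.patch = patch
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, patch_tokens, img_size):
        # patch_tokens: (B, H/patch * W/patch, embed_dim) from the VFM backbone.
        B, N, C = patch_tokens.shape
        h, w = img_size[0] // self.patch, img_size[1] // self.patch
        logits = self.classifier(patch_tokens)              # (B, N, num_classes)
        logits = logits.transpose(1, 2).reshape(B, -1, h, w)
        return F.interpolate(logits, size=img_size, mode="bilinear",
                             align_corners=False)
```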