ISBN (print): 9798350301298
Semantic image editing provides users with a flexible tool to modify a given image guided by a corresponding segmentation map. In this task, the features of the foreground objects and the backgrounds are quite different. However, all previous methods handle backgrounds and objects as a whole using a monolithic model. Consequently, they remain limited in processing content-rich images and suffer from generating unrealistic objects and texture-inconsistent backgrounds. To address this issue, we propose a novel paradigm, Semantic Image Editing by Disentangling Object and Background (SIEDOB), whose core idea is to explicitly leverage several heterogeneous subnetworks for objects and backgrounds. First, SIEDOB disassembles the edited input into background regions and instance-level objects. Then, we feed them into dedicated generators. Finally, all synthesized parts are embedded in their original locations, and a fusion network is used to obtain a harmonized result. Moreover, to produce high-quality edited images, we propose several innovative designs, including a Semantic-Aware Self-Propagation Module, a Boundary-Anchored Patch Discriminator, and a Style-Diversity Object Generator, and integrate them into SIEDOB. We conduct extensive experiments on the Cityscapes and ADE20K-Room datasets and show that our method remarkably outperforms the baselines, especially in synthesizing realistic and diverse objects and texture-consistent backgrounds. Code is available at https://***/WuyangLuo/SIEDOB.
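A minimal sketch of the disentangle-then-fuse pipeline the abstract describes. The module names (background generator, object generator, fusion network) and the instance-mask interface below are illustrative assumptions, not the paper's actual API:

```python
# Sketch of the SIEDOB-style idea: generate background and objects with
# dedicated subnetworks, paste objects back, then fuse. All submodule
# signatures are hypothetical placeholders for illustration only.
import torch
import torch.nn as nn

class DisentangledEditSketch(nn.Module):
    def __init__(self, bg_gen: nn.Module, obj_gen: nn.Module, fusion: nn.Module):
        super().__init__()
        self.bg_gen = bg_gen    # dedicated background generator
        self.obj_gen = obj_gen  # dedicated instance-level object generator
        self.fusion = fusion    # fusion network harmonizing the composed image

    def forward(self, image, seg_map, instance_masks):
        # image: [B, 3, H, W]; seg_map: [B, C, H, W]; instance_masks: [B, N, H, W]
        # 1) Disassemble: background = everything outside object instances.
        obj_union = instance_masks.sum(dim=1, keepdim=True).clamp(max=1.0)
        bg_mask = 1.0 - obj_union

        # 2) Generate background and each object with dedicated subnetworks.
        canvas = self.bg_gen(image * bg_mask, seg_map) * bg_mask
        for i in range(instance_masks.shape[1]):
            m = instance_masks[:, i:i + 1]
            obj = self.obj_gen(image * m, seg_map)
            # 3) Embed each synthesized object back at its original location.
            canvas = canvas * (1.0 - m) + obj * m

        # 4) Fuse into a harmonized result.
        return self.fusion(torch.cat([canvas, seg_map], dim=1))
```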
This research project explores a paradigm shift in perceptual enhancement by integrating a Unified Recognition Framework and Vision-Language Pre-Training in three-dimensional image reconstruction. Through the synergy ...
ISBN (print): 9789819752119; 9789819752126
Palmprint biometric identification in contactless scenarios is a fast-expanding topic that uses techniques from computer vision and machine learning to identify and authenticate people. In this study, we utilized a handcrafted video dataset with 60 distinct classes, each labelled as either a left or right hand, to investigate palmprint detection and matching tasks. The dataset exhibits variations in palmprint patterns, such as distance from the sensor, orientation, finger positioning, and deformation, making it an ideal candidate for developing robust and accurate palmprint recognition models. The main goal of the study is to detect palmprints in the video collection and match them to the correct class or pattern. To accomplish this task, different machine learning (ML) and deep learning (DL) models were trained and evaluated, and the accuracy of each model was compared to find the best method for contactless palmprint identification. In conclusion, our study adds to the expanding body of knowledge on biometric palmprint identification and introduces a new handcrafted video dataset that can be used to compare the effectiveness of various models.
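A minimal sketch of the kind of model-comparison protocol described above: sample frames from each palmprint video, turn them into simple feature vectors, and compare classical classifiers by accuracy. The dataset layout, the frame-sampling step, and the pixel-based features are assumptions for illustration:

```python
# Hypothetical evaluation loop over a list of (video_path, class_id) pairs
# for the 60-class palmprint dataset; feature choice is illustrative only.
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def frames_to_features(video_path, size=(64, 64), step=10):
    """Sample every `step`-th frame and flatten it into a feature vector."""
    cap = cv2.VideoCapture(video_path)
    feats, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            feats.append(cv2.resize(gray, size).flatten() / 255.0)
        idx += 1
    cap.release()
    return feats

def evaluate(videos):
    X, y = [], []
    for path, label in videos:
        for f in frames_to_features(path):
            X.append(f)
            y.append(label)
    X_tr, X_te, y_tr, y_te = train_test_split(
        np.array(X), np.array(y), test_size=0.3, stratify=y)
    for name, clf in [("SVM", SVC()), ("RandomForest", RandomForestClassifier())]:
        clf.fit(X_tr, y_tr)
        print(name, accuracy_score(y_te, clf.predict(X_te)))
```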
ISBN (print): 9798350301298
Masked image modeling (MIM) performs strongly in pre-training large Vision Transformers (ViTs). However, small models that are critical for real-world applications benefit only marginally, if at all, from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distillation targets, losses, inputs, network regularization, and sequential distillation, revealing that: 1) distilling token relations is more effective than CLS-token- and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student mismatches that of the teacher; 3) weak regularization is preferred; etc. With these findings, we achieve significant fine-tuning accuracy improvements over from-scratch MIM pre-training on ImageNet-1K classification, using the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way of developing small Vision Transformer models, that is, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://***/OliverRensu/TinyMIM.
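A minimal sketch of what "distilling token relations" (as opposed to CLS-token or raw-feature distillation) can look like: match the softmax-normalized token-to-token similarity maps of student and teacher. Pulling the teacher target from an intermediate layer and the temperature value are assumptions for illustration, not the paper's exact loss:

```python
# Token-relation distillation sketch: the student mimics the teacher's
# per-token relation distributions rather than its CLS token or features.
import torch
import torch.nn.functional as F

def token_relation_loss(student_tokens, teacher_tokens, tau=1.0):
    """student_tokens, teacher_tokens: [batch, num_tokens, dim]."""
    def relations(x):
        x = F.normalize(x, dim=-1)
        # Pairwise token similarities, turned into a distribution per token.
        return F.softmax(x @ x.transpose(-2, -1) / tau, dim=-1)

    r_student = relations(student_tokens)
    r_teacher = relations(teacher_tokens).detach()  # teacher is frozen
    # KL divergence between teacher and student relation distributions.
    return F.kl_div(r_student.clamp_min(1e-8).log(), r_teacher,
                    reduction="batchmean")
```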
Image processing is a fundamental technique in the field of low-level vision. However, with the development of deep learning over the past five years, most low-level vision methods have tended to ignore this technique. ...
Modern image captioning systems rely heavily on extracting knowledge from images to capture the concept of a static story. In this paper, we propose a textual visual context dataset for captioning, in which the publi...
ISBN (print): 9798350301298
Masked image modeling (MIM) as pre-training has been shown to be effective for numerous vision downstream tasks, but how and where MIM works remain unclear. In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, visualizations and experiments, to uncover their key representational differences. From the visualizations, we find that MIM brings a locality inductive bias to all layers of the trained models, whereas supervised models tend to focus locally at lower layers but more globally at higher layers. That may be the reason why MIM helps Vision Transformers, which have a very large receptive field, to optimize. Using MIM, the model can maintain a large diversity across attention heads in all layers, but for supervised models the diversity across attention heads almost disappears in the last three layers, and less diversity harms fine-tuning performance. From the experiments, we find that MIM models can perform significantly better than their supervised counterparts on geometric and motion tasks with weak semantics and on fine-grained classification tasks. Without bells and whistles, a standard MIM pre-trained SwinV2-L could achieve state-of-the-art performance on pose estimation (78.9 AP on COCO test-dev and 78.0 AP on CrowdPose), depth estimation (0.287 RMSE on NYUv2 and 1.966 RMSE on KITTI), and video object tracking (70.7 SUC on LaSOT). For semantic understanding datasets whose categories are sufficiently covered by the supervised pre-training, MIM models can still achieve highly competitive transfer performance. With a deeper understanding of MIM, we hope that our work can inspire new and solid research in this direction. Code will be available at https://***/zdaxie/MIM-DarkSecrets.
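A minimal sketch of the kind of diagnostic behind the locality claim above: the average attended distance per head, computed from a layer's attention map over the patch grid. The input shapes and the grid-unit distance are assumptions for illustration:

```python
# Locality probe sketch: expected attention distance per head, a small value
# indicating local attention and a large value indicating global attention.
import torch

def mean_attention_distance(attn, grid_size):
    """attn: [batch, heads, tokens, tokens] over a grid_size x grid_size patch
    grid (any CLS token removed). Returns one locality score per head,
    measured in patch units."""
    coords = torch.stack(torch.meshgrid(
        torch.arange(grid_size), torch.arange(grid_size), indexing="ij"),
        dim=-1).reshape(-1, 2).float()          # [tokens, 2] patch coordinates
    dist = torch.cdist(coords, coords)          # [tokens, tokens] distances
    # Expected distance under each query's attention distribution, averaged
    # over queries and batch -> [heads].
    return (attn * dist).sum(-1).mean(dim=(0, 2))
```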
ISBN (print): 9798350301298
DreamFusion [31] has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF) [23], achieving remarkable text-to-3D synthesis results. However, the method has two inherent limitations: (a) extremely slow optimization of NeRF and (b) low-resolution image-space supervision on NeRF, leading to low-quality 3D models with a long processing time. In this paper, we address these limitations by utilizing a two-stage optimization framework. First, we obtain a coarse model using a low-resolution diffusion prior and accelerate optimization with a sparse 3D hash grid structure. Using the coarse representation as the initialization, we further optimize a textured 3D mesh model with an efficient differentiable renderer interacting with a high-resolution latent diffusion model. Our method, dubbed Magic3D, can create high-quality 3D mesh models in 40 minutes, which is 2x faster than DreamFusion (reportedly taking 1.5 hours on average), while also achieving higher resolution. User studies show that 61.7% of raters prefer our approach over DreamFusion. Together with the image-conditioned generation capabilities, we provide users with new ways to control 3D synthesis, opening up new avenues to various creative applications.
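A schematic sketch of the two-stage optimization described above. Everything passed in (the coarse scene model, the mesh model, the low- and high-resolution score-distillation losses, the renderers, and the initialization helper) is a hypothetical placeholder; only the control flow is shown:

```python
# Coarse-to-fine text-to-3D sketch: stage 1 optimizes a coarse scene under a
# low-res diffusion prior; stage 2 refines a textured mesh under a high-res
# latent diffusion prior via a differentiable renderer. Placeholders only.
import torch

def two_stage_text_to_3d(coarse_model, mesh_model,
                         sds_loss_lowres, sds_loss_highres,
                         render_coarse, render_mesh,
                         steps_coarse=5000, steps_fine=3000, lr=1e-2):
    # Stage 1: coarse scene (e.g. hash-grid-backed) under the low-res prior.
    opt = torch.optim.Adam(coarse_model.parameters(), lr=lr)
    for _ in range(steps_coarse):
        image = render_coarse(coarse_model)   # low-resolution render
        loss = sds_loss_lowres(image)         # score-distillation-style loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Stage 2: textured mesh initialized from the coarse result, optimized
    # against the high-resolution prior through a differentiable renderer.
    mesh_model.initialize_from(coarse_model)  # assumed helper
    opt = torch.optim.Adam(mesh_model.parameters(), lr=lr)
    for _ in range(steps_fine):
        image = render_mesh(mesh_model)       # high-resolution render
        loss = sds_loss_highres(image)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return mesh_model
```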
ISBN (print): 9789819784899; 9789819784905
Precipitation is crucial for the future development of mankind. However, accurately predicting it remains a formidable challenge. Due to the low efficiency of traditional Numerical Weather Prediction (NWP), deep learning-based methods are increasingly preferred. However, most deep learning methods focus on predicting the spatio-temporal behavior of the single precipitation variable, often ignoring the interplay between other meteorological factors and precipitation. Furthermore, they tend to underestimate precipitation intensity. Therefore, this paper proposes a new neural network model called the Spatio-temporal Perceiving Network Based Vision Transformer (ST-ViT), which integrates spatio-temporal and channel perception mechanisms to model the relationship between precipitation and other meteorological elements. Additionally, an adaptive differential loss function is proposed to accurately capture precipitation intensity. We evaluated ST-ViT on ERA5 data from Southeast Asia for 6-hour prediction. The quantitative results demonstrate that our method achieves superior accuracy and lower errors compared to other deep learning methods. In particular, it shows great potential to alleviate the underestimation of precipitation in the reconstructed prediction images.
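A minimal sketch of an intensity-weighted loss in the spirit of the adaptive differential loss mentioned above: pixels with heavier rainfall receive larger weights, so underestimating them is penalized more. The thresholds and weight values below are illustrative assumptions, not the paper's formulation:

```python
# Intensity-weighted regression loss sketch for precipitation fields.
import torch

def intensity_weighted_loss(pred, target,
                            thresholds=(2.0, 5.0, 10.0, 30.0),
                            weights=(1.0, 2.0, 5.0, 10.0, 30.0)):
    """pred, target: precipitation fields of identical shape (e.g. mm / 6 h)."""
    w = torch.full_like(target, weights[0])
    # Assign a larger weight to pixels exceeding each rainfall threshold.
    for t, wt in zip(thresholds, weights[1:]):
        w = torch.where(target >= t, torch.full_like(target, wt), w)
    return (w * (pred - target) ** 2).mean()
```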
ISBN (print): 9789819984688; 9789819984695
Video face recognition (VFR) has gained significant attention as a promising field combining computer vision and artificial intelligence, revolutionizing identity authentication and verification. Unlike traditional image-based methods, VFR leverages the temporal dimension of video footage to extract comprehensive and accurate facial information. However, VFR heavily relies on robust computing power and advanced noise processing capabilities to ensure optimal recognition performance. This paper introduces a novel length-adaptive VFR framework based on a recurrent-mechanism-driven Vision Transformer, termed TempoViT. TempoViT efficiently captures spatial and temporal information from face videos, enabling accurate and reliable face recognition while mitigating the high GPU memory requirements associated with video processing. By reusing hidden states from previous frames, the framework establishes recurrent links between frames, allowing the modeling of long-term dependencies. Experimental results validate the effectiveness of TempoViT, demonstrating its state-of-the-art performance in video face recognition tasks on benchmark datasets including iQIYI-ViD, YTF, IJB-C, and Honda/UCSD.
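A minimal sketch of the recurrence idea described above: hidden states from the previous frame are cached without gradients and prepended as extra keys/values when encoding the current frame, linking frames into long-term dependencies with bounded memory. The layer sizes and single-block structure are assumptions for illustration, not TempoViT's actual architecture:

```python
# Recurrent frame-to-frame Transformer block sketch (Transformer-XL-style
# hidden-state reuse across video frames).
import torch
import torch.nn as nn

class RecurrentFrameBlock(nn.Module):
    def __init__(self, dim=384, heads=6):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, tokens, memory=None):
        # tokens: [batch, num_patches, dim] for the current frame;
        # memory: cached tokens from the previous frame (or None).
        context = tokens if memory is None else torch.cat([memory, tokens], dim=1)
        q, kv = self.norm1(tokens), self.norm1(context)
        tokens = tokens + self.attn(q, kv, kv, need_weights=False)[0]
        tokens = tokens + self.mlp(self.norm2(tokens))
        # Detach the cache so gradients do not flow across frames.
        return tokens, tokens.detach()
```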