ISBN:
(Print) 9798350377712; 9798350377705
Spatial cognition refers to the ability to acquire knowledge about one's surroundings and to use this information to identify one's location, acquire resources, and navigate back to familiar places. People with blindness and low vision (pBLV) face significant challenges with spatial cognition because it relies heavily on visual input. Without the full range of visual cues, pBLV individuals often find it difficult to form a comprehensive understanding of their environment, leading to obstacles in scene recognition and precise object localization, especially in unfamiliar environments. This limitation extends to their ability to independently detect and avoid potential tripping hazards, making navigation and interaction with their environment more challenging. In this paper, we present a pioneering wearable platform tailored to enhance the spatial cognition of pBLV through the integration of multi-modal foundation models. The proposed platform integrates a wearable camera with an audio module and leverages the advanced capabilities of vision-language foundation models (i.e., GPT-4 and GPT-4V) for the nuanced processing of visual and textual data. Specifically, we employ vision-language models to bridge the gap between visual information and the proprioception of visually impaired users, offering more intelligible guidance by aligning visual data with the natural perception of space and movement. We then apply prompt engineering to guide the large language model to act as an assistant tailored specifically to pBLV users and produce accurate answers. Another innovation in our model is the incorporation of a chain-of-thought reasoning process, which enhances the accuracy and interpretability of the model, facilitating more precise responses to complex user inquiries across diverse environmental contexts. To assess the practical impact of our proposed wearable platform, we carried out a series of real-world experiments across three tasks that are commonly
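The prompt-engineering and chain-of-thought steps described above could be sketched as follows. This is a minimal hypothetical illustration, not the authors' actual prompts: the function name `build_pblv_prompt` and all prompt wording are assumptions introduced here.

```python
def build_pblv_prompt(user_question: str, scene_description: str) -> list:
    """Assemble chat messages that steer a vision-language model to act as a
    pBLV assistant, with an explicit chain-of-thought instruction."""
    system = (
        "You are an assistant for blind and low-vision users. "
        "Describe spatial layout relative to the user's body "
        "(e.g., 'two steps ahead, slightly to your left'). "
        "Reason step by step: first list detected objects, then their "
        "positions, then any hazards, and finally give concise guidance."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user",
         "content": f"Scene: {scene_description}\nQuestion: {user_question}"},
    ]
```

The message list would then be passed to a chat-style vision-language API together with the camera frame; the step-by-step instruction is what makes the intermediate reasoning inspectable.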
ISBN:
(Print) 9781728198354
Machine drawing has recently become a hot research topic in the computer vision and robotics domains. However, decomposing a given target image from raster space into an ordered stroke sequence and reconstructing those strokes is a challenging task. In this work, we focus on the drawing task for images in various styles, where the distribution of stroke parameters differs. We propose a multi-stage, environment-model-based reinforcement learning (RL) drawing framework with a fine-grained perceptual reward that guides the agent to accurately draw both the details and the overall outline of the target image. The experiments show that the visual quality of our method slightly outperforms the SOTA method in the nature and doodle styles, while it outperforms the SOTA approaches by a large margin, with high efficiency, in the sketch style.
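One simple way to make a drawing reward "fine-grained" is to score each stroke by how much it reduces a per-patch distance to the target, rather than a single whole-image distance. The sketch below is a hypothetical stand-in for the paper's perceptual reward (the abstract does not specify the exact formulation); the function name and the plain L2-on-patches metric are assumptions.

```python
import numpy as np

def fine_grained_reward(canvas_before, canvas_after, target, patch=8):
    """Reward = mean per-patch decrease in squared L2 distance to the target.
    Arrays are (H, W, C) with H and W divisible by `patch`."""
    def patch_l2(a, b):
        h, w = a.shape[:2]
        d = (a - b) ** 2
        # split H and W into (n_patches, patch) blocks, sum inside each block
        return d.reshape(h // patch, patch, w // patch, patch, -1).sum(axis=(1, 3, 4))
    before = patch_l2(canvas_before, target)
    after = patch_l2(canvas_after, target)
    return (before - after).mean()
```

A stroke that improves any local patch earns reward even if the global error barely moves, which is the intuition behind rewarding both details and the overall outline.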
The classification of temporal signals plays a significant role in deep learning tasks. However, it poses unique challenges due to the need for specialized architectures that can effectively capture the temporal depen...
ISBN:
(Print) 9781728198354
Two obstacles, the scarcity of annotated samples and the difficulty of preserving multi-scale hierarchical representations, hinder the advancement of vision Transformer-based aerial object detection. The emergence of self-supervised learning has inspired some solutions to the first issue. However, most of them focus on single-scale features, which conflicts with solving the second issue. To bridge this gap, this paper proposes a novel pyramid masked image modeling (MIM) framework, termed PyraMIM, for self-supervised pretraining in aerial scenarios. Without manual annotation, PyraMIM establishes pyramid representations during pretraining, which can be seamlessly adapted to downstream aerial object detection for improved performance. Experimental results demonstrate the effectiveness and superiority of our method.
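A key detail in any pyramid MIM scheme is keeping the masked regions spatially consistent across scales, so that every pyramid level hides the same image content. The abstract does not describe PyraMIM's masking procedure, so the following is only an assumed sketch: sample a random mask at the coarsest resolution and upsample it to the finer levels.

```python
import numpy as np

def pyramid_masks(base_hw=(7, 7), levels=3, ratio=0.6, seed=0):
    """Boolean masks, coarsest to finest, aligned across pyramid levels."""
    rng = np.random.default_rng(seed)
    h, w = base_hw
    coarse = rng.random((h, w)) < ratio  # mask sampled at the coarsest level
    masks = [coarse]
    for _ in range(levels - 1):
        coarse = coarse.repeat(2, axis=0).repeat(2, axis=1)  # 2x upsample
        masks.append(coarse)
    return masks
```

Because each finer mask is an exact upsampling of the coarser one, the reconstruction targets at all levels refer to the same hidden regions, which is what lets a single pretext task supervise a whole feature pyramid.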
Similar to the human multi-sensory perception system, robots can also benefit from cross-modal learning. The connection between visual input and tactile perception is potentially important for automated operations...
In precision agriculture, it is crucial to have a reliable system for identifying diseases and suggesting measures to maintain crop health and enhance yield performance. Addressing the persistent challenge of accurate...
ISBN:
(Print) 9798350349405; 9798350349399
Blood cell detection is a typical small-scale object detection problem in computer vision. In this paper, we propose a CST-YOLO model for blood cell detection based on the YOLOv7 architecture and enhance it with the CNN-Swin Transformer (CST), a new attempt at CNN-Transformer fusion. We also introduce three other useful modules into our CST-YOLO to improve small-scale object detection precision: Weighted Efficient Layer Aggregation Networks (W-ELAN), Multiscale Channel Split (MCS), and Concatenate Convolutional Layers (CatConv). Experimental results show that the proposed CST-YOLO achieves 92.7%, 95.6%, and 91.1% mAP@0.5 on three blood cell datasets, respectively, outperforming state-of-the-art object detectors, e.g., RT-DETR, YOLOv5, and YOLOv7. Our code is available at https://***/mkang315/CST-YOLO.
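The Multiscale Channel Split idea can be illustrated schematically: split the channel dimension into groups, process each group at a different receptive-field size, and concatenate. The abstract gives no implementation details, so everything below is an assumed toy version that uses box filters of sizes 1, 3, and 5 as stand-ins for the module's actual per-branch convolutions.

```python
import numpy as np

def multiscale_channel_split(x, n_groups=3):
    """x: (C, H, W). Split channels into groups, smooth each group at a
    different scale (box filters as conv stand-ins), then re-concatenate."""
    groups = np.array_split(x, n_groups, axis=0)
    out = []
    for k, g in zip((1, 3, 5), groups):
        if k == 1:
            out.append(g.astype(float))  # identity branch
            continue
        pad = k // 2
        p = np.pad(g, ((0, 0), (pad, pad), (pad, pad)), mode="edge")
        s = np.zeros_like(g, dtype=float)
        for dy in range(k):          # accumulate the k x k box filter
            for dx in range(k):
                s += p[:, dy:dy + g.shape[1], dx:dx + g.shape[2]]
        out.append(s / (k * k))
    return np.concatenate(out, axis=0)
```

The output keeps the input's channel count, so the module can drop into a backbone without changing downstream layer shapes.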
Federated Learning (FL) is a pivotal new paradigm for decentralized training on heterogeneous data. Recently, fine-tuning of Vision-Language Models (VLMs) has been extended to the federated setting to improve overall p...
ISBN:
(Print) 9798350368185; 9798350368178
Known image processing software systems use filtering methods such as the Gaussian filter, the median filter, and others, which often do not perform satisfactorily on certain types of noise. This leads to a partial loss of the useful signal and a deterioration in image quality. This work is devoted to improving the quality of image filtering under various kinds of noise. A model of the additive interaction of the signal with impulse noise is proposed. A new method of least finite differences (MLFD) for image filtering has been developed, and an interactive web service implementing MLFD was created. Various filtering methods have been proposed to reduce the effect of impulse noise on images; the proposed processing relies on a priori information about the type of noise. The web service is a flexible and efficient solution for filtering digital images that is not available in well-known graphics packages, and it offers a good compromise between filtering quality and simplicity compared to complex and resource-intensive systems for graphic image processing.
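The abstract does not spell out the MLFD algorithm, but its name suggests choosing pixel values that minimize local finite differences. As a rough, assumed stand-in for that idea, the sketch below replaces each interior pixel with whichever candidate (the pixel itself or one of its four neighbours) minimizes the sum of absolute finite differences to the neighbourhood, which suppresses isolated impulse spikes.

```python
import numpy as np

def least_finite_difference_filter(img):
    """Toy impulse-noise filter in the spirit of 'least finite differences'.
    img: 2-D grayscale array; borders are left untouched."""
    out = img.astype(float).copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            nbrs = [img[y - 1, x], img[y + 1, x], img[y, x - 1], img[y, x + 1]]
            candidates = [img[y, x]] + nbrs
            # pick the candidate whose finite differences to the
            # neighbourhood are smallest
            out[y, x] = min(candidates,
                            key=lambda v: sum(abs(float(v) - float(n)) for n in nbrs))
    return out
```

An impulse pixel has large differences to all its neighbours, so it loses to any neighbour value and is replaced, while pixels already consistent with their surroundings are kept.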
ISBN:
(Print) 9798350349405; 9798350349399
Text detection is a fundamental task in computer vision that involves identifying and locating text within images or videos. It has been the subject of extensive research, with numerous approaches primarily tailored for open-scene text, but there are limited studies dedicated to practical industries such as e-commerce. E-commerce images are designed to capture human attention, and effective text detection can amplify this marketing strategy. Yet identifying text in e-commerce images poses particular challenges due to their distinct visual attributes, which set them apart from open-scene images. Therefore, this paper aims to address this gap by exploring how human attention can aid text detection in e-commerce images. The proposed model merges high-level text features with low-level and saliency features and exploits both local and semantic characteristics of image regions. Leveraging visual cues, the low-level and saliency features aid in predicting the saliency map, which is then employed to guide text detection. The proposed method achieves better localization of text, outperforming current state-of-the-art models on the benchmark e-commerce SalECI dataset. The code for this study is available at https://***/bebbieyin/SalientTextDet.
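One simple way a predicted saliency map can "guide" text detection is to modulate the text-region score map by normalized saliency. The fusion rule below is a hypothetical sketch, not the paper's actual mechanism; the function name and the blending parameter `alpha` are assumptions.

```python
import numpy as np

def saliency_weighted_text_score(text_score, saliency, alpha=0.5):
    """Modulate per-pixel text scores by visual saliency.
    alpha=0 ignores saliency; alpha=1 weights scores fully by saliency."""
    saliency = (saliency - saliency.min()) / (np.ptp(saliency) + 1e-8)
    return text_score * (1 - alpha + alpha * saliency)
```

Text in salient regions keeps its full score while text in low-saliency background is attenuated, which matches the intuition that e-commerce images place the marketing-relevant text where attention is drawn.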