Search Results - Inner Mongolia University Library

A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, Vol. 46, No. 12, pp. 10297-10318

Authors: Chen, Chaoqi; Wu, Yushuang; Dai, Qiyuan; Zhou, Hong-Yu; Xu, Mutian; Yang, Sibei; Han, Xiaoguang; Yu, Yizhou. Affiliations: Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China; Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518172, Peoples R China; Chinese Univ Hong Kong, Future Network Intelligence Inst, Shenzhen 518172, Peoples R China; ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples R China; Shanghai Engn Res Ctr Intelligent Vis & Imagi, Shanghai 201210, Peoples R China

Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (e.g., social network analysis and recommender systems), computer vision (e.g., object detection and point cloud learning), and natural language processing (e.g., relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, i.e., 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. Based on the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of the representative approaches, as well as discussions regarding insights, limitations, and future directions.

Keywords: Task analysis; Computer vision; Three-dimensional displays; Transformers; Point cloud compression; Visualization; Videos; graph transformers; graph neural networks; medical image analysis; point clouds and meshes; vision and language

On-machine Wear Measurement for Milling Cutter Based on Machine Vision

5th International Conference on Mechatronics Technology and Intelligent Manufacturing (ICMTIM)

Authors: Yu, Jiarui; Zan, Tao; Liu, Weibo; Li, Yikun; Peng, Junxi; Lei, Qichang. Affiliations: Beijing Univ Technol, Coll Mech & Energy Engn, Beijing, Peoples R China; Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China; Beijing Univ Technol, Coll Beijing Dublin Int, Beijing, Peoples R China

ISBN: (print) 9798350363272; 9798350363265

As a key factor in the milling process, the wear status of the milling cutter has a significant impact on the machining quality of the workpiece. To detect wear on a milling machine efficiently and precisely, this paper presents the development of a milling machine wear detection system based on machine vision and digital image processing. The system including link mechanisms and industrial camera is designed for auxiliary localization and collection of on-machine images of milling cutter status. The image preprocessing method based on automatic threshold segmentation and Canny edge detection operator is proposed to identify the edge of cutter wear. The Maximum connected domains algorithm is used to screen the wear area of the milling cutter and the amount of wear is obtained based on a calibrated scaling method. Experimental results show that the proposed system is suitable for industrial use due to its rapid detection speed and strong recognition accuracy, which are desirable for engineering applications.
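
The last two processing steps named in this abstract (keeping the largest connected region, then converting pixels to physical units with a calibrated scale) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 4-connectivity choice, the use of vertical extent as the wear measure, and the `mm_per_pixel` value are all assumptions, and the binary mask is taken as already produced by the thresholding and edge-detection stage.

```python
from collections import deque

def largest_component(mask):
    """Return the pixel coordinates of the largest 4-connected region of 1s."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best

def wear_width_mm(region, mm_per_pixel):
    """Vertical extent of the region, scaled by an assumed calibration factor."""
    if not region:
        return 0.0
    ys = [y for y, _ in region]
    return (max(ys) - min(ys) + 1) * mm_per_pixel

# Toy binary mask: one isolated pixel plus one larger candidate wear region.
mask = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
region = largest_component(mask)
width = wear_width_mm(region, mm_per_pixel=0.05)  # hypothetical calibration value
```

In practice the scale factor would come from imaging a target of known size, as the abstract's "calibrated scaling method" suggests.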

Keywords: milling cutter; wear detection; machine vision; image processing; auxiliary localization mechanism

Enhancing crop yield estimation from remote sensing data: a comparative study of the Quartile Clean Image method and vision transformer

DISCOVER APPLIED SCIENCES, 2024, Vol. 6, No. 11, p. 610

Authors: Thakkar, Manan; Vanzara, Rakeshkumar. Affiliations: Ganpat Univ, Comp Sci & Informat Technol, Mehsana, Gujarat, India; Ganpat Univ, Informat Technol, Mehsana, Gujarat, India

The use of high-altitude remote sensing (RS) data from aerial and satellite platforms presents considerable challenges for agricultural monitoring and crop yield estimation due to the presence of noise caused by atmospheric interference, sensor anomalies, and outlier pixel values. This paper introduces a "Quartile Clean Image" pre-processing technique to address these data issues by analyzing quartile pixel values in local neighborhoods to identify and adjust outliers. Applying this technique to 20,946 Moderate Resolution Imaging Spectroradiometer (MODIS) images from 2002 to 2015 improved the mean peak signal-to-noise ratio (PSNR) to 40.91 dB. Integrating Quartile Clean data with Convolutional Neural Networks (CNN) models with exponential decay learning rate scheduling achieved RMSE improvements up to 5.88% for soybeans and 21.85% for corn, while Long Short-Term Memory (LSTM) models demonstrated RMSE reductions up to 11.52% for soybeans and 29.92% for corn using exponential decay learning rates. To compare the proposed method with a state-of-the-art technique, we introduce the Vision Transformer (ViT) model for crop yield estimation. The ViT model, applied to the same dataset, achieves remarkable performance without explicit pre-processing, with R² scores ranging from 0.9752 to 0.9875 for soybean and 0.9540 to 0.9888 for corn yield estimation. The RMSE values range from 7.75086 to 9.76838 for soybean and 26.25265 to 34.20382 for corn, demonstrating the ViT model's robustness. This research contributes by (1) introducing the Quartile Clean Image method for enhancing RS data quality and improving crop yield estimation accuracy, and (2) comparing it with the state-of-the-art ViT model. The results demonstrate the effectiveness of the proposed approach and highlight the potential of the ViT model for crop yield estimation, representing a valuable advancement in processing high-altitude imagery for precision agriculture applications. Novel Quartile Clean Image technique i
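
The abstract does not spell out the Quartile Clean Image rule, so the sketch below only illustrates the general idea it describes: quartiles computed over a local neighborhood are used to flag and adjust outlier pixels, and PSNR is used to quantify the effect. The 3x3 window, the 1.5 x IQR fences, the clipping step, and the names `quartile_clip` and `psnr` are assumptions made for illustration, not the published method.

```python
import numpy as np

def quartile_clip(image, k=1.5, radius=1):
    """Clip each pixel to the IQR fences of its local window (assumed rule)."""
    img = np.asarray(image, dtype=float)
    out = img.copy()
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            # Window is truncated at the image border rather than padded.
            window = img[max(r - radius, 0):r + radius + 1,
                         max(c - radius, 0):c + radius + 1]
            q1, q3 = np.percentile(window, [25, 75])
            iqr = q3 - q1
            out[r, c] = np.clip(img[r, c], q1 - k * iqr, q3 + k * iqr)
    return out

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    mse = np.mean((ref - tst) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

# Toy example: a flat 5x5 patch with a single outlier pixel in the middle.
reference = np.full((5, 5), 10.0)
noisy = reference.copy()
noisy[2, 2] = 200.0
cleaned = quartile_clip(noisy)
```

On this toy patch the outlier is pulled back to the neighborhood value, so the PSNR against the reference rises after cleaning. Real MODIS tiles would need a rule tuned to their noise characteristics.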

Keywords: Machine learning; Deep learning; Quartile clean image; MODIS; Remote sensing; Data pre-processing; Noisy feature handling; Vision transformer

Transformer-Based Visual Segmentation: A Survey

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, Vol. 46, No. 12, pp. 10138-10163

Authors: Li, Xiangtai; Ding, Henghui; Yuan, Haobo; Zhang, Wenwei; Pang, Jiangmiao; Cheng, Guangliang; Chen, Kai; Liu, Ziwei; Loy, Chen Change. Affiliations: Nanyang Technol Univ, S Lab, Singapore 639798, Singapore; Fudan Univ, Inst Big Data, Shanghai 200437, Peoples R China; Shanghai AI Lab, Shanghai 200240, Peoples R China; Univ Liverpool, Liverpool L69 7ZX, Merseyside, England

Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several specific subfields, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research.

Keywords: Image segmentation; Transformers; Surveys; Task analysis; Measurement; Object detection; Visualization; vision transformer review; dense prediction; video segmentation; scene understanding

Technique of Real-Time Detection of Technical Surface Defects

JOURNAL OF FRICTION AND WEAR, 2023, Vol. 44, No. 6, pp. 383-390

Author: Markova, L. V. Affiliation: Belarusian Natl Tech Univ, Minsk 220013, BELARUS

A technique and an algorithm of digital surface image processing are proposed to increase the validity of real-time detection of small size defects. The algorithm is implemented in the MATLAB programming environment. The technique is based on segmentation of the high-frequency component of surface texture because small size defects are especially pronounced in this component. The high-frequency component, in particular roughness, is extracted by means of wavelet transform for frequency components separation and homomorphic filtration for compensation of low-frequency distortion caused by nonuniform illumination of test surface. Segmentation of the high-frequency texture component consists in formation of a binary image using the texture descriptors derived from the gray-level co-occurrence matrix as the segmentation threshold. The proposed technique and algorithm are approved in applications to defect detection for a simulated surface, for real ground surface of hardened steel, and for surfaces of carbon fiber reinforced plastic composite. Extraction efficiency of the high-frequency component of surface texture is shown. It is found that texture descriptors, "contrast" and "energy," can be applied as segmentation thresholds for defect extraction/determination on the ground (anisotropic) surface while segmentation of an image of a plastic composite (isotropic) surface is effective just with "energy" as a threshold. The proposed technique can be applied for simultaneously real-time monitoring the surface texture and detecting the small size defect in machine vision systems during production and operation of tribosystems.
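
For readers unfamiliar with the two descriptors named in this abstract, the sketch below shows how "contrast" and "energy" are commonly computed from a gray-level co-occurrence matrix (GLCM). It is an illustrative Python version, not the author's MATLAB code: the helper names, the horizontal one-pixel offset, and the non-symmetric normalized matrix are assumptions, and energy is taken as the sum of squared GLCM entries. How the paper turns these values into a segmentation threshold is not stated in the abstract and is not reproduced here.

```python
import numpy as np

def glcm(image, levels, dy=0, dx=1):
    """Normalized co-occurrence matrix for pixel pairs separated by (dy, dx)."""
    img = np.asarray(image, dtype=int)
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r, c], img[r2, c2]] += 1
    total = counts.sum()
    return counts / total if total else counts

def contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighboring gray levels differ."""
    i, j = np.indices(p.shape)
    return float(np.sum((i - j) ** 2 * p))

def energy(p):
    """Sum of p(i, j)^2: large when the texture is uniform."""
    return float(np.sum(p ** 2))

# Toy 2-level image; each horizontal pair type occurs exactly once.
image = [[0, 0, 1],
         [1, 1, 0]]
p = glcm(image, levels=2)
c_val, e_val = contrast(p), energy(p)
```

A perfectly uniform image gives contrast 0 and energy 1, which is why the two descriptors respond in opposite directions to surface defects.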

Keywords: technical surface; surface texture; defect detection; machine vision; digital processing; image segmentation

Vision-guided robot application for metal surface edge grinding

SN APPLIED SCIENCES, 2023, Vol. 5, No. 9, p. 236

Authors: Li, Chunlei; Dun, Xiaofeng; Li, Liang; Nan, Rui. Affiliations: Baoji Univ Arts & Sci, Sch Mech Engn, Baoji 721016, Peoples R China; Shaanxi Key Lab Adv Mfg & Evaluat Robot Key Compon, Baoji 721016, Peoples R China; Nanjing ESTUN Automat Co Ltd, Applicat Proc Res Dept, Nanjing 211102, Peoples R China

The combination of machine vision and grinding robots can be visualized as a collaboration between human eyes and limbs to achieve a deep integration between external perception and execution actions. This combination will give the grinding robot more operability and flexibility, which will enable it to better realize the purpose of replacing humans with machines. In response to the demand for flexible grinding of titanium surface edges proposed by a titanium manufacturer, this paper conducts an in-depth study on the prototype system of vision-guided grinding robots and related applications. Firstly, this study analyzes the shortcomings of the existing robotic regrinding process and achieves the improvement of the regrinding process by introducing machine vision technology. Subsequently, this study further utilizes machine vision and image processing algorithms to achieve high-quality recognition and high-precision positioning of metal surface edges. Then, the D-H parameter model of the regrinding robot is established, and the planning and simulation of the regrinding trajectory is carried out using the position information of the identified regrinding edges. Finally, the simulation-validated grinding trajectory is introduced into the grinding robot, and the effectiveness of the proposed scheme is verified by actual grinding experiments.

Keywords: Grinding robot; Machine vision; Image processing; Grinding trajectory planning; Simulation modeling

Electron Density Specification in the Inner Magnetosphere From the Narrow Band Receiver Onboard DSX

RADIO SCIENCE, 2024, Vol. 59, No. 2, pp. 1-20

Authors: Su, Yi-Jiun; Carilli, John A.; Parham, J. Brent; Chu, Xiangning; Galkin, Ivan A.; Ginet, Gregory P. Affiliations: AF Res Lab, Space Vehicles Directorate, Kirtland AFB, NM 87117, USA; MIT, Lincoln Lab, Cambridge, MA, USA; Univ Colorado Boulder, Lab Atmospher & Space Phys, Boulder, CO, USA; Univ Massachusetts Lowell, Lowell, MA, USA

Electron density plays an important role in the study of wave propagation and is known to be associated with the index of refraction and radiation belt diffusion coefficients. The primary objective of our investigation is to explore the possibility of implementing an onboard signal processing algorithm to automatically obtain electron densities from the upper hybrid resonance traces of wave spectrograms for future missions. U-Net, developed for biomedical image segmentation, has been adapted as our deep learning architecture with results being compared with those extracted from a more traditional semi-automated method. As a product, electron densities and cyclotron frequencies for the entire DSX mission between 2019 and 2021 are acquired for further analysis and applications. Due to limited space measurements, a synthetic image generator based on data statistics and randomization is proposed as an initial step toward the development of a generative adversarial network in hopes of providing unlimited realistic data sources for advanced machine learning.

Plain Language Summary: Electron density is the most important fundamental plasma parameter; however, it is very difficult to directly measure in situ due to spacecraft potential. A convolutional neural network (CNN), developed to recognize features from biomedical images, has been adapted to pull out the resonance traces from space wave receivers, automatically specifying densities along satellite orbits. The comparison between computer vision based on a CNN and human vision based on a semi-automated extraction is demonstrated in this paper. With additional development and refinement, our proof-of-concept study may be matured to a level suitable for incorporation into onboard signal processing units to reduce human labor and human-in-the-loop induced operational errors during future space missions.
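
As background to how a density follows from an upper hybrid resonance trace, the standard cold-plasma relations are summarized below. This is textbook material added for orientation; the abstract does not state the exact procedure or constants used in the paper. Here $f_{uh}$, $f_{pe}$, and $f_{ce}$ denote the upper hybrid, electron plasma, and electron cyclotron frequencies, and $n_e$ the electron density.

```latex
\begin{align}
  f_{uh}^{2} &= f_{pe}^{2} + f_{ce}^{2}, \\
  f_{pe} &= \frac{1}{2\pi}\sqrt{\frac{n_e e^{2}}{\varepsilon_0 m_e}}, \\
  n_e &= \frac{4\pi^{2}\varepsilon_0 m_e}{e^{2}}\left(f_{uh}^{2} - f_{ce}^{2}\right)
  \;\approx\; \frac{f_{uh}^{2} - f_{ce}^{2}}{(8980)^{2}}\ \mathrm{cm^{-3}}
  \quad (f \text{ in Hz}).
\end{align}
```

Once the resonance trace gives $f_{uh}$ and the cyclotron frequency is known, the density follows directly, which is why the mission product pairs electron densities with cyclotron frequencies.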

Keywords: deep machine learning; electron density; satellite wave receiver; plasmasphere; image processing; space instrument software development

Non-invasive coronary artery disease identification through the iris and bio-demographic health profile features using stacking learning

IMAGE AND VISION COMPUTING, 2024, Vol. 146

Authors: Ozbilgin, Ferdi; Kurnaz, Cetin; Aydin, Ertan. Affiliations: Giresun Univ, Dept Elect & Elect Engn, Giresun, Turkiye; Ondokuz Mayis Univ, Dept Elect & Elect Engn, Samsun, Turkiye; Giresun Univ, Fac Med, Dept Cardiol, Giresun, Turkiye

This study proposes a non-invasive method for predicting Coronary Artery Disease (CAD) using iris analysis, patient data, and Machine Learning (ML), primarily with iris images. It involved 281 participants, comprising 155 CAD patients and 126 non-patient controls, with eye images and biodemographic data collected at a Cardiology outpatient clinic. The study explored three scenarios: Scenario-I focused on biodemographic data, Scenario-II on iris features, and Scenario-III combined iris images and data. Iris processing included location determination, normalization, and heart region selection, with image enhancement via adaptive histogram equalization. Feature extraction through a 2-level wavelet transform generated 272 attributes, including statistical, Gray Level Co-occurrence Matrix, and Gray Level Run Length Matrix features for eight subcomponents. Correlation-based selection identified the best features, and classification employed ML techniques and incorporated stacking learning to enhance the results. Scenario-I achieved the highest accuracy at 83.57% among all evaluated algorithms. In Scenario-II, the proposed algorithm consistently outperformed others, achieving 94.88% accuracy and strong performance in other metrics, highlighting its effectiveness. In Scenario-III, the algorithm maintained superiority with 96.07% accuracy, specificity, recall, and area under the curve values. The proposed algorithm consistently outperforms other methods across scenarios, indicating its potential for CAD diagnosis, making it a promising choice for future CAD systems. The proposed algorithm presents a novel approach to the preliminary diagnosis of CAD, eliminating the necessity for electrocardiography, echocardiography, or effort tests. It also enables seamless integration into telemedicine systems, allowing for tele-diagnosis to conduct preliminary assessments before routine clinical practice.

Keywords: Coronary artery disease; Iris; Image processing; Bio-demographic data; Stacking machine learning

A fast, lightweight deep learning vision pipeline for autonomous UAV landing support with added robustness

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, Vol. 131

Authors: Pieczynski, Dominik; Ptak, Bartosz; Kraft, Marek; Piechocki, Mateusz; Aszkowski, Przemyslaw. Affiliation: Poznan Univ Tech, Inst Robot & Machine Intelligence, Piotrowo 3A, PL-60965 Poznan, Poland

Despite massive development in aerial robotics, precise and autonomous landing in various conditions is still challenging. This process is affected by many factors, such as terrain shape, weather conditions, and the presence of obstacles. This paper describes a deep learning-accelerated image processing pipeline for accurate detection and relative pose estimation of the UAV with respect to the landing pad. Moreover, the system provides increased safety and robustness by implementing human presence detection and error estimation for both landing target detection and pose computation. Human presence and landing pad location are performed by estimating the presence probability via segmentation. This is followed by the landing pad keypoints' location regression algorithm, which, in addition to coordinates, provides the uncertainty of presence for each defined landing pad landmark. To perform the aforementioned tasks, a set of lightweight neural network models was selected and evaluated. The resulting measurements of the system's performance and accuracy are presented for each component individually and for the whole processing pipeline. The measurements are performed using onboard embedded UAV hardware and confirm that the method can provide accurate, low-latency feedback information for safe landing support.

Keywords: Unmanned aerial vehicle; Landing support; Image processing; Deep learning; On-board processing

Guest Editorial Introduction to the Special Section on Transformer Models in Vision

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, Vol. 45, No. 11, pp. 12721-12725

Authors: Khan, Salman; Khan, Fahad Shahbaz; Vaswani, Ashish; Parmar, Niki; Yang, Ming-Hsuan; Shah, Mubarak. Affiliations: Mohamed Bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates; Australian Natl Univ, Canberra 2601, Australia; Linkoping Univ, S-58183 Linkoping, Sweden; Stealth, Mt View, WY, USA; Univ Calif Merced, Merced, CA 95343, USA; Univ Cent Florida, Orlando, FL 32816, USA

Transformer models have achieved outstanding results on a variety of language tasks, such as text classification, machine translation, and question answering. This success in the field of Natural Language Processing (NLP) has sparked interest in the computer vision community to apply these models to vision and multi-modal learning tasks. However, visual data has a unique structure, requiring the need to rethink network designs and training methods. As a result, Transformer models and their variations have been successfully used for image recognition, object detection, segmentation, image super-resolution, video understanding, image generation, text-image synthesis, and visual question answering, among other applications.

Keywords: Special issues and sections; Transformers; Text categorization; Machine translation; Natural language processing
