Embedded computer vision systems are increasingly being adopted across various domains, playing a pivotal role in enabling advanced technologies such as autonomous vehicles and industrial automation. Their cost-effectiveness, compact size, and portability make them particularly well-suited for diverse implementations and operations. In real-time scenarios, these systems must process visual data with minimal latency, which is crucial for immediate decision-making. However, these solutions continue to face significant challenges related to computational efficiency, memory usage, and accuracy. This research addresses these challenges by enhancing classification methodologies, specifically Gray Level Co-occurrence Matrix (GLCM) feature extraction and Support Vector Machine (SVM) classifiers. To maintain a high level of accuracy while preserving performance, a smaller feature set is selected following a comprehensive complexity analysis and is further refined through Correlation-based Feature Selection (CFS). The proposed method achieves an overall classification accuracy of 84.76% with a feature set reduced by 79.2%, resulting in a 72.45% decrease in processing time, a 50% reduction in storage requirements, and up to a 77.8% decrease in memory demand during prediction. These improvements demonstrate the effectiveness of the proposed approach in improving the adaptability and capabilities of embedded vision systems (EVS), optimizing their performance under the constraints of real-time, resource-limited environments.
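As an illustrative sketch of this kind of pipeline (not the paper's exact configuration: the property subset, distances, and angles below are assumptions, whereas the paper selects its reduced feature set via complexity analysis and CFS), a compact GLCM feature vector feeding an SVM might look like this in Python:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

# Assumed reduced GLCM property subset; the paper derives its own via CFS.
PROPS = ["contrast", "homogeneity", "energy"]

def glcm_features(gray_img):
    """Compute a small GLCM feature vector from a uint8 grayscale patch."""
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    # One value per (property, distance, angle) combination: 3 x 1 x 2 = 6 features.
    return np.hstack([graycoprops(glcm, p).ravel() for p in PROPS])

# Usage sketch, given uint8 grayscale patches X_train and labels y_train:
#   clf = SVC(kernel="linear").fit([glcm_features(im) for im in X_train], y_train)
```

Keeping only a handful of co-occurrence properties is what shrinks both the feature store and the per-prediction memory footprint on the embedded target.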
Transformers have dominated the landscape of Natural Language Processing (NLP) and revolutionized generative AI applications. Vision Transformers (VTs) have recently become the new state-of-the-art for computer vision applications. Motivated by the success of VTs in capturing short- and long-range dependencies and their ability to handle class imbalance, this paper proposes an ensemble framework of VTs for the efficient classification of Alzheimer's Disease (AD). The framework consists of four vanilla VTs and ensembles formed using hard- and soft-voting approaches. The proposed model was tested on two popular AD datasets: OASIS and ADNI. The ADNI dataset was employed to assess the models' efficacy under imbalanced and data-scarce conditions. The ensemble of VTs saw an improvement of around 2% compared to the individual models. Furthermore, the results are compared with state-of-the-art and custom-built Convolutional Neural Network (CNN) architectures and Machine Learning (ML) models under varying data conditions. The experimental results demonstrated an overall accuracy gain of 4.14% and 4.72% over the ML and CNN algorithms, respectively. The study also identifies specific limitations and proposes avenues for future research. The code used in the study is made publicly available.
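For clarity, the two voting schemes can be sketched generically in PyTorch; this assumes each ensemble member returns per-class logits and is an illustration, not the authors' released code:

```python
import torch

def soft_vote(logits_list):
    """Soft voting: average each model's class probabilities, then argmax."""
    probs = torch.stack([torch.softmax(l, dim=-1) for l in logits_list])
    return probs.mean(dim=0).argmax(dim=-1)

def hard_vote(logits_list):
    """Hard voting: majority vote over each model's argmax predictions."""
    preds = torch.stack([l.argmax(dim=-1) for l in logits_list])
    return preds.mode(dim=0).values
```

Soft voting tends to help most when the members are individually well-calibrated, which may explain ensemble gains under the imbalanced ADNI conditions.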
Damage to reinforced concrete (RC) facilities occurs through the process of natural deterioration. Machine learning can be employed to effectively identify various damage areas and ensure safety. The performance of machine vision methods depends on image quality. In this study, five image types (Types I-V) with combinations of image deficiencies pertaining to uniform illuminance, uneven illuminance, orthoimagery, tilt angle, and image blur were used to evaluate the damage recognition capabilities of the maximum likelihood (MLH), support vector machine (SVM), and random forest (RF) methods. Type I images were orthoimages with uniform illuminance, Type II images were tilted images with uniform illuminance, Type III images were orthoimages with uneven illuminance, Type IV images were tilted images with uneven illuminance, and Type V images were tilted, blurred images with uneven illuminance. MLH was most accurate (98.6%) on Type I images, and RF was least accurate (62.8%) on Type V images. Image tilt (in Type II images) did not diminish the damage recognition capabilities of the three machine learning methods (mean accuracy = 97.2%). For tilted images with uneven illuminance (Type IV), a severe expansion effect was produced, reducing the mean accuracy to 70.1%. Type III images were recognized with a mean accuracy of 87.1%; uneven illuminance increased the error rate for three classes of damage. By testing various image types, the impact of image quality on the variability of machine learning recognition is understood, improving the prospects for automated machine learning recognition in the future.
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images, and several machine learning-based object (anomaly) detection, classification, and segmentation methods have recently been employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of these studies. This survey reviews recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications, covering the applications, techniques, evaluation metrics, and datasets, and comparing the performance of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
We introduce a high-performance computer vision-based intravenous (IV) infusion speed measurement system implemented as a camera application on an iPhone or Android phone. Our system uses You Only Look Once version 5 (YOLOv5), as it was designed for real-time object detection, making it substantially faster than two-stage algorithms such as R-CNN. In addition, YOLOv5 offers greater precision than its predecessors, making it more competitive with other object detection methods. However, YOLOv5 can be challenging to use on a mobile device because it requires substantial computational resources for image processing and prediction generation. We therefore chose the model optimization approach, as it requires the least effort to implement. Because NCNN is a high-performance neural network inference framework optimized for mobile platforms such as Android and iOS, we converted the YOLOv5 model to the NCNN format. Compared to previous research, our application showed less variability and higher consistency in infusion flow rate measurements.
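The usual route for this kind of conversion is PyTorch to ONNX to ncnn; the following is a hedged sketch of that pipeline (file names are illustrative, and in practice the YOLOv5 repository's own export.py is the more robust path because it handles model-specific export details):

```python
import torch

# Load the raw detection model (autoshape=False skips the pre/post-processing
# wrapper, which is easier to export) and trace it to ONNX.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", autoshape=False)
model.eval()
dummy = torch.zeros(1, 3, 640, 640)  # standard YOLOv5 input resolution
torch.onnx.export(model, dummy, "yolov5s.onnx", opset_version=12,
                  input_names=["images"], output_names=["output"])

# The ONNX graph is then converted with ncnn's command-line tool, e.g.:
#   onnx2ncnn yolov5s.onnx yolov5s.param yolov5s.bin
# and the resulting .param/.bin pair is bundled into the mobile app,
# where ncnn's runtime executes inference on-device.
```

Running inference through ncnn's mobile-optimized runtime, rather than a full PyTorch stack, is what makes real-time on-phone measurement feasible.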
This article studies the merits of applying log-gradient input images to convolutional neural networks (CNNs) for tinyML computer vision (CV). We show that log gradients enable: (i) aggressive 1-bit quantization of first-layer inputs, (ii) potential CNN resource reductions, (iii) inherent insensitivity to illumination changes (1.7% accuracy loss across a 2^-5 to 2^3 brightness variation vs. up to 10% for JPEG), and (iv) robustness to adversarial attacks (>10% higher accuracy than JPEG-trained models). We establish these results using the PASCAL RAW image dataset and through a combination of experiments using quantization threshold search, neural architecture search, and a fixed three-layer network. The latter reveals that training on log-gradient images leads to higher filter similarity, making the CNN more prunable. The combined benefits of aggressive first-layer quantization, CNN resource reductions, and operation without tight exposure control and image signal processing (ISP) help push tinyML CV toward its ultimate efficiency limits.
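The illumination insensitivity follows because a global brightness scale becomes an additive offset in the log domain, which spatial differencing then cancels. A minimal NumPy sketch of log-gradient extraction with sign-only (1-bit) quantization, where the epsilon and differencing scheme are assumptions rather than the paper's exact choices:

```python
import numpy as np

def log_gradient_1bit(raw, eps=1.0):
    """Log-gradient preprocessing with 1-bit quantization (illustrative sketch).
    raw: 2-D array of linear sensor intensities."""
    log_img = np.log(raw.astype(np.float32) + eps)  # brightness scale -> additive offset
    gx = np.diff(log_img, axis=1, append=log_img[:, -1:])  # differencing cancels the offset
    gy = np.diff(log_img, axis=0, append=log_img[-1:, :])
    # Aggressive 1-bit quantization: keep only the sign of each gradient.
    return (gx > 0).astype(np.uint8), (gy > 0).astype(np.uint8)
```

Because each first-layer input is a single bit, the first convolution reduces to additions and subtractions, which is where much of the resource saving comes from.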
The AdaMax algorithm provides enhanced convergence properties for stochastic optimization problems. In this paper, we present a regret bound for the AdaMax algorithm, offering a tighter and more refined analysis compared to existing bounds. This theoretical advancement provides deeper insights into the optimization landscape of machine learning algorithms. As a practical application, we consider the You Only Look Once (YOLO) framework, which has become well known as an extremely effective object segmentation tool, largely because of its extraordinary accuracy in real-time processing, making it a preferred option for many computer vision applications. Finally, we used this algorithm for image segmentation.
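For reference, the AdaMax update being analyzed (Kingma & Ba's infinity-norm variant of Adam) can be written compactly as below; this is the standard published rule, with the small denominator constant being an implementation safeguard rather than part of the original formulation:

```python
import numpy as np

def adamax_step(theta, grad, m, u, t, alpha=0.002, beta1=0.9, beta2=0.999):
    """One AdaMax update: exponential first moment plus an infinity-norm
    second moment, with bias correction on the first moment only."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    u = np.maximum(beta2 * u, np.abs(grad))     # infinity-norm second moment
    theta = theta - (alpha / (1 - beta1 ** t)) * m / (u + 1e-8)
    return theta, m, u
```

The max-based second moment is what makes the per-step update bounded, a property the regret analysis exploits.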
With the rapid advancement of wafer packaging technology, and especially the surging demand for chips, enhancing product quality and process efficiency has become increasingly crucial. This article delves into the automatic detection of pins on Ball Grid Arrays (BGAs) within wafer packaging processes. The system is engineered with a flexible software and hardware architecture to address evolving industrial requirements, facilitating swift adaptation to new processing standards and technological demands. By using a Programmable Logic Controller (PLC) to control a three-axis gantry slide combined with industrial camera imaging technology, the system achieves high efficiency and precise positioning, thereby delivering high-quality images. This article utilizes YOLOv10 image processing technology and machine learning algorithms to achieve accurate identification and classification of BGA defects. YOLOv10 is chosen for its outstanding recognition capabilities and swift processing speed, enabling the rapid and accurate identification of minor defects such as bent pins, missing pins, and solder ball defects. Through large-scale image analysis, the system has been shown to enhance detection accuracy and reduce the errors associated with manual inspection. This article primarily addresses issues in semiconductor manufacturing processes and improves the product yield rate of current production lines. By effectively integrating AI-based detection technology into semiconductor manufacturing, it replaces labor-intensive tasks, enhancing efficiency and precision.
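A hedged sketch of the detection step using the ultralytics Python API, which supports YOLOv10 models; the weights file, sample image, class names, and confidence threshold here are illustrative assumptions, not artifacts of this system:

```python
from ultralytics import YOLO

# Hypothetical weights fine-tuned on BGA pin/solder-ball imagery.
model = YOLO("bga_defects.pt")

# Run detection on one camera frame captured at a gantry position.
results = model.predict("bga_sample.jpg", conf=0.5)
for box in results[0].boxes:
    cls_name = results[0].names[int(box.cls)]       # e.g. "bent_pin"
    print(cls_name, float(box.conf), box.xyxy.tolist())  # label, score, bbox
```

In a production line, each detection would typically be mapped back to PLC coordinates so flagged packages can be diverted automatically.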
The Image Caption Generator is a popular Artificial Intelligence research tool that combines image comprehension and language generation. Creating well-structured sentences requires a thorough understanding of language in a syntactic and semantic way. Describing the substance of an image using well-structured phrases is a difficult undertaking, but it can have a significant impact in helping visually impaired people better understand image content. Image captioning has gained a lot of attention as a research subject for various computer vision and natural language processing (NLP) applications. The goal of image captioning is to create logical and accurate natural language phrases that describe an image. It relies on the captioning model to detect objects and appropriately characterise their relationships. Intuitively, it is also difficult for a machine to perceive a typical image in the same way that humans do. It does, however, provide a foundation for intelligent exploration in deep learning. In this review paper, we focus on the latest advanced techniques for image captioning. The paper highlights related methodologies and focuses on aspects that are crucial in computer recognition, as well as the numerous strategies and procedures being developed for generating image captions. We also observed that Recurrent Neural Networks (RNNs) are used in the bulk of research works (45%), followed by attention-based models (30%), transformer-based models (15%), and other methods (10%). An overview of the approaches utilised in image captioning research is discussed, and the benefits and drawbacks of these methodologies are explored, along with the most regularly used datasets and evaluation procedures in this field.
This paper presents a deep learning method for image dehazing and clarification. The main advantages of the method are its high computational speed and its use of unpaired image data for training. The method adapts the Zero-DCE approach (Li et al., IEEE Trans Pattern Anal Mach Intell 44(8):4225-4238, 2021) to the image dehazing problem, using high-order curves to adjust the dynamic range of images and achieve dehazing. Training the proposed dehazing neural network does not require paired hazy and clear datasets; instead, it utilizes a set of loss functions that assess the quality of dehazed images to drive the training process. Experiments on a large number of real-world hazy images demonstrate that the proposed network effectively removes haze while preserving details and enhancing brightness. Furthermore, on an affordable GPU-equipped laptop, the processing speed can reach 1000 FPS for images at 2K resolution, making the method highly suitable for real-time dehazing applications.
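The curve adjustment at the heart of Zero-DCE applies the quadratic mapping LE(x) = x + a·x·(1 - x) iteratively with per-pixel parameters. A minimal PyTorch sketch, under the assumption (carried over from the original Zero-DCE design, not confirmed for this paper) that the dehazing network predicts a stack of 3-channel curve-parameter maps:

```python
import torch

def apply_curves(img, alphas):
    """Iteratively apply LE(x) = x + a * x * (1 - x) to adjust dynamic range.
    img:    (B, 3, H, W) tensor with values in [0, 1]
    alphas: (B, 3 * n, H, W) per-pixel curve parameters for n iterations
    """
    for a in torch.chunk(alphas, alphas.shape[1] // 3, dim=1):
        img = img + a * img * (1.0 - img)
    return img
```

Because the mapping is a handful of elementwise operations per iteration, inference cost is dominated by the small parameter-predicting network, which is consistent with the reported 1000 FPS throughput.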