The AdaMax algorithm provides enhanced convergence properties for stochastic optimization problems. In this paper, we present a regret bound for the AdaMax algorithm, offering a tighter and more refined analysis compared to existing bounds. This theoretical advancement provides deeper insights into the optimization landscape of machine learning algorithms. Specifically, the You Only Look Once (YOLO) framework has become well known as an extremely effective object segmentation tool, largely because of its extraordinary accuracy in real-time processing, which makes it a preferred option for many computer vision applications. Finally, we apply the algorithm to image segmentation.
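For reference, the AdaMax update rule (the variant of Adam based on the infinity norm) that such regret analyses concern can be sketched as follows. This is a minimal list-based sketch with standard default hyperparameters, not the paper's implementation; the function name and parameter layout are illustrative.

```python
def adamax_step(theta, grad, m, u, t, alpha=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaMax update for a list of scalar parameters.

    m: exponentially decayed first moment of the gradient
    u: infinity-norm-based second moment (a running max, not a sum of squares)
    t: 1-based step counter, used for first-moment bias correction
    """
    new_theta, new_m, new_u = [], [], []
    for th, g, mi, ui in zip(theta, grad, m, u):
        mi = beta1 * mi + (1 - beta1) * g      # update biased first moment
        ui = max(beta2 * ui, abs(g))           # update the infinity-norm term
        step = (alpha / (1 - beta1 ** t)) * mi / (ui + eps)
        new_theta.append(th - step)
        new_m.append(mi)
        new_u.append(ui)
    return new_theta, new_m, new_u
```

A few hundred iterations on a toy objective such as f(x) = x^2 (gradient 2x) steadily shrink the parameter, which is the behavior the regret bound quantifies.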
Artificial neural networks have been one of science's most influential and essential branches in the past decades. Neural networks have found applications in various fields, including medical and pharmaceutical services, voice and speech recognition, computer vision, natural language processing, and video and image processing. Neural networks have many layers and consume a great deal of energy. Approximate computing is a promising way to reduce energy consumption in applications that can tolerate some loss of accuracy. This paper proposes an effective method to prevent accuracy loss after applying approximate computing methods to CNNs. The method exploits the k-means clustering algorithm to label pixels in the first convolutional layer. Then, using one of the existing pruning methods, different pruning amounts are applied to all layers. The experimental results on three CNNs and four different datasets show that the accuracy of the proposed method improves significantly (by 17%) compared to the baseline network.
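The pixel-labeling step described above can be illustrated with a tiny one-dimensional k-means over pixel intensities. This is a self-contained sketch of the clustering idea only, not the paper's code; the function name, fixed iteration count, and 1-D intensity input are assumptions.

```python
import random

def kmeans_labels(values, k, iters=20, seed=0):
    """Cluster scalar pixel intensities into k groups and return a label per pixel.

    Classic Lloyd iteration: assign each value to the nearest center,
    then move each center to the mean of its assigned values.
    """
    rng = random.Random(seed)
    centers = rng.sample(values, k)            # pick k distinct initial centers
    labels = [0] * len(values)
    for _ in range(iters):
        # assignment step: nearest center by squared distance
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values]
        # update step: recompute each center as the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers
```

On two well-separated intensity groups (e.g., near 0 and near 100 with k = 2), the labels split cleanly along the gap, which is what makes the labels usable as cluster tags for the first convolutional layer.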
With the rapid advancement in wafer packaging technology, and especially the surging demand for chips, enhancing product quality and process efficiency has become increasingly crucial. This article delves into the automatic detection of pins on Ball Grid Arrays (BGA) within wafer packaging processes. The system is engineered with a flexible software and hardware architecture to address evolving industrial requirements, facilitating swift adaptation to new processing standards and technological demands. By utilizing a Programmable Logic Controller (PLC) to control a three-axis gantry slide combined with industrial camera imaging technology, the system achieves high efficiency and precise positioning, thereby delivering high-quality images. This article utilizes YOLOv10 image processing technology and machine learning algorithms to achieve accurate identification and classification of BGA defects. YOLOv10 is chosen for its outstanding recognition capabilities and swift processing speed, enabling the rapid and accurate identification of minor defects such as bent pins, missing pins, and solder ball defects. Through large-scale image analysis, the system has been shown to enhance detection accuracy and reduce the errors of manual detection. This article primarily addresses issues in semiconductor manufacturing processes and improves the product yield rate in current production lines. By effectively integrating AI-based detection technology into semiconductor manufacturing, it replaces labor-intensive tasks, enhancing efficiency and precision.
Image caption generation is a popular Artificial Intelligence research topic that combines image comprehension and language generation. Creating well-structured sentences requires a thorough, systematic, and semantic understanding of language. Describing the substance of an image in well-structured phrases is a difficult undertaking, but it can have a significant impact by helping visually impaired people better understand the content of images. Image captioning has gained a lot of attention as a study subject for various computer vision and natural language processing (NLP) applications. The goal of image captioning is to create logical and accurate natural language phrases that describe an image. It relies on the caption model to detect objects and appropriately characterise their relationships. Intuitively, it is also difficult for a machine to see a typical image in the same way that humans do. It does, however, provide the foundation for intelligent exploration in deep learning. In this review paper, we focus on the latest advanced techniques for image captioning. The paper highlights related methodologies, focuses on aspects that are crucial to computer recognition, and surveys the numerous strategies and procedures being developed for image captioning. It was also observed that recurrent neural networks (RNNs) are used in the bulk of research works (45%), followed by attention-based models (30%), transformer-based models (15%), and other methods (10%). An overview of the approaches utilised in image captioning research is discussed in this paper. Furthermore, the benefits and drawbacks of these methodologies are explored, as are the most regularly used datasets and evaluation procedures in this field.
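The attention-based captioning models counted above share one core operation: the decoder scores each image region, softmaxes the scores into weights, and takes a weighted sum of region features as the context for the next word. A minimal sketch of that step, with illustrative names and toy 2-D features:

```python
import math

def attend(region_feats, scores):
    """Soft attention over image regions.

    region_feats: list of feature vectors, one per image region
    scores:       one relevance score per region (from the decoder state)
    Returns the softmax weights and the weighted-sum context vector.
    """
    mx = max(scores)                            # subtract max for stability
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(region_feats[0])
    context = [sum(w * f[d] for w, f in zip(weights, region_feats))
               for d in range(dim)]
    return weights, context
```

With equal scores the context is the plain average of region features; as one score grows, the context converges to that region's features, which is how the model "looks at" one part of the image per word.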
This paper presents a deep learning method for image dehazing and clarification. The main advantages of the method are its high computational speed and its use of unpaired image data for training. The method adapts the Zero-DCE approach (Li et al. in IEEE Trans Pattern Anal Mach Intell 44(8):4225-4238, 2021) to the image dehazing problem and uses high-order curves to adjust the dynamic range of images and achieve dehazing. Training the proposed dehazing neural network does not require paired hazy and clear datasets; instead, it utilizes a set of loss functions that assess the quality of dehazed images to drive the training process. Experiments on a large number of real-world hazy images demonstrate that our proposed network effectively removes haze while preserving details and enhancing brightness. Furthermore, on an affordable GPU-equipped laptop, the processing speed can reach 1000 FPS for images with 2K resolution, making it highly suitable for real-time dehazing applications.
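The "high-order curves" in the Zero-DCE family are built by repeatedly applying the quadratic light-enhancement curve LE(x) = x + a·x·(1 − x) to each normalized pixel, with one learned coefficient a ∈ [−1, 1] per application. The sketch below shows only the curve mechanics, not the network that predicts the coefficients; the function name is illustrative.

```python
def apply_curve(x, alphas):
    """Apply the iterated light-enhancement curve to one pixel value.

    x:      normalized pixel intensity in [0, 1]
    alphas: per-iteration curve coefficients in [-1, 1] (in Zero-DCE these
            are predicted per pixel by the network; here they are given)
    Each pass LE(x) = x + a*x*(1-x) keeps the endpoints 0 and 1 fixed and
    bends the mid-tones up (a > 0) or down (a < 0).
    """
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return x
```

Because 0 and 1 are fixed points of every pass, the curve adjusts dynamic range without clipping, which is what makes it attractive for both low-light enhancement and, as adapted here, dehazing.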
Novel computational signal and image analysis methodologies based on feature-rich mathematical/computational frameworks continue to push the limits of the technological envelope, thus providing optimized and efficient solutions. Hypercomplex signal and image processing is a fascinating field that extends conventional methods by using hypercomplex numbers in a unified framework for algebra and geometry. Methodologies developed within this field can lead to more effective and powerful ways to analyze signals and images. Processing audio, video, images, and other types of data in the hypercomplex domain allows for more complex and intuitive representations, with algebraic properties that can lead to new insights and optimizations. Applications in image processing, signal filtering, and deep learning (to name just a few) have shown that working in the hypercomplex domain can lead to more efficient and robust outcomes. As research in this field progresses and software tools become more widely available, we can expect to see increasingly sophisticated applications in many areas of research, e.g., computer vision and machine learning.
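The most common hypercomplex system in signal and image work is the quaternions, whose non-commutative Hamilton product is the basic operation (an RGB pixel, for instance, is often encoded as the pure quaternion 0 + r·i + g·j + b·k). A minimal sketch of the product on (w, x, y, z) tuples:

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,   # real part
            pw * qx + px * qw + py * qz - pz * qy,   # i component
            pw * qy - px * qz + py * qw + pz * qx,   # j component
            pw * qz + px * qy - py * qx + pz * qw)   # k component
```

Note that the product is non-commutative (i·j = k but j·i = −k); hypercomplex filters exploit exactly this extra algebraic structure to couple channels that real-valued processing treats independently.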
Today, rapid development is taking place in plant phenotyping using non-destructive, image-based machine vision techniques. Machine vision based plant phenotyping ranges from single-plant trait estimation to broad assessment of the crop canopy for thousands of plants in the field. These phenotyping systems either use a single imaging method or an integrative approach, i.e., the simultaneous use of several imaging techniques such as visible red, green, and blue (RGB) imaging, thermal imaging, chlorophyll fluorescence imaging (CFIM), hyperspectral imaging, 3-dimensional (3-D) imaging, or high-resolution volumetric imaging. This paper provides an overview of imaging techniques and their applications in the field of plant phenotyping, and presents a comprehensive survey of recent machine vision methods for plant trait estimation and classification. Information about publicly available datasets is provided for uniform comparison among state-of-the-art phenotyping methods. The paper also presents future research directions related to the use of deep learning based machine vision algorithms for structural (2-D and 3-D), physiological, and temporal trait estimation, and for classification studies in plants.
Computer intelligent recognition technology refers to the use of computer vision, Natural Language Processing (NLP), machine learning, and other technologies to enable computers to recognize, analyze, understand, and respond to human language and behavior. Common applications of computer intelligent recognition technology include image recognition, NLP, face recognition, and target tracking. NLP is a field of computer science that involves the interaction between computers and natural languages. NLP technology can be used to process, analyze, and generate natural language data such as text, voice, and images. Common NLP applications include language translation, sentiment analysis, text classification, speech recognition, and question answering systems. A language model is a machine learning model trained on a large amount of text data to learn language patterns and relationships in that data. Although language models have made great progress in the past few years, they still face challenges, including poor semantic understanding, confusion in multilingual processing, and slow language processing. To address these shortcomings, this article studies a pre-trained language model based on NLP technology, aiming to use NLP technology to optimize and improve the performance of the language model and thereby improve computer intelligent recognition technology. The model has a higher language understanding ability and more accurate prediction ability. In addition, the model can learn language rules and structures from a large corpus, so as to better understand natural language. The experiments show that the data size and total computing time of the traditional Generative Pre-trained Transformer-2 (GPT-2) language model were 10 GB and 97 hours respectively. The data size and total computing time of BERT (Bidirectional Encoder Representations from Transformers) …
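The "learn language patterns from a large corpus" idea behind such pre-training can be illustrated at toy scale with a count-based bigram model, which estimates P(next word | previous word) from corpus counts. This is a deliberately tiny stand-in for the neural pre-training discussed above; the function name and structure are illustrative.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Build a bigram language model from a token list.

    Returns a nested dict: model[prev][next] = P(next | prev),
    estimated by simple relative frequency of observed bigrams.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for prev, ctr in counts.items()}
```

On the corpus "the cat sat on the mat", the word "the" is followed by "cat" and "mat" once each, so both continuations get probability 0.5; large neural models learn the same kind of conditional distribution, but over long contexts and with dense representations instead of counts.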
Manufacturing industries face significant challenges in producing high-quality, faultless products within limited timeframes. Conventional human-based inspection methods are still prone to errors and cannot guarantee precise component placement, potentially leading to product failures, user hazards, and substantial financial and reputational losses. This research presents a workflow to automate an inspection system that integrates computer vision, machine learning, image processing, and control systems to address these challenges. The proposed system employs a microcontroller and stepper motors to control a highly calibrated camera, enabling precise and efficient product inspection. At its core, the system utilizes the YOLOv5 model for object detection, specifically identifying hole marks and holes on products pre-assembly. This deep learning model was chosen for its real-time detection capabilities and high accuracy, achieving a mean Average Precision (mAP) of 0.95, which surpasses many current industry standards. Following object detection, advanced image processing techniques are applied to determine the precise position of detected features. Our approach achieves a notable error rate of 0.2%, offering improvements over traditional inspection methods. Our system offers the potential to reduce inspection processing time and improve fault identification accuracy in real-time applications. Our research contributes to the field of industrial automation by introducing a seamless integration of state-of-the-art computer vision techniques with practical control systems. The system's modular design allows for easy adaptation to various manufacturing environments, benefiting industries with complex assembly processes, such as electronics and automotive manufacturing. While the current implementation focuses on hole detection, future work will explore expanding the system's capabilities to identify a broader range of defects and adapt to different product types.
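The post-detection positioning step described above reduces, at its simplest, to comparing the center of a detected bounding box against the expected feature location. The sketch below is a hypothetical pass/fail check, not the paper's pipeline; the function name, tolerance, and (x1, y1, x2, y2) box convention are assumptions.

```python
def hole_offset(box, expected, tol=2.0):
    """Check a detected hole's position against its expected location.

    box:      detected bounding box (x1, y1, x2, y2) in pixels
    expected: (x, y) expected hole center in pixels
    tol:      allowed Euclidean deviation in pixels (hypothetical value)
    Returns ((dx, dy), ok): the center offset and whether it is within tol.
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    dx, dy = cx - expected[0], cy - expected[1]
    ok = (dx * dx + dy * dy) ** 0.5 <= tol
    return (dx, dy), ok
```

In a real system the pixel offset would then be mapped to physical units via the camera calibration before deciding whether the placement error exceeds specification.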
Machine vision measurement is desirable for real-time, non-contact measuring and positioning of hot forgings, in which edge extraction is an essential step for extracting the contour and effective area. However, conventional edge detection methods tend to produce unsatisfactory edge extraction results, have poor effectiveness, and are not suitable for hot forging images. In this paper, an efficient and robust edge extraction approach for passive vision images of hot forgings is proposed. Grayscale images of hot forgings are converted into a discrete gray surface, and the approach is based on the geometric properties and continuity of this equivalent discrete grayscale surface. The presented algorithm detects three types of edges using different continuity criteria, which correspond to the geometric properties and vary between primary and secondary edges. The geometry-dependent nature of the algorithm ensures that the primary and secondary edges of the forgings are identified under different environmental conditions and for forging parts with various heat radiation intensities. Moreover, an edge thinning and connection approach is presented by defining the edge direction, which can be used to improve the quality of the extracted edges. Finally, experiments on images of various sorts of hot forgings are carried out to extract the three types of edges; the experimental results and validation indicators show that, for a typical forging image, the proposed method achieves better performance than existing methods, with a PSNR of 17.4453 and an entropy of 0.1146 for the G0 edge, and 0.0342 for the G2 edge. The results demonstrate that the proposed approach has satisfactory performance as well as efficacy and robustness.
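For context, the PSNR figure used as a validation indicator above is the standard peak signal-to-noise ratio between two equally sized grayscale images; a minimal sketch (the list-of-rows input format is an assumption for illustration):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two grayscale images.

    img_a, img_b: equally sized images as lists of pixel rows
    max_val:      maximum possible pixel value (255 for 8-bit images)
    PSNR = 10 * log10(max_val^2 / MSE); identical images give +inf.
    """
    mse, n = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            mse += (a - b) ** 2
            n += 1
    mse /= n
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher PSNR means the extracted edge map deviates less from the reference, which is why the 17.4453 figure reported above is read as better performance.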