The AdaMax algorithm provides enhanced convergence properties for stochastic optimization problems. In this paper, we present a regret bound for the AdaMax algorithm, offering a tighter and more refined analysis compared to existing bounds. This theoretical advancement provides deeper insights into the optimization landscape of machine learning algorithms. Specifically, the You Only Look Once (YOLO) framework has become well known as an extremely effective object segmentation tool, largely because of its extraordinary accuracy in real-time processing, which makes it a preferred option for many computer vision applications. Finally, we apply the algorithm to image segmentation.
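For reference, the AdaMax update rule (the variant of Adam based on the infinity norm) that such regret analyses concern can be sketched as follows. This is a minimal list-based sketch with standard default hyperparameters, not the paper's implementation; the function name and parameter layout are illustrative.

```python
def adamax_step(theta, grad, m, u, t, alpha=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaMax update for a list of scalar parameters.

    m: exponentially decayed first moment of the gradient
    u: infinity-norm-based second moment (a running max, not a sum of squares)
    t: 1-based step counter, used for first-moment bias correction
    """
    new_theta, new_m, new_u = [], [], []
    for th, g, mi, ui in zip(theta, grad, m, u):
        mi = beta1 * mi + (1 - beta1) * g      # update biased first moment
        ui = max(beta2 * ui, abs(g))           # update the infinity-norm term
        step = (alpha / (1 - beta1 ** t)) * mi / (ui + eps)
        new_theta.append(th - step)
        new_m.append(mi)
        new_u.append(ui)
    return new_theta, new_m, new_u
```

A few hundred iterations on a toy objective such as f(x) = x^2 (gradient 2x) steadily shrink the parameter, which is the behavior the regret bound quantifies.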
Artificial neural networks have been one of science's most influential and essential branches in the past decades. Neural networks have found applications in various fields, including medical and pharmaceutical services, voice and speech recognition, computer vision, natural language processing, and video and image processing. Neural networks have many layers and consume a great deal of energy. Approximate computing is a promising way to reduce energy consumption in applications that can tolerate some loss of accuracy. This paper proposes an effective method to prevent accuracy loss after applying approximate computing methods to CNNs. The method exploits the k-means clustering algorithm to label pixels in the first convolutional layer. Then, using one of the existing pruning methods, different pruning amounts are applied to all layers. The experimental results on three CNNs and four different datasets show that the accuracy of the proposed method improves significantly (by 17%) compared to the baseline network.
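The pixel-labeling step described above can be illustrated with a tiny one-dimensional k-means over pixel intensities. This is a self-contained sketch of the clustering idea only, not the paper's code; the function name, fixed iteration count, and 1-D intensity input are assumptions.

```python
import random

def kmeans_labels(values, k, iters=20, seed=0):
    """Cluster scalar pixel intensities into k groups and return a label per pixel.

    Classic Lloyd iteration: assign each value to the nearest center,
    then move each center to the mean of its assigned values.
    """
    rng = random.Random(seed)
    centers = rng.sample(values, k)            # pick k distinct initial centers
    labels = [0] * len(values)
    for _ in range(iters):
        # assignment step: nearest center by squared distance
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2) for v in values]
        # update step: recompute each center as the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers
```

On two well-separated intensity groups (e.g., near 0 and near 100 with k = 2), the labels split cleanly along the gap, which is what makes the labels usable as cluster tags for the first convolutional layer.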
With the rapid advancement in wafer packaging technology, and especially the surging demand for chips, enhancing product quality and process efficiency has become increasingly crucial. This article delves into the automatic detection of pins on Ball Grid Arrays (BGA) within wafer packaging processes. The system is engineered with a flexible software and hardware architecture to address evolving industrial requirements, facilitating swift adaptation to new processing standards and technological demands. By utilizing a Programmable Logic Controller (PLC) to control a three-axis gantry slide combined with industrial camera imaging technology, the system achieves high efficiency and precise positioning, thereby delivering high-quality images. This article utilizes YOLOv10 image processing technology and machine learning algorithms to achieve accurate identification and classification of BGA defects. YOLOv10 is chosen for its outstanding recognition capabilities and swift processing speed, enabling the rapid and accurate identification of minor defects such as bent pins, missing pins, and solder ball defects. Through large-scale image analysis, the system has been shown to enhance detection accuracy and reduce the errors of manual detection. This article primarily addresses issues in semiconductor manufacturing processes and improves the product yield rate in current production lines. By effectively integrating AI-based detection technology into semiconductor manufacturing, it replaces labor-intensive tasks, enhancing efficiency and precision.
Image caption generation is a popular Artificial Intelligence research topic that combines image comprehension and language generation. Creating well-structured sentences requires a thorough, systematic, and semantic understanding of language. Describing the substance of an image in well-structured phrases is a difficult undertaking, but it can have a significant impact by helping visually impaired people better understand the content of images. Image captioning has gained a lot of attention as a study subject for various computer vision and natural language processing (NLP) applications. The goal of image captioning is to create logical and accurate natural language phrases that describe an image. It relies on the caption model to detect objects and appropriately characterise their relationships. Intuitively, it is also difficult for a machine to see a typical image in the same way that humans do. It does, however, provide the foundation for intelligent exploration in deep learning. In this review paper, we focus on the latest advanced techniques for image captioning. The paper highlights related methodologies, focuses on aspects that are crucial to computer recognition, and surveys the numerous strategies and procedures being developed for image captioning. It was also observed that recurrent neural networks (RNNs) are used in the bulk of research works (45%), followed by attention-based models (30%), transformer-based models (15%), and other methods (10%). An overview of the approaches utilised in image captioning research is discussed in this paper. Furthermore, the benefits and drawbacks of these methodologies are explored, as are the most regularly used datasets and evaluation procedures in this field.
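The attention-based captioning models counted above share one core operation: the decoder scores each image region, softmaxes the scores into weights, and takes a weighted sum of region features as the context for the next word. A minimal sketch of that step, with illustrative names and toy 2-D features:

```python
import math

def attend(region_feats, scores):
    """Soft attention over image regions.

    region_feats: list of feature vectors, one per image region
    scores:       one relevance score per region (from the decoder state)
    Returns the softmax weights and the weighted-sum context vector.
    """
    mx = max(scores)                            # subtract max for stability
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(region_feats[0])
    context = [sum(w * f[d] for w, f in zip(weights, region_feats))
               for d in range(dim)]
    return weights, context
```

With equal scores the context is the plain average of region features; as one score grows, the context converges to that region's features, which is how the model "looks at" one part of the image per word.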
This paper presents a deep learning method for image dehazing and clarification. The main advantages of the method are its high computational speed and its use of unpaired image data for training. The method adapts the Zero-DCE approach (Li et al. in IEEE Trans Pattern Anal Mach Intell 44(8):4225-4238, 2021) to the image dehazing problem and uses high-order curves to adjust the dynamic range of images and achieve dehazing. Training the proposed dehazing neural network does not require paired hazy and clear datasets; instead, it utilizes a set of loss functions that assess the quality of dehazed images to drive the training process. Experiments on a large number of real-world hazy images demonstrate that our proposed network effectively removes haze while preserving details and enhancing brightness. Furthermore, on an affordable GPU-equipped laptop, the processing speed can reach 1000 FPS for images with 2K resolution, making it highly suitable for real-time dehazing applications.
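The "high-order curves" in the Zero-DCE family are built by repeatedly applying the quadratic light-enhancement curve LE(x) = x + a·x·(1 − x) to each normalized pixel, with one learned coefficient a ∈ [−1, 1] per application. The sketch below shows only the curve mechanics, not the network that predicts the coefficients; the function name is illustrative.

```python
def apply_curve(x, alphas):
    """Apply the iterated light-enhancement curve to one pixel value.

    x:      normalized pixel intensity in [0, 1]
    alphas: per-iteration curve coefficients in [-1, 1] (in Zero-DCE these
            are predicted per pixel by the network; here they are given)
    Each pass LE(x) = x + a*x*(1-x) keeps the endpoints 0 and 1 fixed and
    bends the mid-tones up (a > 0) or down (a < 0).
    """
    for a in alphas:
        x = x + a * x * (1.0 - x)
    return x
```

Because 0 and 1 are fixed points of every pass, the curve adjusts dynamic range without clipping, which is what makes it attractive for both low-light enhancement and, as adapted here, dehazing.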
Novel computational signal and image analysis methodologies based on feature-rich mathematical/computational frameworks continue to push the limits of the technological envelope, thus providing optimized and efficient solutions. Hypercomplex signal and image processing is a fascinating field that extends conventional methods by using hypercomplex numbers in a unified framework for algebra and geometry. Methodologies developed within this field can lead to more effective and powerful ways to analyze signals and images. Processing audio, video, images, and other types of data in the hypercomplex domain allows for more complex and intuitive representations, with algebraic properties that can lead to new insights and optimizations. Applications in image processing, signal filtering, and deep learning (to name just a few) have shown that working in the hypercomplex domain can lead to more efficient and robust outcomes. As research in this field progresses and software tools become more widely available, we can expect to see increasingly sophisticated applications in many areas of research, e.g., computer vision and machine learning.
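The most common hypercomplex system in signal and image work is the quaternions, whose non-commutative Hamilton product is the basic operation (an RGB pixel, for instance, is often encoded as the pure quaternion 0 + r·i + g·j + b·k). A minimal sketch of the product on (w, x, y, z) tuples:

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,   # real part
            pw * qx + px * qw + py * qz - pz * qy,   # i component
            pw * qy - px * qz + py * qw + pz * qx,   # j component
            pw * qz + px * qy - py * qx + pz * qw)   # k component
```

Note that the product is non-commutative (i·j = k but j·i = −k); hypercomplex filters exploit exactly this extra algebraic structure to couple channels that real-valued processing treats independently.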
Today, rapid development is taking place in plant phenotyping using non-destructive, image-based machine vision techniques. Machine vision based plant phenotyping ranges from single-plant trait estimation to broad assessment of the crop canopy for thousands of plants in the field. These phenotyping systems either use a single imaging method or an integrative approach, i.e., the simultaneous use of several imaging techniques such as visible red, green, and blue (RGB) imaging, thermal imaging, chlorophyll fluorescence imaging (CFIM), hyperspectral imaging, 3-dimensional (3-D) imaging, or high-resolution volumetric imaging. This paper provides an overview of imaging techniques and their applications in the field of plant phenotyping, and presents a comprehensive survey of recent machine vision methods for plant trait estimation and classification. Information about publicly available datasets is provided for uniform comparison among state-of-the-art phenotyping methods. The paper also presents future research directions related to the use of deep learning based machine vision algorithms for structural (2-D and 3-D), physiological, and temporal trait estimation, and for classification studies in plants.
Computer intelligent recognition technology refers to the use of computer vision, Natural Language Processing (NLP), machine learning, and other technologies to enable computers to recognize, analyze, understand, and respond to human language and behavior. Common applications of computer intelligent recognition technology include image recognition, NLP, face recognition, and target tracking. NLP is a field of computer science that involves the interaction between computers and natural languages. NLP technology can be used to process, analyze, and generate natural language data such as text, voice, and images. Common NLP applications include language translation, sentiment analysis, text classification, speech recognition, and question answering systems. A language model is a machine learning model trained on a large amount of text data to learn language patterns and relationships in that data. Although language models have made great progress in the past few years, they still face challenges, including poor semantic understanding, confusion in multilingual processing, and slow language processing. To address these shortcomings, this article studies a pre-trained language model based on NLP technology, aiming to use NLP technology to optimize and improve the performance of the language model and thereby improve computer intelligent recognition technology. The model has a higher language understanding ability and more accurate prediction ability. In addition, the model can learn language rules and structures from a large corpus, so as to better understand natural language. The experiments show that the data size and total computing time of the traditional Generative Pre-trained Transformer-2 (GPT-2) language model were 10 GB and 97 hours respectively. The data size and total computing time of BERT (Bidirectional Encoder Representations from Transformers) …
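The "learn language patterns from a large corpus" idea behind such pre-training can be illustrated at toy scale with a count-based bigram model, which estimates P(next word | previous word) from corpus counts. This is a deliberately tiny stand-in for the neural pre-training discussed above; the function name and structure are illustrative.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Build a bigram language model from a token list.

    Returns a nested dict: model[prev][next] = P(next | prev),
    estimated by simple relative frequency of observed bigrams.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {prev: {w: c / sum(ctr.values()) for w, c in ctr.items()}
            for prev, ctr in counts.items()}
```

On the corpus "the cat sat on the mat", the word "the" is followed by "cat" and "mat" once each, so both continuations get probability 0.5; large neural models learn the same kind of conditional distribution, but over long contexts and with dense representations instead of counts.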
Manufacturing industries face significant challenges in producing high-quality, faultless products within limited timeframes. Conventional human-based inspection methods are still prone to errors and cannot guarantee precise component placement, potentially leading to product failures, user hazards, and substantial financial and reputational losses. This research presents a workflow to automate an inspection system that integrates computer vision, machine learning, image processing, and control systems to address these challenges. The proposed system employs a microcontroller and stepper motors to control a highly calibrated camera, enabling precise and efficient product inspection. At its core, the system utilizes the YOLOv5 model for object detection, specifically identifying hole marks and holes on products pre-assembly. This deep learning model was chosen for its real-time detection capabilities and high accuracy, achieving a mean Average Precision (mAP) of 0.95, which surpasses many current industry standards. Following object detection, advanced image processing techniques are applied to determine the precise position of detected features. Our approach achieves a notable error rate of 0.2%, offering improvements over traditional inspection methods. Our system offers the potential to reduce inspection processing time and improve fault identification accuracy in real-time applications. Our research contributes to the field of industrial automation by introducing a seamless integration of state-of-the-art computer vision techniques with practical control systems. The system's modular design allows for easy adaptation to various manufacturing environments, benefiting industries with complex assembly processes, such as electronics and automotive manufacturing. While the current implementation focuses on hole detection, future work will explore expanding the system's capabilities to identify a broader range of defects and adapt to different product types.
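The post-detection positioning step described above reduces, at its simplest, to comparing the center of a detected bounding box against the expected feature location. The sketch below is a hypothetical pass/fail check, not the paper's pipeline; the function name, tolerance, and (x1, y1, x2, y2) box convention are assumptions.

```python
def hole_offset(box, expected, tol=2.0):
    """Check a detected hole's position against its expected location.

    box:      detected bounding box (x1, y1, x2, y2) in pixels
    expected: (x, y) expected hole center in pixels
    tol:      allowed Euclidean deviation in pixels (hypothetical value)
    Returns ((dx, dy), ok): the center offset and whether it is within tol.
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    dx, dy = cx - expected[0], cy - expected[1]
    ok = (dx * dx + dy * dy) ** 0.5 <= tol
    return (dx, dy), ok
```

In a real system the pixel offset would then be mapped to physical units via the camera calibration before deciding whether the placement error exceeds specification.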
Machine vision measurement is desirable for real-time, non-contact measuring and positioning of hot forgings, in which edge extraction is an essential step for extracting the contour and effective area. However, conventional edge detection methods tend to produce unsatisfactory edge extraction results, have poor effectiveness, and are not suitable for hot forging images. In this paper, an efficient and robust edge extraction approach for passive vision images of hot forgings is proposed. Grayscale images of hot forgings are converted into a discrete gray surface, and the approach is based on the geometric properties and continuity of this equivalent discrete grayscale surface. The presented algorithm detects three types of edges using different continuity criteria, which correspond to the geometric properties and vary between primary and secondary edges. The geometry-dependent nature of the algorithm ensures that the primary and secondary edges of the forgings are identified under different environmental conditions and for forging parts with various heat radiation intensities. Moreover, an edge thinning and connection approach is presented by defining the edge direction, which can be used to improve the quality of the extracted edges. Finally, experiments on images of various sorts of hot forgings are carried out to extract the three types of edges; the experimental results and validation indicators show that, for a typical forging image, the proposed method achieves better performance than existing methods, with a PSNR of 17.4453 and an entropy of 0.1146 for the G0 edge, and 0.0342 for the G2 edge. The results demonstrate that the proposed approach has satisfactory performance as well as efficacy and robustness.
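For context, the PSNR figure used as a validation indicator above is the standard peak signal-to-noise ratio between two equally sized grayscale images; a minimal sketch (the list-of-rows input format is an assumption for illustration):

```python
import math

def psnr(img_a, img_b, max_val=255.0):
    """Peak signal-to-noise ratio between two grayscale images.

    img_a, img_b: equally sized images as lists of pixel rows
    max_val:      maximum possible pixel value (255 for 8-bit images)
    PSNR = 10 * log10(max_val^2 / MSE); identical images give +inf.
    """
    mse, n = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            mse += (a - b) ** 2
            n += 1
    mse /= n
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Higher PSNR means the extracted edge map deviates less from the reference, which is why the 17.4453 figure reported above is read as better performance.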