Deep learning has recently become a hot topic in many fields, especially in computer vision, where it has proved efficient at processing images. However, it tends to overfit or to require long learning times on many platforms. These issues stem from the huge number of learnable parameters and from missing or incorrect training samples. In this work, a two-level deep convolutional neural network (DCNN) approach is proposed for classifying images. The first level enhances the training images by removing unnecessary details, and the second detects the edges of the processed images to further reduce the DCNN's learning time. The proposed work is inspired by the way the human eye recognizes an object: a piece of the object can be enough for recognition, without the whole object or its full colors. The goal is to speed up the CNN learning process by using preprocessed training samples that are precise and lightweight enough to work well in real-time applications. The results are significant for real-time classification: the approach shortened the learning process by 94% on the Animals10 dataset with a validation accuracy of 99.2%, in accordance with classical DCNNs.
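The edge-detection level can be illustrated with a minimal sketch. The abstract does not name the edge detector, so the Sobel operator, the function name `sobel_edges`, and the threshold value below are assumptions chosen for illustration only, not the authors' method.

```python
import math

def sobel_edges(gray, threshold=100):
    """Return a binary edge map (1 = edge) for a 2-D list of grayscale values.

    Assumption: a Sobel operator stands in for the unspecified edge detector.
    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column, weights 1-2-1.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y][x - 1] + gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row, weights 1-2-1.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]) \
               - (gray[y - 1][x - 1] + 2 * gray[y - 1][x] + gray[y - 1][x + 1])
            out[y][x] = 1 if math.hypot(gx, gy) >= threshold else 0
    return out

# A 5x5 image with a vertical step from dark (0) to bright (255).
step_image = [[0, 0, 0, 255, 255] for _ in range(5)]
edge_map = sobel_edges(step_image)
```

A binary edge map like this carries far less information per pixel than the full-color image, which is the intuition behind feeding lighter inputs to the network.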
This paper proposes an approach to convert real-life images into cartoon images using image processing. Cartoon images have sharp edges, fewer colours than the original image, and smooth colour regions. With the rapid advancement of artificial intelligence, deep learning methods have recently been developed for image-to-cartoon generation. Most of these methods are computationally very expensive, require large datasets, and are time consuming, unlike traditional image processing, which manipulates the input images directly. In this paper, we develop an image processing based method for image-to-cartoon generation. We enhance the edges and quantize the colours in parallel. The edges are extracted and dilated to highlight them in the output colour image. For colour quantization, colours are assigned on separate colour channels according to a proposed formulation. These images are then combined and the highlighted edges are added to generate the cartoon image. The generated images are compared with those of existing image processing approaches and deep learning based methods. The experimental results show that the proposed approach generates high-quality cartoon images that are visually appealing, have superior contrast, and preserve contextual information at lower computational cost.
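The pipeline described above (edge extraction, dilation, per-channel quantization, recombination) can be sketched on a single channel as follows. The abstract does not give the quantization formula, so uniform binning is used as a stand-in; the function names, the forward-difference edge test, and the parameter defaults are all hypothetical.

```python
def quantize_channel(channel, levels=4):
    """Map 0-255 values to the centre of `levels` uniform bins.

    Assumption: uniform binning replaces the paper's own formulation.
    `levels` must be between 1 and 256.
    """
    step = 256 // levels
    return [[(v // step) * step + step // 2 for v in row] for row in channel]

def edge_mask(channel, threshold=50):
    """Mark pixels whose forward-difference gradient exceeds `threshold`."""
    h, w = len(channel), len(channel[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = channel[y][min(x + 1, w - 1)] - channel[y][x]
            gy = channel[min(y + 1, h - 1)][x] - channel[y][x]
            mask[y][x] = abs(gx) + abs(gy) > threshold
    return mask

def dilate(mask):
    """3x3 binary dilation: a pixel is set if any neighbour is set."""
    h, w = len(mask), len(mask[0])
    return [[any(mask[j][i]
                 for j in range(max(0, y - 1), min(h, y + 2))
                 for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]

def cartoonize_channel(channel, levels=4, threshold=50):
    """Quantize one channel and draw dilated edges on top in black (0)."""
    edges = dilate(edge_mask(channel, threshold))
    quantized = quantize_channel(channel, levels)
    return [[0 if edges[y][x] else quantized[y][x]
             for x in range(len(channel[0]))] for y in range(len(channel))]

# One channel with a dark region (10) next to a bright region (200).
channel = [[10, 10, 10, 200, 200, 200] for _ in range(4)]
cartoon = cartoonize_channel(channel)
```

For a colour image, the same function would be applied to each channel separately, with the edge mask typically computed once from a grayscale version.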
ISBN (digital): 9798350370249
ISBN (print): 9798350370270
Road object detection, a pivotal task in computer vision and artificial intelligence, is dedicated to identifying and precisely localizing a diverse array of elements on roadways, including vehicles, pedestrians, road signs, traffic lights, and potential obstacles. The essence of this task lies in providing real-time, precise object detection, which serves as a crucial safeguard to prevent accidents and ensure the safety of drivers, passengers, and pedestrians. It also lays the foundation for advanced warning systems and aids in collision avoidance. Several popular models were implemented: YOLOv7, YOLOv7-Modified, YOLOv7-Tiny, YOLOv7-E6E, Faster R-CNN, and SSD. Among these, YOLOv7 achieved a mean average precision (mAP) of 83.6% with an inference speed of 15.1 ms, while YOLOv7-E6E achieved the highest mAP of 86.2%, but at the cost of a slower inference speed of 30.3 ms. The modified version of YOLOv7 produced 2.2% higher average precision (89.1%) than the main version of YOLOv7 due to its double RepVGG layers, skip connection, and concatenation layers in the head architecture. To make this research accessible to a wider audience, a user-friendly web application with an intuitive interface was developed.
Infrared template matching is an essential technology that enables reliable and accurate object detection, recognition, and tracking in complex environments. Perceptible Lightweight Zero-mean normalized cross-correlation (ZNCC) Template Matching (PLZ-TM) has been proposed as a tool for matching infrared images obtained from cameras with different fields of view. Aligning such images is challenging because of differences in thermal distributions, focus discrepancies, background elements, and distortions. The first stage of PLZ-TM extracts feature maps from the search and template images using a deep learning network. This network is designed with a Convolutional Neural Network (CNN) architecture that omits pooling layers, thereby minimizing information loss during extraction. The subsequent stage matches the feature maps. The matching method uses a lightweight ZNCC module that employs average pooling for training. The deep learning network is trained to optimize the distribution of the output heatmap and the probability at the correct location of the template image. PLZ-TM delivers excellent performance, achieving a processing time of only 3.3 ms when matching a $640\times 480$ search image with a $192\times 144$ template image. Moreover, it attains a matching accuracy of 96% on a dataset obtained from infrared cameras with different fields of view.
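For readers unfamiliar with ZNCC, the following is a minimal sketch of the plain, textbook score and an exhaustive sliding-window search on raw intensities. It is not the PLZ-TM module, which operates on CNN feature maps with a lightweight formulation; the function names and the tiny test image are hypothetical.

```python
import math

def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length flat lists.

    Returns a value in [-1, 1]; 0.0 is returned when either patch is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def match_template(search, template):
    """Slide `template` over `search`; return (row, col, score) of the best match."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    flat_template = [v for row in template for v in row]
    best = (0, 0, -2.0)  # any real score is >= -1
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            patch = [search[r + j][c + i] for j in range(th) for i in range(tw)]
            score = zncc(patch, flat_template)
            if score > best[2]:
                best = (r, c, score)
    return best

search = [[1, 2, 3, 4],
          [5, 6, 9, 1],
          [2, 8, 3, 7],
          [4, 4, 6, 0]]
# The patch at row 1, col 2 with gain 2 and offset 5 applied.
template = [[23, 7],
            [11, 19]]
row, col, score = match_template(search, template)
```

Because the mean is subtracted and the result is normalized, the score is unaffected by a gain and offset change between the two images, which is why ZNCC is a common choice when sensors differ in brightness response.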
Deep learning technology has been employed for precise medical image segmentation in recent years. However, due to the limited available datasets and real-time processing requirements, the inherently complicated structure of deep learning models restricts their application in the field of medical image processing. In this work, we present a novel lightweight LMU-Net network with improved accuracy for medical image segmentation. The multilayer perceptron (MLP) and depth-wise separable convolutions are adopted in both the encoder and decoder of the LMU-Net to reduce feature loss and the number of training parameters. In addition, a lightweight channel attention mechanism and a convolution operation with a larger kernel are introduced in the proposed architecture to further improve the segmentation performance. Furthermore, we employ batch normalization (BN) and group normalization (GN) interchangeably in our module to minimize the estimation shift in the network. Finally, the proposed network is evaluated and compared to other architectures on the publicly accessible ISIC and BUSI datasets through robust experiments with sufficient ablation considerations. The experimental results show that the proposed LMU-Net achieves better overall performance than existing techniques while using fewer parameters.
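The parameter saving from depth-wise separable convolutions follows from simple arithmetic, sketched below. The layer sizes (64 input channels, 128 output channels, 3x3 kernel) are illustrative assumptions and are not taken from LMU-Net; bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer sizes, chosen only to show the ratio.
standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = separable / standard
```

In general the ratio equals 1/c_out + 1/k^2, so for 3x3 kernels and wide layers the separable form needs roughly one ninth of the weights.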
Public health initiatives must rely on evidence-based decision-making to have the greatest impact. Machine learning algorithms are created to gather, store, process, and analyze data to provide knowledge and guide decisions. A crucial part of any surveillance system is image analysis, which has recently attracted the interest of the computer vision and machine learning communities. This study uses a variety of machine learning and image processing approaches to detect and forecast malarial illness. In our research, we discovered the potential of deep learning techniques as innovative tools with broader applicability for malaria detection, which benefits physicians by assisting in the diagnosis of the condition. We investigate the common limitations of deep learning for computer frameworks and organizing, including the requirement for data preparation, preparation overhead, real-time execution, and explainability, and outline future research directions focusing on these constraints.
This paper proposes a lightweight deep learning (DL) framework for real-time, accurate weld feature extraction from noisy images with light, smoke, or splash. Leveraging a two-dimensional human pose estimation paradigm, the framework follows a top-down architecture for accurate weld feature point localization. This study develops a semi-automatic annotation technique to dramatically reduce the annotation cost. Then, we design a lightweight yet faster You Only Look Once version 8 (YOLOv8) detector to rapidly detect the weld feature region in the presence of strong noise. To avoid reliance on high-resolution feature maps and achieve sub-pixel-level localization accuracy, a heatmap-free approach decomposes the feature point detection task into subtasks of horizontal and vertical coordinate classification. Comparison with mainstream DL-based weld recognition methods validates the superiority of the proposed method regarding real-time feature extraction accuracy and robustness.
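The idea of replacing a 2-D heatmap with two 1-D classifications can be sketched as below. The abstract gives no implementation details, so the bin layout, the `split` factor (bins per pixel, which is what allows sub-pixel output), and the function names are assumptions for illustration.

```python
def encode_coordinate(value, split=2):
    """Map a continuous pixel coordinate to a class index with `split` bins per pixel."""
    return round(value * split)

def decode_point(x_scores, y_scores, split=2):
    """Pick the highest-scoring bin on each axis and convert back to pixel units."""
    x_bin = max(range(len(x_scores)), key=x_scores.__getitem__)
    y_bin = max(range(len(y_scores)), key=y_scores.__getitem__)
    return x_bin / split, y_bin / split

# Hypothetical classifier outputs for an 8-pixel-wide, 6-pixel-high crop
# with 2 bins per pixel (16 horizontal classes, 12 vertical classes).
x_scores = [0.0] * 16
y_scores = [0.0] * 12
x_scores[encode_coordinate(3.5)] = 1.0   # peak at x = 3.5 px
y_scores[encode_coordinate(2.0)] = 1.0   # peak at y = 2.0 px
point = decode_point(x_scores, y_scores)
```

Two vectors of length W*split and H*split are much cheaper to predict than a W x H heatmap, which is consistent with the abstract's stated aim of avoiding high-resolution feature maps.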
Object detection has become a popular deep learning tool in the era of digital manufacturing. In this study, the You Only Look Once (YOLO) algorithm, a powerful and efficient object detection algorithm, was used to detect anomalies in deposited beads of wire-arc additive manufacturing (WAAM) using melt pool images. This study used the latest version of the YOLO algorithm to train and validate a custom image dataset of the melt pool obtained by conducting experiments using a robotic-controlled WAAM. The mean average precision (mAP) for the "Regular bead" class and the "Irregular bead" class reached 99% at an Intersection over Union (IoU) threshold of 0.5, for both training and validation. When the model was tested on new or unseen datasets by conducting four new experimental trials, the mAP value reached 98.47% for the "Regular bead" class and 96.68% for the "Irregular bead" class at an average processing time of 0.014 s/frame. The object detection algorithm YOLO has shown an excellent processing time of 15 ms per frame, which shows its potential for real-time application in the manufacturing industry.
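The IoU threshold of 0.5 mentioned above decides whether a predicted box counts as a correct detection. A minimal sketch of that test follows; the `(x1, y1, x2, y2)` box format, the function names, and the example boxes are assumptions, not data from the study.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def is_true_positive(predicted, ground_truth, threshold=0.5):
    """A prediction matches when its IoU with the ground truth meets the threshold."""
    return iou(predicted, ground_truth) >= threshold

ground_truth = (0, 0, 10, 10)
close_prediction = (2, 0, 12, 10)   # IoU = 80 / 120
loose_prediction = (5, 0, 15, 10)   # IoU = 50 / 150
```

Average precision is then computed per class from the ranked list of matched and unmatched predictions, and mAP averages it over classes.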
Medical ultrasound imaging is a key diagnostic tool across various fields, with computer-aided diagnosis systems benefiting from advances in deep learning. However, its lower resolution and artifacts pose challenges, particularly for non-specialists. The simultaneous acquisition of degraded and high-quality images is infeasible, limiting supervised learning approaches. Additionally, self-supervised and zero-shot methods require extensive processing time, conflicting with the real-time demands of ultrasound imaging. To address these issues, we propose real-time ultrasound image enhancement via a self-supervised learning technique and test-time adaptation for sophisticated rotator cuff tear (RCT) diagnosis. The proposed approach learns from image datasets of other domains and performs self-supervised learning on an ultrasound image during inference for enhancement. Our approach not only demonstrated superior ultrasound image enhancement performance compared to other state-of-the-art methods but also achieved an 18% improvement in RCT segmentation performance.
Aiming to address the timely dissemination of news information, this work explores the use of data mining (DM) technology and deep learning (DL) algorithms to construct an intelligent real-time news image acquisition system that meets the urgency of news dissemination needs. First, this work introduces an intelligent real-time news image acquisition system and provides a detailed analysis of its principles and advantages. Throughout this process, the crucial role of DM technology in news image classification and automation is emphasized, especially in dealing with rapidly evolving news events. Next, the work establishes an intelligent real-time news image acquisition model based on DL algorithms, which integrates the essence of DM technology. Through this fusion, the research objective is to enhance the performance of the news image acquisition system to achieve better real-time performance and accuracy, which is vital for the swift delivery of news information. Finally, this work investigates the application of the intelligent news image acquisition system in network communication to ensure its adaptability to various network communication scenarios while maintaining accuracy and real-time capabilities. The research results demonstrate that applying DM technology in combination with DL algorithms can effectively meet the practical needs of the news industry, enhancing the automation of news image processing and enabling faster information delivery to the audience. Notably, the AlexNet model employed performs exceptionally well, achieving recognition rates of up to 99.6% after data augmentation or equalization processing, with an accuracy of 90.9% and a high specificity of 93.38%. This indicates outstanding overall classification accuracy and negative class accuracy, even when distinguishing between news and non-news scenarios. These results clearly underscore the connection between DM technology and news acquisition and editing practices, and emphasize it
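The accuracy and specificity figures quoted in this abstract are standard confusion-matrix metrics. A minimal sketch of how they are computed is given below; the counts are made up for illustration and are not the study's data, and the function name is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, specificity and sensitivity from binary confusion-matrix counts."""
    def safe_div(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),  # overall correct rate
        "specificity": safe_div(tn, tn + fp),              # negative-class accuracy
        "sensitivity": safe_div(tp, tp + fn),              # positive-class accuracy
    }

# Illustrative counts only: 100 news images and 100 non-news images.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Specificity is the fraction of non-news images correctly rejected, which is why the abstract describes it as negative class accuracy.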