ISBN (digital): 9798331518523
ISBN (print): 9798331518530
The median filter is a valuable image processing tool that can be used in many applications to improve image quality, reduce noise, and prepare data for further analysis. A non-linear image processing technique, it is widely applied across diverse fields because it removes noise while preserving edges. The filter is significant for both simple and complex image processing tasks because it is both easy to understand and fast to run. An effective way to attain real-time performance is to implement a median filter on a Field-Programmable Gate Array (FPGA) for image signal processing and machine vision applications. FPGAs are well suited to image filtering because of their capacity for parallel processing, which is one of their most prominent advantages. This article presents the design and simulation of the filter in HDL and its synthesis on an FPGA in order to assess its performance metrics.
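To make the noise-removal and edge-preservation claims concrete, the following is a minimal software sketch of a 3x3 median filter in Python with NumPy. It is not the authors' HDL design; the function name `median_filter_3x3`, the window size, and the edge-replication padding are assumptions chosen for illustration.

```python
import numpy as np


def median_filter_3x3(image: np.ndarray) -> np.ndarray:
    """Apply a 3x3 median filter to a 2-D grayscale image.

    Borders are handled by replicating edge pixels (an assumption;
    a hardware design may treat borders differently).
    """
    height, width = image.shape
    padded = np.pad(image, 1, mode="edge")
    # Stack the nine shifted views so that each pixel's 3x3
    # neighbourhood lies along axis 0.
    windows = np.stack(
        [padded[dy:dy + height, dx:dx + width]
         for dy in range(3) for dx in range(3)]
    )
    return np.median(windows, axis=0).astype(image.dtype)


# A flat image with one impulse ("salt") pixel: the filter removes it.
noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 255
cleaned = median_filter_3x3(noisy)

# A vertical step edge: the filter leaves it unchanged.
step = np.zeros((5, 6), dtype=np.uint8)
step[:, 3:] = 100
step_filtered = median_filter_3x3(step)
```

The nine shifted views are independent of one another, which loosely mirrors why the operation maps well onto parallel hardware: every output pixel depends only on its own small neighbourhood.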
A computer-vision-based industrial algorithm is proposed in this study for the detection of the dimensions and the spatial positioning of fruit and vegetables on a conveyor belt for their movement to a packing machine...
ISBN (digital): 9798331506520
ISBN (print): 9798331506537
Recent studies point to an accuracy gap between humans and Artificial Neural Network (ANN) models when classifying blurred images, with humans outperforming ANNs. To bridge this gap, we introduce a spectral channel-based range-constrained entropy merit function, from which we devise a zero-phase, circular symmetric blind deblurring method. We apply it as a pre-processing step for image classification and test it using pre-trained classification models and images blurred by Gaussian kernels. We compare our method to state-of-the-art restoration methods, showing its superiority, effectively bridging the machine-human gap for most models and blur levels. Our results also rank higher than the competitors in no-reference and full-reference image quality metrics. Notwithstanding the limitation to zero-phase blur, this work shows that, for image pre-processing aimed at visual tasks, it may be advantageous to use merit functions based on vision science and information theory, rather than on the expected error to the latent image.
In the domain of egg production, the application of automation technologies is essential for boosting productivity and quality. This study introduces an online monitoring system designed for egg quality assessment within caged environments, incorporating a robotic patrol system for egg localization and a fixed video stream for quality analysis. The project involved upgrading traditional henhouses with enhanced wireless connectivity and developing data transmission techniques for video streams and image data. The core of the system, an enhanced You Only Look Once Version 8-small (YOLOv8s) model, was built by replacing the backbone with Residual Network-18 (ResNet-18) and integrating the Shuffle Attention mechanism, which significantly improved egg detection precision. This refined model was deployed on a Jetson AGX Orin industrial computer to support real-world applications. To meet diverse operational needs, two distinct post-processing algorithms were developed: one for counting eggs and detecting abnormalities during robotic patrols, and another for assessing egg quality through fixed video streams, which measured crucial parameters such as egg dimensions and shape indexes. Experimental results revealed an average henhouse network latency of 35 ms, with signal strengths between -30 and -71 dBm, ensuring data transmission to the poultry management system. The enhanced YOLOv8s model, deployed on the Jetson AGX Orin, demonstrated clear improvements: a Precision of 94.0 % (+2.4 %), a Recall of 92.8 % (+4.6 %), an Average Precision (AP50:95) of 91.5 % (+3 %), and an F1 score of 93.4 % (+3.9 %), with a minor decrease in detection speed to 91.7 frames per second (-18.2). A field experiment in 60 chicken cages during robotic patrols achieved an egg recognition rate of 98.9 %, validating the system's effectiveness. In fixed settings, an 83-minute experiment managed to analyze egg numbers and abnormalities, attaining a 100 % recognition rate with all scoring data promptly relayed back to the mana
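As a quick consistency check on the reported metrics, the F1 score can be recomputed from the stated Precision and Recall using the standard harmonic-mean definition. The helper name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall of the enhanced model, in percent, as reported above.
f1 = f1_score(94.0, 92.8)
print(f"F1 = {f1:.1f} %")  # prints "F1 = 93.4 %"
```

The result agrees with the 93.4 % F1 score quoted in the abstract.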
With the emergence of Artificial Intelligence (AI) and the rapid growth of Generative AI, significant progress has been made in image captioning, delivering remarkably precise captions from static images and, more rec...
Brain-computer interfaces (BCIs) have shown promise in supporting communication for individuals with motor or speech impairments. Recent advancements such as brain-to-speech or brain-to-image technology aim to reconst...
As fundamental drivers of human behavior, emotions can be expressed through various modalities, including facial expressions. Facial emotion recognition (FER) has emerged as a pivotal area of affective computing, enab...
ISBN (digital): 9798331536626
ISBN (print): 9798331536633
Recent developments in deep learning (DL) have led to the integration of natural language processing (NLP) with computer vision, resulting in powerful integrated vision-language (VL) models. Despite their remarkable capabilities, these models are frequently regarded as black boxes within the machine learning research community. This raises a critical question: which parts of an image correspond to specific segments of text, and how can we decipher these associations? Understanding these connections is essential for enhancing model transparency, interpretability, and trustworthiness. To answer this question, we present an image-text aligned human visual attention dataset (VISTA; the data is available at https://***/h-pal/Data-for-VISTA) that maps specific associations between image regions and corresponding text segments. We then compare the internal heatmaps generated by VL models with this dataset, allowing us to analyze and better understand the models' decision-making process and how they align visual and linguistic information. We also conducted a comprehensive study of text-guided visual saliency detection in these VL models, which aims to understand how different models prioritize and focus on specific visual elements in response to corresponding text segments, providing deeper insight into their internal mechanisms and improving our ability to interpret their outputs.
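The abstract does not say which measure is used to compare model heatmaps with human attention maps. As a hedged illustration only, the sketch below uses Pearson's correlation coefficient, a measure commonly applied to saliency maps; the function name `pearson_cc` and the toy arrays are assumptions and do not come from the VISTA dataset.

```python
import numpy as np


def pearson_cc(model_map: np.ndarray, human_map: np.ndarray) -> float:
    """Pearson correlation between two same-shaped attention maps.

    Returns 0.0 if either map is constant (correlation undefined).
    """
    a = np.asarray(model_map, dtype=float).ravel()
    b = np.asarray(human_map, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0


# Toy 2x2 maps (hypothetical values): the model attends to the same
# region as the human map, at a different overall scale.
human = np.array([[0.0, 0.1], [0.2, 0.9]])
model = 2.0 * human + 0.5
score_same = pearson_cc(model, human)
score_flipped = pearson_cc(-human, human)
```

A score near 1 indicates that the model heatmap and the human map emphasize the same regions, while a score near -1 indicates that they emphasize opposite regions.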
Multilevel thresholding plays a crucial role in image processing, with extensive applications in object detection, machine vision, medical imaging, and traffic control systems. It entails the partitioning of an image ...
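The truncated abstract does not say how thresholds are selected, so the sketch below only illustrates the partitioning step itself: given a set of thresholds, each pixel is assigned to one of several intensity classes. The threshold values and the function name `apply_thresholds` are arbitrary choices for illustration.

```python
import numpy as np


def apply_thresholds(image: np.ndarray, thresholds: list) -> np.ndarray:
    """Label each pixel with the index of the intensity class it falls in.

    With k thresholds there are k + 1 classes; a pixel equal to a
    threshold is assigned to the higher class.
    """
    return np.digitize(image, bins=sorted(thresholds))


# Two thresholds split an 8-bit image into three classes.
image = np.array([[0, 84, 85],
                  [169, 170, 255]], dtype=np.uint8)
labels = apply_thresholds(image, [85, 170])
```

Here `labels` is `[[0, 0, 1], [1, 2, 2]]`: values below 85 fall in class 0, values from 85 to 169 in class 1, and values of 170 and above in class 2.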
Capturing and presenting exciting moments is crucial for the audience’s experience in basketball game broadcast cameras. However, traditional radar image processing techniques are limited by various factors and canno...