Infrared-visible image fusion combines complementary information from both modalities, enhancing scene perception in applications such as surveillance and autonomous driving. However, existing deep learning-based meth...
This article mainly takes the visual semantic segmentation of soccer robots as the background, introduces the development history of intelligent football, and explains the importance of machine vision. Firstly, the da...
ISBN: (print) 9789819916382; 9789819916399
With recent advances in computer vision techniques, an increasing amount of image and video content is consumed by machines. However, existing image and video compression schemes are mainly designed for human vision and are not optimized for machine vision. In this paper, we propose a saliency-guided learned image compression scheme for machines, with object detection considered as an example task. To obtain salient regions for machine vision, a saliency map is computed for each detected object using an existing black-box explanation method for neural networks, and the maps for multiple objects are merged into a single map. Building on a neural network-based image codec, a bitrate allocation scheme is designed that prunes the latent representation of the image according to the saliency map. During training of the end-to-end image codec, both pixel fidelity and machine vision fidelity are used for performance evaluation, and the degradation in detection accuracy is measured without ground-truth annotation. Experimental results demonstrate that the proposed scheme achieves up to a 14.1% reduction in bitrate at the same detection accuracy compared with the baseline learned image codec.
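The saliency-guided pruning step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the merge rule or the pruning rule, so everything here is an assumption made for illustration: the per-object maps are merged by pixel-wise maximum, a fixed threshold separates salient from non-salient positions, and pruning is modelled as zeroing all but the first `keep_channels` latent channels at non-salient positions. The names `merge_saliency` and `prune_latent` are hypothetical.

```python
# Sketch only: the merge rule (pixel-wise max), the threshold, and
# pruning-by-zeroing are assumptions, not the paper's actual scheme.

def merge_saliency(maps):
    """Merge per-object saliency maps (equal-size 2D lists) by pixel-wise maximum."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[max(m[r][c] for m in maps) for c in range(cols)] for r in range(rows)]


def prune_latent(latent, saliency, threshold=0.5, keep_channels=1):
    """Zero all but the first `keep_channels` channels at low-saliency positions.

    latent: list of channels, each a 2D list with the same shape as `saliency`.
    """
    pruned = []
    for ch_idx, channel in enumerate(latent):
        new_channel = []
        for r, row in enumerate(channel):
            new_row = []
            for c, value in enumerate(row):
                salient = saliency[r][c] >= threshold
                # Salient positions keep every channel; others keep only the first few.
                new_row.append(value if salient or ch_idx < keep_channels else 0.0)
            new_channel.append(new_row)
        pruned.append(new_channel)
    return pruned


# Two toy per-object saliency maps and a 2-channel, 2x2 latent.
obj_a = [[0.9, 0.1], [0.0, 0.0]]
obj_b = [[0.2, 0.0], [0.0, 0.8]]
merged = merge_saliency([obj_a, obj_b])
latent = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
pruned = prune_latent(latent, merged)
```

In a real learned codec the zeroed latent elements would cost few bits after entropy coding, which is where a bitrate saving of the kind reported would come from.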
Classification of polarimetric images is an important topic, and various statistical pattern recognition methods have been used to obtain highly accurate classification maps. In this work, we suggest weighting the polarimetric features according to their statistical behavior (the mean vector and variance values as the first- and second-order statistics) to improve PolSAR image classification. A weighted feature matrix is composed and applied to popular classifiers such as maximum likelihood, K-nearest neighbor, and support vector machine. The weighted feature matrix can also be used with other arbitrary classifiers to improve their discrimination ability. Experiments on the L-band AIRSAR dataset show appropriate classification results.
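One possible reading of weighting features by their class means and variances is sketched below. The abstract does not give the weighting formula, so the Fisher-like ratio used here (spread of the class means divided by the average within-class variance) is an assumption, and the function names are hypothetical.

```python
from statistics import mean, pvariance

# Sketch only: the Fisher-like ratio below is an assumed weighting rule,
# not the formula used in the paper.

def fisher_like_weights(samples, labels):
    """Return one normalized weight per feature column."""
    classes = sorted(set(labels))
    n_features = len(samples[0])
    weights = []
    for j in range(n_features):
        per_class = [[s[j] for s, y in zip(samples, labels) if y == c] for c in classes]
        between = pvariance([mean(v) for v in per_class])  # spread of class means
        within = mean(pvariance(v) for v in per_class)     # average class variance
        weights.append(between / (within + 1e-12))
    total = sum(weights)
    return [w / total for w in weights]


def apply_weights(samples, weights):
    """Scale every feature by its weight to form the weighted feature matrix."""
    return [[x * w for x, w in zip(s, weights)] for s in samples]


# Feature 0 separates the two classes; feature 1 does not.
samples = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.8], [3.2, 1.2]]
labels = [0, 0, 1, 1]
weights = fisher_like_weights(samples, labels)
weighted = apply_weights(samples, weights)
```

The weighted matrix can then be passed to any distance-based or statistical classifier in place of the raw features.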
Image processing is a fundamental task in computer vision that aims at enhancing image quality and extracting essential features for subsequent vision applications. Traditionally, task-specific models are developed for individual tasks, and designing such models requires distinct expertise. Building on the success of large language models (LLMs) in natural language processing (NLP), there is a similar trend in computer vision that focuses on developing large-scale models through pretraining and in-context learning. This paradigm shift reduces the reliance on task-specific models, yielding a powerful unified model that can deal with various tasks. However, these advances have predominantly concentrated on high-level vision tasks, with less attention paid to low-level vision tasks. To address this issue, we propose a universal model for general image processing that covers image restoration, image enhancement, image feature extraction, and other tasks. Our proposed framework, named PromptGIP, unifies these diverse image processing tasks within a universal framework. Inspired by NLP question answering (QA) techniques, we employ a visual prompting question answering paradigm. Specifically, we treat the input-output image pair as a structured question-answer sentence, thereby reprogramming the image processing task as a prompting QA problem. PromptGIP can undertake diverse cross-domain tasks using provided visual prompts, eliminating the need for task-specific finetuning. Capable of handling up to 15 different image processing tasks, PromptGIP represents a versatile and adaptive approach to general image processing. Code will be available at https://***/lyh-18/PromptGIP. Copyright 2024 by the author(s).
No-Reference Image Quality Assessment (NR-IQA) has attained acceptable results through deep learning models. However, overfitting, caused by complex deep models and insufficient labeled datasets, has become a primary challenge for the research community. To address this issue, various strategies such as data augmentation, transfer learning, and weakly supervised learning have been investigated. This paper introduces an approach that uses a probability distribution instead of a rigid target to mitigate overconfidence. The proposed label uncertainty can provide acceptable results, especially in cross-dataset validation.
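The idea of replacing a rigid target with a probability distribution can be sketched as follows. The abstract does not state which distribution is used, so the discretized Gaussian over quality bins, the value of `sigma`, and the cross-entropy loss are all assumptions chosen for illustration.

```python
import math

# Sketch only: a discretized Gaussian over score bins is an assumed choice of
# label distribution; the paper may use a different one.

def soft_label(score, bins, sigma=0.5):
    """Turn a scalar quality score into a distribution over quality bins."""
    raw = [math.exp(-0.5 * ((b - score) / sigma) ** 2) for b in bins]
    total = sum(raw)
    return [r / total for r in raw]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


bins = [1.0, 2.0, 3.0, 4.0, 5.0]
target = soft_label(3.4, bins)

# A maximally confident wrong prediction is penalized more than a uniform one.
uniform = [0.2] * 5
confident_wrong = [0.96, 0.01, 0.01, 0.01, 0.01]
loss_uniform = cross_entropy(target, uniform)
loss_wrong = cross_entropy(target, confident_wrong)
```

Because the target spreads probability over neighboring bins, the model is not pushed toward a one-hot output, which is one way such a scheme could reduce overconfidence.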
ISBN: (print) 9798400707032
This paper proposes a machine vision defect detection algorithm based on ResNet and Canny operator edge approximation to address quality monitoring issues in the manufacturing process of motor magnetic materials. Based on this detection method, an automated system for detecting motor magnetic ring defects is constructed. The Canny operator, together with the maximum connected curve, is used to obtain the contour of the original sample image. Based on the coordinate information of the contour points, the image is divided into several parts to reduce the dimensionality of the image samples. The resulting small images are used as input to ResNet to detect incomplete surface loops, pits, missing corners, cracks, and similar defects. The results show that the prediction accuracy is 96.70% and the mean average precision (mAP) is 92.92%. The missed kill rates in the training and prediction models are 2.22% and 4.44%, respectively, which are lower than the overall kill rate and meet the production sorting needs of enterprises.
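The step of dividing the image based on contour coordinates can be sketched in plain Python. The abstract does not say how the parts are chosen, so this sketch assumes a uniform grid over the bounding box of the contour points; the contour itself is taken as given (in practice it would come from an edge detector such as Canny). The function names are hypothetical.

```python
# Sketch only: a uniform grid over the contour's bounding box is an assumed
# split rule; the paper's actual partitioning is not described in the abstract.

def bounding_box(contour):
    """Return (x0, y0, x1, y1) with exclusive upper bounds for (x, y) points."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def split_into_tiles(image, contour, rows=2, cols=2):
    """Crop the image to the contour's bounding box and cut it into rows x cols tiles."""
    x0, y0, x1, y1 = bounding_box(contour)
    tiles = []
    for i in range(rows):
        top = y0 + (y1 - y0) * i // rows
        bottom = y0 + (y1 - y0) * (i + 1) // rows
        for j in range(cols):
            left = x0 + (x1 - x0) * j // cols
            right = x0 + (x1 - x0) * (j + 1) // cols
            tiles.append([row[left:right] for row in image[top:bottom]])
    return tiles


# 6x6 toy image whose pixel value encodes its position, with a square contour.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
contour = [(1, 1), (4, 1), (4, 4), (1, 4)]
tiles = split_into_tiles(image, contour)
```

Each tile is much smaller than the full image, so a classifier such as ResNet would receive lower-dimensional inputs, as the abstract describes.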
ISBN: (print) 9798350300338
Solar photovoltaic (PV) installation is expanding quickly around the world, but deeper integration into the electrical grid is hampered by the intermittency of solar power. A portion of the short-term PV fluctuation is caused by abrupt weather changes, such as cloud cover variations, which can drastically affect PV panel output on time scales of minutes. Images of the sky can provide information on present and upcoming cloud cover and thereby increase the accuracy of the PV power forecast. In this study, a convolutional neural network (CNN) is used to link the solar panels' power output with current sky images. In addition, the PV panel's historical output is used to enhance forecasting accuracy. The sensitivity of the proposed model to the configuration of the machine learning procedure, such as the number of neurons and the width of the network, has also been assessed. Furthermore, the uncertainty of the incorporated stochastic approach and the effect of different arrangements of inputs and outputs on the performance metrics have been evaluated. The proposed model achieves a root mean square error ranging from 2.37 kW to 3.43 kW on the real-data dataset.
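How a current sky image and the panel's historical output might be paired into training samples can be sketched as below. The window length, the forecast horizon, and the dictionary layout are assumptions for illustration; the abstract does not specify the input and output arrangement, and the CNN itself is omitted. The name `build_samples` is hypothetical.

```python
# Sketch only: window length, horizon, and sample layout are assumptions;
# the abstract evaluates several input/output arrangements without detailing them.

def build_samples(images, power, history=3, horizon=1):
    """Pair the sky image at time t with the last `history` power readings
    and the power `horizon` steps ahead as the forecast target."""
    samples = []
    for t in range(history - 1, len(power) - horizon):
        samples.append({
            "image": images[t],
            "history": power[t - history + 1 : t + 1],
            "target": power[t + horizon],
        })
    return samples


# Placeholder image identifiers and PV power readings in kW.
images = ["img0", "img1", "img2", "img3", "img4"]
power = [10.0, 12.0, 11.0, 9.0, 8.0]
samples = build_samples(images, power)
```

A model would then encode the image with a CNN and combine the result with the history vector before predicting the target.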
ISBN: (print) 9798350325621
This study presents a vision-based method to predict the moisture ratio of a kiwifruit slice in a dryer. First, an automated image processing workflow was used to extract colour and morphology features from drying kiwifruit slices. These features, along with pretreatment methods, slice thickness, and drying temperature, were then modelled using random forest regression. The model performed well, with an average Mean Absolute Error (MAE) of 0.0056 and an average Root Mean Square Error (RMSE) of 0.0312. The R² values remained consistently high across all folds, averaging 0.9879, indicating that the model explains a substantial proportion of the variance in the data. Lastly, this study introduces a model-based drying control system in which the moisture ratio prediction method plays a pivotal role. This system eliminates the need for strict control over parameters such as fruit type, slicing state, or drying temperature by adhering to an automated optimal drying profile, thus reducing ownership costs and enhancing yield rates.
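For reference, the three reported metrics can be computed as in the following sketch, which uses their standard definitions. The sample values are made up for illustration and are not data from the study.

```python
import math

# Standard metric definitions; the example values below are illustrative only.

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted moisture ratios.
y_true = [1.0, 0.8, 0.5, 0.2]
y_pred = [0.9, 0.8, 0.6, 0.2]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

In a cross-validated setting such as the one described, these functions would be evaluated per fold and then averaged.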
The deep integration of new-generation information technology and manufacturing is triggering far-reaching industrial changes. Machine vision inspection is widely used in large-scale repetitive industrial production p...