Deep supervised learning algorithms typically require a large volume of labeled data to achieve satisfactory performance. However, collecting and labeling such data can be expensive and time-consuming. Self-supervised learning (SSL), a subset of unsupervised learning, aims to learn discriminative features from unlabeled data without relying on human-annotated labels. SSL has garnered significant attention recently, leading to the development of numerous related algorithms. However, there is a dearth of comprehensive studies that elucidate the connections between different SSL variants and how they evolved. This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions. First, we provide a detailed introduction to the motivations behind most SSL algorithms and compare their commonalities and differences. Second, we explore representative applications of SSL in domains such as image processing, computer vision, and natural language processing. Lastly, we discuss the three primary trends observed in SSL research and highlight the open questions that remain.
Diabetic retinopathy is an eye disease that mainly affects diabetic patients. Patients who have had diabetes for a long time have a high chance of developing diabetic retinopathy (DR) as we...
Today, bionic models for vision applications are based on the general information pathways, structure and characteristics of the visual system, implemented in intelligent algorithms, mostly based on AI, to improve the resol...
Smart unmanned vending machines that use machine vision technology suffer from a sharp decrease in detection accuracy, caused by the incomplete image collection of items by a monocular camera in complex environments and by the lack of obvious features when items are densely stacked. In this article, a binocular camera system is designed to solve the distortion and coverage problems caused by a monocular camera. In addition, an image-stitching algorithm is developed to splice the images captured by the cameras, which relieves the computational burden that the binocular camera places on back-end recognition processing. A new neural network structure, YOLOv3-TinyE, is proposed based on the YOLOv3-tiny model. On a dataset of 21,000 images captured in real scenarios and containing 20 different types of beverages, the comparison experiments show that the YOLOv3-TinyE model achieves a mean average precision of 99.15%, that its inference speed is 2.91 times that of the YOLOv3 model, and that its detection accuracy based on binocular vision is higher than that based on monocular vision. The results suggest that the designed method achieves its goals in terms of inference speed and average precision and is able to satisfy the requirements of real-world applications.
Computer vision, driven by artificial intelligence, has become pervasive in diverse applications such as self-driving cars and law enforcement. However, the susceptibility of these systems to attacks has raised signif...
ISBN: (print) 9798350374407
The main purpose of this study is to explore the real-time, accurate, and markerless recognition of sports movements in recent years. By reviewing the relevant research on machine learning or deep learning for specific sports or target actions based on computer vision image data input, the aim is to provide references for the application of markerless motion capture technology in the field of sports motion recognition. The research employed a literature review methodology, conducting searches in six databases, namely Web of Science, PubMed, Scopus, Google Scholar, IEEE Xplore, and China National Knowledge Infrastructure (CNKI), covering publications from January 2000 to June 2020. Using Boolean operators on the retrieved literature, key information was extracted: first author and publication year, types and targets of motion, participant information, camera parameters, image feature extraction techniques, action recognition algorithms, evaluation methods for action recognition quality, training and validation methods for image data, and performance metrics for action recognition. After screening, a total of 23 articles were included in the study. The findings revealed that 39% of the studies utilized machine learning algorithms based on support vector machines, while 35% employed deep learning algorithms based on convolutional neural networks. Commonly used evaluation metrics for action recognition quality included classification accuracy, confusion matrix, and displacement error. The development of computer vision motion capture, models, and algorithms demonstrated promising applications in areas such as action technique recognition and sports performance analysis. Traditional machine learning algorithms like support vector machines and principal component analysis remain dominant in action recognition technology; however, in certain scenarios, the performance of deep learning algorithms surpassed that of traditional machine learning methods.
The push-relabel algorithm is recognized as one of the efficient algorithms in the field of graph cut and is widely applied in computer vision. While its pixel-level parallel implementations are prevalent, existing methods predominantly rely on checkerboard scheduling, which inherently restricts the neighborhood size to four. This limitation compromises both algorithm precision and efficiency, hindering real-time and high-precision applications. To address these issues, this article introduces a novel approach to accelerate push-relabel algorithm implementation on FPGA in a more universal and efficient manner, supporting variable-sized image block operations. First, by introducing the deferred update strategy, we realize the pulse-enhanced parallel push-relabel (PEPPR) algorithm to address data contention and conflict in parallel processing. Second, the simultaneous weighted push method is proposed, further enhancing parallel operations. Lastly, we introduce the efficient diffusion wave search (DWS) algorithm to expedite algorithm convergence and reduce redundancy. While achieving a modest 1.7× acceleration compared to state-of-the-art implementations, the proposed algorithm (PEPPR-DWS) successfully overcomes the inherent limitations of checkerboard scheduling in full pixel-level parallelism. In the test based on Middlebury benchmark v3, the proposed 8-neighborhood implementation reduces the error rate by over 1% compared to the typical 4-neighborhood implementation. It provides a versatile and efficient solution for high-precision and real-time applications, holding substantial potential for practical applications.
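For readers unfamiliar with the underlying algorithm, the following is a minimal sequential sketch of generic push-relabel maximum flow with FIFO vertex selection. It is not the PEPPR-DWS method from the abstract and involves no FPGA or pixel-level parallelism; the function name `push_relabel_max_flow` and the dense capacity-matrix representation are assumptions made for illustration only.

```python
from collections import deque


def push_relabel_max_flow(capacity, source, sink):
    """Generic FIFO push-relabel on a dense capacity matrix (illustrative only)."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]   # skew-symmetric net flow
    height = [0] * n
    excess = [0] * n
    height[source] = n                   # the source starts at height n
    active = deque()

    # Initial preflow: saturate every edge leaving the source.
    for v in range(n):
        if capacity[source][v] > 0:
            flow[source][v] = capacity[source][v]
            flow[v][source] = -capacity[source][v]
            excess[v] += capacity[source][v]
            if v != sink:
                active.append(v)

    while active:
        u = active.popleft()
        # Discharge u: push along admissible edges, relabel when stuck.
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                residual = capacity[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual)
                    flow[u][v] += delta
                    flow[v][u] -= delta
                    excess[u] -= delta
                    excess[v] += delta
                    # v has just become active if it had no excess before.
                    if v not in (source, sink) and excess[v] == delta:
                        active.append(v)
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbor.
                height[u] = 1 + min(
                    height[v] for v in range(n)
                    if capacity[u][v] - flow[u][v] > 0
                )

    return sum(flow[source][v] for v in range(n))
```

In a graph-cut setting each pixel would be a vertex connected to its 4 or 8 neighbors; this sketch leaves the neighborhood structure entirely to the caller's capacity matrix.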
Outdoor computer vision systems face significant challenges due to reduced visibility and severe color distortion in images captured in sand-dust-affected environments. This study aims to improve the visibility of sand-dust-degraded images. To achieve this goal, a novel method is proposed that combines two color models to remove the sand-dust color cast and enhance image clarity. In the initial phase, sand-dust removal is achieved using a novel intensity-corrected blue channel compensation along with white balancing for color adjustment, based on the Red-Green-Blue (RGB) color model. In the next phase, a novel edge-preserving contrast enhancement method is applied to improve the visibility under sand-dust conditions. This method consists of CLAHE, a Gaussian blur filter, a Laplace filter, and the sigmoid function. Using the Hue-Saturation-Value (HSV) color model, CLAHE is applied for contrast enhancement; the Gaussian blur filter removes high-frequency noise, and the Laplace filter enhances edges, all targeting the V (value) channel to refine image details, while the sigmoid function adjusts saturation in the Saturation (S) channel, ensuring natural color balance and improved feature visibility. In-depth qualitative and quantitative evaluations are conducted on images with varying levels of sand-dust intensity (weak, moderate, strong, extreme). The proposed method shows superior performance in Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and processing speed, while significantly reducing computational complexity. Compared to the state-of-the-art CNN and all previous methods, our proposed method is efficient for real-time applications with minimal hardware requirements, making it ideal for embedded vision systems. Furthermore, a novel Energy Efficiency Index (EEI) is used to assess computational cost-effectiveness. The ev
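As a rough illustration of the last step described in this abstract, the sketch below remaps the saturation of a single RGB pixel through a sigmoid in HSV space while leaving hue and value untouched. The abstract gives no formula, so the function name `sigmoid_saturation`, the `gain` and `midpoint` parameters, and the rescaling that keeps gray pixels gray are all assumptions; this is not the authors' implementation.

```python
import colorsys
import math


def _sigmoid(x, gain, midpoint):
    """Logistic curve centered at `midpoint` with slope set by `gain`."""
    return 1.0 / (1.0 + math.exp(-gain * (x - midpoint)))


def sigmoid_saturation(rgb, gain=8.0, midpoint=0.5):
    """Remap the S channel of one RGB pixel (floats in [0, 1]) via a sigmoid.

    Hypothetical parameters: `gain` controls curve steepness and `midpoint`
    is the saturation level that the curve is centered on.
    """
    r, g, b = rgb
    h, s, v = colorsys.rgb_to_hsv(r, g, b)

    # Rescale the sigmoid so that s=0 maps to 0 and s=1 maps to 1;
    # gray pixels stay gray and fully saturated pixels stay saturated.
    low = _sigmoid(0.0, gain, midpoint)
    high = _sigmoid(1.0, gain, midpoint)
    s_new = (_sigmoid(s, gain, midpoint) - low) / (high - low)

    # Hue and value are unchanged; only saturation is adjusted.
    return colorsys.hsv_to_rgb(h, s_new, v)
```

With these assumed defaults, saturation below the midpoint is pushed lower and saturation above it is pushed higher; a real pipeline would tune both parameters and apply the mapping to whole image arrays rather than pixel by pixel.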
Introduction: In recent years, various deep learning algorithms have exhibited remarkable performance in data-rich applications such as health care, medical imaging, and computer vision. COVID-19, which is caused by a rapidly spreading virus, has affected people of all ages both socially and economically. Early detection of this virus is therefore important in order to prevent its further spread. Methods: The COVID-19 crisis has also galvanized researchers to adopt various machine learning and deep learning techniques to combat the pandemic. Lung images can be used in the diagnosis of COVID-19. Results: In this paper, we have analysed COVID-19 chest CT image classification efficiency using a multilayer perceptron with different imaging filters, such as the edge histogram filter, colour histogram equalization filter, color-layout filter, and Gabor filter, in the WEKA environment. Conclusion: The performance of CT image classification has also been compared comprehensively with the deep learning classifier Dl4jMlp. It was observed that the multilayer perceptron with the edge histogram filter outperformed the other classifiers compared in this paper, with 89.6% of instances correctly classified.
Recent developments in image analysis and interpretation using computer vision techniques have shown potential for novel applications in microbiology laboratories to support the task of automation, aiming for faster a...