Privacy is a crucial concern in collaborative machine vision, where part of a Deep Neural Network (DNN) model runs on the edge and the rest is executed on the cloud. In such applications, the machine vision model does not need the exact visual content to perform its task. Taking advantage of this, private information can be removed from the data insofar as doing so does not significantly impair the accuracy of the machine vision system. In this paper, we present an autoencoder-style network integrated within an object detection pipeline, which generates a latent representation of the input image that preserves task-relevant information while removing private information. Our approach employs an adversarial training strategy that not only removes private information from the bottleneck of the autoencoder but also promotes improved compression efficiency for feature channels coded by conventional codecs like VVC-Intra. We assess the proposed system using a realistic evaluation framework for privacy, directly measuring face and license plate recognition accuracy. Experimental results show that our proposed method reduces the bitrate significantly at the same object detection accuracy compared to coding the input images directly, while keeping face and license plate recognition accuracy on images recovered from the bottleneck features low, implying strong privacy protection. Our code is available at https://***/bardia-az/ppa-code.
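The adversarial training described above can be sketched as two competing objectives: the main network minimizes its task loss while maximizing the adversary's privacy-recovery loss (plus a rate penalty for compressibility), and the adversary separately minimizes its own loss. The function, weights, and names below are illustrative assumptions, not the paper's actual formulation.

```python
def privacy_training_losses(task_loss, adv_privacy_loss, rate_loss,
                            lam_priv=1.0, lam_rate=0.1):
    """Hypothetical loss split for an adversarial privacy setup.

    The main network minimizes the task loss, *maximizes* the
    adversary's privacy-recovery loss (hence the minus sign), and
    penalizes a rate term as a proxy for compressibility; the
    adversary minimizes its own loss. All weights are illustrative.
    """
    main_objective = task_loss - lam_priv * adv_privacy_loss + lam_rate * rate_loss
    adversary_objective = adv_privacy_loss
    return main_objective, adversary_objective

# toy scalar losses standing in for batch averages
main, adv = privacy_training_losses(task_loss=0.8,
                                    adv_privacy_loss=2.0,
                                    rate_loss=1.5)
```

In practice the two objectives would be optimized in alternation (or via gradient reversal on the bottleneck), so a high adversary loss indicates little recoverable private information.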
The specular reflection of objects is an important factor affecting image display quality, which poses challenges to tasks such as pattern recognition and machine vision detection. At present, specular removal for a single real image is a crucial pre-processing step to improve the performance of computer vision algorithms. Despite notable approaches tailored for handling synthesized and pre-simplified images with dark backgrounds, real-time separation of specular reflection for a single real image remains a challenging problem. This paper proposes a novel specular removal method to separate the specular reflection for a single real image accurately and efficiently based on the dark channel prior. Initially, a modified-specular-free (MSF) image is developed using the dark channel prior, which can derive a direct estimation of specular reflection. Next, the image chromaticity spaces are established to represent the pixel intensity. Then, the maximum chromaticity value of the MSF image is extracted to guide the filtering of the specular reflection, treating the specular pixels as noise in the chromaticity space. Finally, the image without specular reflection can be obtained using the restored maximum chromaticity value based on the dichromatic reflection model. A key strength of this method is that it achieves high-quality specular reflection separation quickly without destroying the geometric features of the real image. Compared with state-of-the-art methods, experimental results show that the proposed algorithm achieves the best subjective visual effect and satisfactory quantitative performance. In addition, this approach can be implemented efficiently to meet real-time requirements, promising to be applied to computer vision measurement and inspection applications.
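The dark channel prior step above can be sketched in a few lines: under the dichromatic model, purely diffuse pixels have a near-zero per-pixel channel minimum, so that minimum approximates the specular component. The sketch below is a minimal illustration of this idea, not the paper's exact MSF construction; the brightness offset is our own heuristic assumption.

```python
import numpy as np

def specular_free_estimate(img, offset=None):
    """Rough single-image specular estimate via the dark channel prior.

    `dark` is the per-pixel minimum over color channels, taken as a
    coarse estimate of the specular term; subtracting it and re-adding
    a constant `offset` yields a modified specular-free (MSF) style
    image with plausible overall brightness. Illustrative sketch only.
    """
    img = img.astype(np.float64)
    dark = img.min(axis=2, keepdims=True)   # per-pixel dark channel
    if offset is None:
        offset = dark.mean()                # heuristic brightness offset
    msf = img - dark + offset               # specular term removed
    return msf, dark

# toy 2x2 RGB image with one bright, near-white (specular-looking) pixel
img = np.array([[[200, 180, 170], [90, 60, 50]],
                [[250, 250, 245], [40, 30, 20]]], dtype=np.float64)
msf, dark = specular_free_estimate(img)
```

After the subtraction, every pixel's channel minimum equals the offset, which is exactly the "dark channel is flat" property the prior exploits.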
Optimizers play important roles in enhancing the performance of a deep network. A study on different optimizers is necessary to understand the effect of optimizers on the performance of the deep network for a given target task, such as image classification. Several attempts were made to investigate the effect of optimizers on the performance of CNNs. However, such experiments have not been carried out on vision transformers (ViT), despite the recent success of ViT in various image processing tasks. In this paper, we conduct exhaustive experiments with ViT using different optimizers. In our experiments, we found that weight decoupling and weight decay in optimizers play important roles in training ViT. We focused on the concept of weight decoupling and tried different variations of it to investigate to what extent weight decoupling is beneficial for a ViT. We propose two techniques that provide better results than weight-decoupled optimizers: (i) The weight decoupling step in optimizers involves a linear update of the parameter with weight decay as the scaling factor. We propose a quadratic update of the parameter which involves a linear as well as squared parameter update using the weight decay as the scaling factor. (ii) We propose using different weight decay values for different parameters depending on the gradient value of the loss function with respect to that parameter. A smaller weight decay is used for parameters with a higher gradient value and vice versa. Image classification experiments are conducted over CIFAR-100 and TinyImageNet datasets to observe the performance of these proposed methods with respect to state-of-the-art optimizers such as Adam, RAdam, and AdaBelief. The code is available at https://***/Hemanth-Boyapati/Adaptive-weight-decay-optimizers.
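The two modifications above can be sketched as tweaks to an AdamW-style decoupled step. The function name, the exact quadratic form wd*(p + p**2), and the 1/(1+|g|) scaling are our own illustrative assumptions; the paper's precise formulas may differ, and the Adam moment estimates are omitted for brevity.

```python
import numpy as np

def decoupled_step(p, g, lr=1e-3, wd=1e-2, quadratic=True, adaptive=True):
    """One illustrative decoupled-weight-decay step (AdamW-style shell).

    Sketches the abstract's two ideas under assumed formulas:
    (i) a quadratic decay term wd * (p + p**2) instead of wd * p;
    (ii) per-parameter decay that shrinks where |g| is large.
    `g` stands in for the full adaptive gradient update.
    """
    if adaptive:
        # smaller effective decay for parameters with larger gradients
        wd_eff = wd / (1.0 + np.abs(g))
    else:
        wd_eff = wd
    decay = p + p**2 if quadratic else p   # linear + squared parameter term
    return p - lr * g - lr * wd_eff * decay

p = np.array([0.5, -1.0])
g = np.array([0.1, 2.0])
p_new = decoupled_step(p, g)
```

Note the decoupling: the decay term is applied directly to the parameter, outside whatever adaptive rescaling the optimizer does to the gradient, which is what distinguishes AdamW-style decay from L2 regularization.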
The advancements in computer vision and image processing techniques have led to the emergence of new applications in the domains of visual surveillance, targeted advertisement, content-based searching, human-computer interaction, etc. Among the various techniques in computer vision, face analysis in particular has gained much attention. Several previous studies have tried to explore different applications of facial feature processing for a variety of tasks, including age and gender classification. However, despite several previous studies having explored the problem, the age and gender classification of in-the-wild human faces is still far from achieving the levels of accuracy required for real-world applications. This paper, therefore, attempts to bridge this gap by proposing a hybrid model that combines self-attention and BiLSTM approaches for age and gender classification problems. The proposed model's performance is compared with several state-of-the-art models proposed so far. An improvement of approximately 10% and 6% over the state-of-the-art implementations for age and gender classification, respectively, is noted for the proposed model. The proposed model thus achieves superior performance and provides more generalized learning. The model can, therefore, be applied as a core classification component in various image processing and computer vision problems.
Embedded computer vision systems are increasingly being adopted across various domains, playing a pivotal role in enabling advanced technologies such as autonomous vehicles and industrial automation. Their cost-effectiveness, compact size, and portability make them particularly well-suited for diverse implementations and operations. In real-time scenarios, these systems must process visual data with minimal latency, which is crucial for immediate decision-making. However, these solutions continue to face significant challenges related to computational efficiency, memory usage, and accuracy. This research addresses these challenges by enhancing classification methodologies, specifically in Gray Level Co-occurrence Matrix (GLCM) feature extraction and Support Vector Machine (SVM) classifiers. To maintain a high level of accuracy while preserving performance, a smaller feature set is selected following a comprehensive complexity analysis and is further refined through Correlation-based Feature Selection (CFS). The proposed method achieves an overall classification accuracy of 84.76% with a feature set reduced by 79.2%, resulting in a 72.45% decrease in processing time, a 50% reduction in storage requirements, and up to a 77.8% decrease in memory demand during prediction. These improvements demonstrate the effectiveness of the proposed approach in improving the adaptability and capabilities of embedded vision systems (EVS), optimizing their performance under the constraints of real-time, limited-resource environments.
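The GLCM features at the heart of this pipeline are simple to compute: count how often gray level i co-occurs with gray level j at a fixed pixel offset, normalize to a probability table, and reduce it to a handful of Haralick-style statistics. The sketch below shows a minimal version for one offset; a production system would use an optimized library routine and more offsets.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Gray Level Co-occurrence Matrix for a single (dy, dx) offset.

    Counts ordered pairs (gray level at p, gray level at p + offset),
    then normalizes to a joint probability table. Non-symmetric for
    simplicity.
    """
    h, w = img.shape
    m = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m / m.sum()

def glcm_features(m):
    """A reduced feature set: contrast, energy, homogeneity."""
    i, j = np.indices(m.shape)
    return {
        "contrast": float((m * (i - j) ** 2).sum()),
        "energy": float((m ** 2).sum()),
        "homogeneity": float((m / (1.0 + np.abs(i - j))).sum()),
    }

# tiny 4-level test image (horizontal offset of one pixel)
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]], dtype=np.int64)
feats = glcm_features(glcm(img, levels=4))
```

Reducing the full Haralick set to a few such statistics, then pruning further with CFS, is exactly the kind of feature-set shrinkage the abstract reports; the vector feeds straight into an SVM.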
X-ray imaging technology has been used for decades in clinical tasks to reveal the internal condition of different organs, and in recent years, it has become more common in other areas such as industry, security, and geography. The recent development of computer vision and machine learning techniques has also made it easier to automatically process X-ray images, and several machine learning-based object (anomaly) detection, classification, and segmentation methods have recently been employed in X-ray image analysis. Due to the high potential of deep learning in related image processing applications, it has been used in most of these studies. This survey reviews the recent research on using computer vision and machine learning for X-ray analysis in industrial production and security applications and covers the applications, techniques, evaluation metrics, datasets, and performance comparison of those techniques on publicly available datasets. We also highlight some drawbacks in the published research and give recommendations for future research in computer vision-based X-ray analysis.
The computer vision-based analysis of railway superstructure has gained significant attention in railway engineering. This approach utilises advanced image processing and machine learning techniques to extract valuable information from visual data captured in the railway track environment. By analysing images from various sources such as cameras, drones, or sensors, computer vision algorithms can accurately detect and classify different components of the ballast superstructure, including the catenary system support, rail surface and profile, fastening system, sleeper, and ballast layer. This enables the automated assessment of the railway track's condition, stability, and maintenance needs. This paper comprehensively reviews the recent advancements, challenges, and potential applications of computer vision techniques in analysing railway superstructure. It discusses various vision-based methodologies and machine-learning approaches utilised in this context. Furthermore, it examines the benefits and limitations of computer vision-based analysis and presents future research directions for improving its applicability in railway track engineering.
Industry 4.0 conceptualizes the automation of processes through the introduction of technologies such as artificial intelligence and advanced robotics, resulting in a significant production improvement. Detecting defects in the production process, predicting mechanical malfunctions in the assembly line, and identifying defects in the final product are just a few examples of applications of these technologies. In this context, this work focuses on the detection of ultrasound probes' surface defects, with a focus on Esaote S.p.A.'s production line probes. To date, this control is performed manually and is therefore biased by many factors such as surface morphology, color, size of the defect, and lighting conditions (which can cause reflections preventing detection). To overcome these shortfalls, this work proposes a fully automatic machine vision system for surface acquisition of ultrasound probes coupled with an automated defect detection system that leverages artificial intelligence. The paper addresses two crucial steps: (i) the development of the acquisition system (i.e., selection of the acquisition device, analysis of the illumination system, and design of the camera handling system); (ii) the analysis of neural network models for defect detection and classification by comparing three possible solutions (i.e., MMSD-Net, ResNet, EfficientNet). The results suggest that the developed system has the potential to be used as a defect detection tool in the production line (a full image acquisition cycle takes approximately 200 s), with the EfficientNet model achieving the best detection accuracy of 98.63% and a classification accuracy of 81.90%.
The AdaMax algorithm provides enhanced convergence properties for stochastic optimization problems. In this paper, we present a regret bound for the AdaMax algorithm, offering a tighter and more refined analysis compared to existing bounds. This theoretical advancement provides deeper insights into the optimization landscape of machine learning algorithms. As a practical application, the You Only Look Once (YOLO) framework has become well known as a highly effective object segmentation tool, largely due to its extraordinary accuracy in real-time processing, which makes it a preferred option for many computer vision applications. Finally, we apply the AdaMax algorithm to image segmentation within this framework.
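For reference, the AdaMax update that the regret analysis concerns is the infinity-norm variant of Adam: the second-moment estimate is replaced by an exponentially weighted running maximum of gradient magnitudes. A minimal single-step sketch (the small eps guard is our own addition for numerical safety):

```python
import numpy as np

def adamax_step(theta, g, m, u, t, alpha=0.002, b1=0.9, b2=0.999, eps=1e-8):
    """One AdaMax step: Adam's infinity-norm variant.

    m: biased first-moment estimate; u: exponentially weighted
    infinity norm of past gradients; t: 1-based step counter used
    for the first-moment bias correction.
    """
    m = b1 * m + (1 - b1) * g              # update biased first moment
    u = np.maximum(b2 * u, np.abs(g))      # running weighted inf-norm
    theta = theta - (alpha / (1 - b1 ** t)) * m / (u + eps)
    return theta, m, u

theta, m, u = np.zeros(2), np.zeros(2), np.zeros(2)
g = np.array([0.5, -0.25])
theta, m, u = adamax_step(theta, g, m, u, t=1)
```

Unlike Adam, the update magnitude per coordinate is bounded by alpha regardless of the gradient scale, which is one reason tighter regret bounds are tractable for this variant.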
Artificial neural networks have been one of science's most influential and essential branches in the past decades. Neural networks have found applications in various fields including medical and pharmaceutical services, voice and speech recognition, computer vision, natural language processing, and video and image processing. Neural networks have many layers and consume considerable energy. Approximate computing is a promising way to reduce energy consumption in applications that can tolerate a degree of accuracy reduction. This paper proposes an effective method to prevent accuracy reduction after using approximate computing methods in CNNs. The method exploits the k-means clustering algorithm to label pixels in the first convolutional layer. Then, using one of the existing pruning methods, different pruning amounts are applied to all layers. The experimental results on three CNNs and four different datasets show that the proposed method improves accuracy significantly (by 17%) compared to the baseline network.
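The pixel-labeling step above relies on plain k-means clustering. The sketch below shows the idea on scalar intensities; the function name and the simple Lloyd-iteration implementation are illustrative, and the paper's actual clustering of first-layer activations would operate on higher-dimensional data.

```python
import numpy as np

def kmeans_labels(pixels, k=2, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on scalar pixel intensities.

    Returns a cluster label per pixel plus the final cluster centers;
    sketches the labeling step that precedes per-layer pruning.
    """
    rng = np.random.default_rng(seed)
    centers = rng.choice(pixels, size=k, replace=False).astype(np.float64)
    for _ in range(iters):
        # assign each pixel to its nearest center
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # move each center to the mean of its assigned pixels
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean()
    return labels, centers

# two well-separated intensity groups
pixels = np.array([0.0, 0.1, 0.2, 0.9, 1.0, 1.1])
labels, centers = kmeans_labels(pixels)
```

On this toy input the algorithm converges to centers near 0.1 and 1.0, splitting the pixels into the obvious low- and high-intensity groups.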