ISBN: (print) 9781510667877; 9781510667884
We present a new haze removal algorithm based on attention map-guided multi-scale image processing. The proposed method applies frequency-domain coefficient correction to a set of images and then fuses them using the Laplacian pyramid. A new stage is introduced to obtain a local-global estimate of high-contrast images, which is also used in the attention map-guided fusion model. The algorithm consists of the following steps: gamma correction with different gamma parameters; weight map calculation by multiplying the saturation, contrast, and attention for each image; decomposition of the weight map into a Gaussian pyramid; 3-D block-rooting enhancement; decomposition of the images after 3-D block-rooting and gamma correction into the Laplacian pyramid; merging by multiplying the multi-scale images and weights. Experimental results on the D-HAZE dataset confirm the high efficiency of the proposed enhancement method compared to state-of-the-art techniques for industrial inspection systems.
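The pyramid-fusion steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-channel image in [0, 1], uses 2x2 averaging and nearest-neighbour upsampling as stand-ins for Gaussian filtering, replaces the saturation, contrast, and attention product with a contrast-only weight, and omits the 3-D block-rooting stage entirely. All function names are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for blur + decimate)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    c = img[:h, :w]
    return 0.25 * (c[0::2, 0::2] + c[1::2, 0::2] + c[0::2, 1::2] + c[1::2, 1::2])

def upsample(img, shape):
    """Nearest-neighbour upsampling, edge-padded to the target (height, width)."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    pad = ((0, shape[0] - up.shape[0]), (0, shape[1] - up.shape[1]))
    return np.pad(up, pad, mode="edge")

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    g = gaussian_pyramid(img, levels)
    # Each band holds the detail lost when moving to the next coarser level.
    lap = [g[i] - upsample(g[i + 1], g[i].shape) for i in range(levels - 1)]
    lap.append(g[-1])  # coarsest level keeps the low-pass residual
    return lap

def reconstruct(lap):
    out = lap[-1]
    for band in reversed(lap[:-1]):
        out = band + upsample(out, band.shape)
    return out

def contrast_weight(img):
    """Assumed stand-in weight: deviation from the 4-neighbour mean."""
    nb = 0.25 * (np.roll(img, 1, 0) + np.roll(img, -1, 0)
                 + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return np.abs(img - nb) + 1e-6

def fuse(hazy, gammas, levels=3):
    """Gamma-correct, weight, and merge the results across pyramid levels."""
    images = [np.power(hazy, g) for g in gammas]
    weights = np.stack([contrast_weight(im) for im in images])
    weights /= weights.sum(axis=0, keepdims=True)  # weights sum to 1 per pixel
    fused = None
    for im, w in zip(images, weights):
        bands = [l * g for l, g in zip(laplacian_pyramid(im, levels),
                                       gaussian_pyramid(w, levels))]
        fused = bands if fused is None else [f + b for f, b in zip(fused, bands)]
    return reconstruct(fused)
```

Because the weights are normalized before the Gaussian decomposition, they still sum to one at every pyramid level, so fusing identical inputs returns the input unchanged. A call such as `fuse(img, [0.5, 1.0, 2.0])` merges three gamma-corrected versions of `img`.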
Depth estimation and 3D object detection are critical for autonomous systems to gain context of their surroundings. In recent times, compute capacity has improved tremendously, enabling computer vision and AI on the e...
ISBN: (print) 9783031299698
The proceedings contain 9 papers. The special focus in this conference is on Design and Architectures for Signal and Image Processing. The topics include: Brain Blood Vessel Segmentation in Hyperspectral Images Through Linear Operators; Neural Network Predictor for Fast Channel Change on DVB Set-Top-Boxes; AINoC: New Interconnect for Future Deep Neural Network Accelerators; Real-Time FPGA Implementation of the Semi-global Matching Stereo Vision Algorithm for a 4K/UHD Video Stream; TaPaFuzz - An FPGA-Accelerated Framework for RISC-V IoT Graybox Fuzzing; Adaptive Inference for FPGA-Based 5G Automatic Modulation Classification; High-Level Online Power Monitoring of FPGA IP Based on Machine Learning.
Digital watermarking is a widely used technique for embedding information into digital media to protect intellectual property rights. However, digital watermarks are vulnerable to various types of malicious attacks. I...
ISBN: (print) 9781713872344
Different aspects of froth flotation have received varying levels of interest from the automation community over the past 30 years. Model-based level stabilisation and mass-pull-based grade control strategies continue to deliver significant benefit to industry. However, industry seems slow to adopt image processing applications and more comprehensive flotation models in industrial Advanced Process Control (APC) applications, despite the benefits reported in the literature. In this paper, an industrial flotation control system that includes basic visual froth imaging functionality is presented as a case study to highlight some of the challenges experienced and to identify reasons why integrated industrial APC implementations including advanced models and machine learning components remain scarce. Copyright (c) 2023 The Authors.
Human action recognition has become one of the main topics in the computer vision field due to its high demand and competitiveness in real-world applications. The main goals of human action recognition are to improve classification accuracy and reduce computational complexity. Previous studies have mainly used two approaches: hand-crafted feature extraction and deep learning. The hand-crafted approach is simple, which gives it an advantage in computational complexity, but its accuracy is low. Conversely, the deep learning approach achieves high accuracy even for complex datasets, but it suffers from high computational complexity and long training times because it must process huge datasets during training. Other approaches use pre-trained deep learning networks to fuse both methods. In this paper, we introduce a combination of pre-trained convolutional neural networks (CNNs) to extract features, an improved Fisher vector (iFV) codebook, and an optimized support vector machine (SVM) to achieve improved human action recognition. We leveraged three pre-trained CNNs, namely Inception-ResNet-v2, NASNet-Large, and Xception, to extract the features. Then, we applied the improved Fisher vector codebook to encode them. We subsequently trained an SVM on the codebook for classification and readjusted the SVM weights using five different optimization techniques: SGD, Adadelta, ADAM, Adamax, and Nadam. To evaluate the performance, we used the UCF101 and HMDB51 datasets. The results demonstrate that the accuracy and computational complexity of our approach are comparable to state-of-the-art techniques.
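The encoding step can be sketched as follows. The abstract does not give the codebook details, so this assumes the standard improved Fisher vector formulation: gradients with respect to the means and standard deviations of a diagonal-covariance Gaussian mixture model, followed by signed square-root (power) normalization and L2 normalization. The GMM parameters here are placeholders; in practice they would be fitted to the CNN features beforehand. The function name `fisher_vector` is hypothetical.

```python
import numpy as np

def fisher_vector(X, weights, means, sigmas):
    """Improved Fisher vector of descriptors X under a diagonal GMM.

    X: (N, D) descriptors; weights: (K,) mixture weights;
    means, sigmas: (K, D) component means and standard deviations.
    Returns a vector of length 2 * K * D.
    """
    N = X.shape[0]
    diff = (X[:, None, :] - means[None]) / sigmas[None]        # (N, K, D)
    # Log of weight * Gaussian density; the shared (2*pi) constant cancels
    # in the posterior and is therefore dropped.
    log_prob = (np.log(weights)[None]
                - 0.5 * np.sum(diff ** 2, axis=2)
                - np.sum(np.log(sigmas), axis=1)[None])
    log_prob -= log_prob.max(axis=1, keepdims=True)            # stability
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)                  # (N, K) posteriors

    # Normalized gradients with respect to means and standard deviations.
    g_mu = np.einsum("nk,nkd->kd", gamma, diff)
    g_mu /= N * np.sqrt(weights)[:, None]
    g_sigma = np.einsum("nk,nkd->kd", gamma, diff ** 2 - 1.0)
    g_sigma /= N * np.sqrt(2.0 * weights)[:, None]

    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))                     # power normalization
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv                       # L2 normalization
```

The two normalization steps at the end are what distinguish the improved variant from the plain Fisher vector. The resulting fixed-length vector is what a linear classifier such as an SVM would consume.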
ISBN: (print) 9798350361155
Distinguishing normal from cancerous cells in the brain with commonly used magnetic resonance imaging (MRI) approaches is difficult, and this is one of the major obstacles to diagnostic precision. This work presents a new MRI image processing technology comprising original software that combines complex algorithms with trained machine learning models to improve image quality. The approach is applied to a carefully calibrated computer vision dataset of brain scans, and its performance is compared with traditional MRI methods using multiple metrics, including classification accuracy, sensitivity, specificity, and ROC area. The paper builds on a detailed understanding of the various cell types revealed by the new MRI method, which provides better contrast than older techniques. A comparison of tumour detection and recognition quality across these modalities shows that the higher-resolution modality detects tumours earlier and more reliably. Such improvements in MRI technology will enable surgeons to identify tumour growth at an early stage, laying the groundwork for personalized treatment plans and benefiting patients' lives. As a result, the quality of MR imaging has improved, and new alliances are arising between major imaging companies and machine learning technologies. The approach can be thought of as erasing the border between diagnostics and imaging while complying with medical regulations. The paper therefore argues that current practice should be improved and that advanced, proven diagnostic systems should be developed. The daily clinical application of these advanced MRIs may well be the beginning of a new era in diagnostic oncology, which will be a very important way forward in improving treatment combined wi
In recent years, the growth of large-scale datasets has significantly propelled the progress of deep learning applications. Yet annotating these datasets remains a labor-intensive endeavor, pushing reliance toward cost-effective but less specialized data collection methods and internet data sources. This often results in noisy and inaccurate labels, compromising data quality. Traditional machine learning models assume clean data, but real-world datasets often exhibit significant label noise. This paper examines the impact of such noise on object detection performance, a pivotal aspect of computer vision. We analyze the influence of noisy labels using three renowned object detection frameworks (YOLOv5, Faster R-CNN, and the recent YOLOv8) alongside established datasets (MS COCO, VOC, and ExDARK). Additionally, experiments with the UVM dataset explore domain-specific tasks in dense object scenarios. Two new metrics, Model Health and Detection Capability, are introduced to evaluate the results. Findings indicate that models maintain over 80% of their health (that is, less than a 20% decline in mAP from the baseline) with up to 40% label corruption. However, Detection Capability deteriorates more sharply under the same conditions. The research also employs the D-RISE method for model explainability, highlighting crucial image regions affecting detection outcomes. Despite the noise, critical detection areas in models remain similar to those in clean data up to the 40% corruption level, as verified by similarity metrics. This study underscores the resilience of object detection models to label noise and provides insights into maintaining performance amidst data quality challenges.
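A label-corruption experiment of this kind might be set up as in the sketch below. The abstract does not specify the noise model or give a formula for Model Health, so both are assumptions: class labels are flipped uniformly at random to a different class, and health is taken as the ratio of noisy-label mAP to baseline mAP, which is one reading of the parenthetical above. The names `corrupt_labels` and `model_health` are hypothetical.

```python
import numpy as np

def corrupt_labels(labels, num_classes, rate, seed=0):
    """Flip a fraction `rate` of class labels to a different random class."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    n_flip = int(round(rate * noisy.size))
    idx = rng.choice(noisy.size, size=n_flip, replace=False)
    # An offset in [1, num_classes - 1] guarantees the new class differs.
    offsets = rng.integers(1, num_classes, size=n_flip)
    noisy[idx] = (noisy[idx] + offsets) % num_classes
    return noisy

def model_health(map_noisy, map_baseline):
    """Assumed definition: fraction of baseline mAP retained under noise."""
    return map_noisy / map_baseline
```

Under this assumed definition, a detector whose mAP falls from a hypothetical 0.50 on clean labels to 0.40 on corrupted labels would have a health of 0.8. Bounding-box noise, which a full object detection study would also need to consider, is not modelled here.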
Vision Language Models (VLMs) are rapidly advancing in their capability to answer information-seeking questions. As these models are widely deployed in consumer applications, they could lead to new privacy risks due t...
ISBN: (print) 9781510661561; 9781510661578
The detection and recognition of targets within imagery and video analysis is vital for military and commercial applications. The development of infrared sensor devices for tactical aviation systems imagery has increased the performance of target detection. Due to advances in infrared sensor capabilities, their use in field operations such as visual operations (visops) or reconnaissance missions across a variety of operational environments has become paramount. Many of the techniques implemented stretch back to 1970 but were limited by the available computational power. The AI industry has recently been able to bridge the gap between traditional signal processing tools and machine learning. Current state-of-the-art target detection and recognition algorithms are too bloated to be applied to on-ground or aerial reconnaissance missions. Therefore, this paper proposes Edge IR Vision Transformer (EIR-ViT), a novel, lightweight algorithm for automatic target detection in infrared images that operates on the edge for easier deployability.