ISBN: (print) 9798350344868; 9798350344851
This paper addresses two key limitations in existing image signal processing (ISP) approaches: suboptimal performance in low-light conditions and the lack of trainability in traditional ISP methods. To tackle these issues, we propose a novel, trainable ISP framework that combines the strengths of traditional ISP techniques with advanced Multi-Scale Retinex (MSR) algorithms for night-time enhancement. Our method consists of three primary components: an ISP-based Luminance Harmonization layer to initially optimize luminance levels in RAW data, a deep learning-based MSR layer for nuanced decomposition of image components, and a specialized enhancement layer for both precise, region-specific luminance enhancement and color denoising. The proposed approach is validated through rigorous experiments on machine vision benchmarks and objective visual quality indicators. Our results demonstrate not only a significant improvement over existing methods but also robust adaptability under diverse lighting conditions. This work offers a versatile ISP framework with promising applications beyond its immediate scope.
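For readers unfamiliar with Multi-Scale Retinex, the following is a minimal sketch of the classical, non-learned formulation: the output is the average over several Gaussian scales of log(image) minus log(blurred image). It is not the trainable MSR layer described in the abstract. The function names, the choice of scales, and the `eps` offset are illustrative assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel with radius of about 3*sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(row, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    n = len(row)
    return [
        sum(k * row[min(max(i + j - radius, 0), n - 1)]
            for j, k in enumerate(kernel))
        for i in range(n)
    ]

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows = [blur_1d(r, kernel) for r in image]
    cols = [blur_1d(list(c), kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def multi_scale_retinex(image, sigmas=(1.0, 2.0, 4.0), eps=1.0):
    """Classical MSR on a 2-D list of non-negative intensities.

    The scales in `sigmas` and the offset `eps` (which avoids log(0))
    are assumed values chosen for illustration only.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for sigma in sigmas:
        blurred = gaussian_blur(image, sigma)
        for y in range(h):
            for x in range(w):
                diff = math.log(image[y][x] + eps) - math.log(blurred[y][x] + eps)
                out[y][x] += diff / len(sigmas)
    return out
```

A uniform image maps to approximately zero everywhere, while a pixel brighter than its surroundings receives a positive response. This illustrates how Retinex separates local reflectance from the slowly varying illumination estimate.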
ISBN: (print) 9798350377040; 9798350377033
With the continuous progress of image processing and machine vision technology, the demand for efficient, real-time processing is becoming increasingly prominent, especially in the field of high-noise image processing. In this study, an adaptive Gaussian filtering algorithm is proposed and implemented on an FPGA, with the aim of improving the computational efficiency and real-time performance of the image processing system. Compared with the traditional fixed-weight filter, this algorithm dynamically adjusts the filtering parameters according to different noise environments, effectively balancing noise suppression and image detail retention. We coded the algorithm in the Verilog hardware description language and verified it on the PYNQ-Z2 FPGA platform. The experimental results show that the adaptive algorithm outperforms the fixed-weight filtering method, especially in noise suppression and detail preservation. Meanwhile, the FPGA implementation reduces filtering delay and optimizes resource consumption, making it well suited for real-time applications. This study demonstrates the promise of FPGA adaptive filtering for applications in medical imaging, remote sensing, and intelligent surveillance, which have stringent requirements for high-performance and high-efficiency processing. This research provides new hardware solutions for real-time, high-quality image processing in constrained environments.
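The idea of adjusting filter parameters to the noise level can be illustrated in software. The abstract does not specify how the noise is estimated or how the parameters are chosen, so the sketch below makes its own assumptions: a crude noise estimate (mean absolute difference between horizontal neighbours, which also responds to edges), a linear mapping from that estimate to sigma, and a 3x3 window. All names and constants are hypothetical, and this is Python rather than the authors' Verilog.

```python
import math

def estimate_noise(image):
    """Crude noise estimate: mean absolute difference of horizontal neighbours.

    Assumes every row has at least two pixels.
    """
    diffs = [abs(row[x + 1] - row[x])
             for row in image for x in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def select_sigma(noise, low=0.5, high=2.0, noise_max=50.0):
    """Map the noise estimate linearly onto [low, high] (assumed constants)."""
    t = min(max(noise / noise_max, 0.0), 1.0)
    return low + t * (high - low)

def kernel_3x3(sigma):
    """Build a normalized 3x3 Gaussian kernel for the given sigma."""
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in (-1, 0, 1)] for dy in (-1, 0, 1)]
    total = sum(sum(r) for r in raw)
    return [[v / total for v in r] for r in raw]

def adaptive_gaussian_filter(image):
    """Filter a 2-D list of intensities with a noise-dependent 3x3 kernel.

    Returns the filtered image and the sigma that was selected.
    """
    sigma = select_sigma(estimate_noise(image))
    kernel = kernel_3x3(sigma)
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp at the borders
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc
    return out, sigma
```

With these assumed constants, a clean image selects the smallest sigma and is left nearly unchanged, whereas a strongly fluctuating image selects the largest sigma and is smoothed more heavily. A fixed-weight filter, by contrast, would apply the same kernel in both cases.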
To meet the needs of teaching and practical applications in machine vision technology, a virtual reality-based machine vision experimental platform has been designed and developed. Unity3D was utilized as the developm...
In agricultural applications, the utilization of image processing with machine learning, particularly for fruit classification, has become increasingly prevalent. This study focuses on the automated classification of various Indian mango varieties, employing the deep features of MobileNet-v2 and ShuffleNet, integrated with diverse machine learning classifiers. The research is anchored on an extensive dataset, encompassing 15 distinct Indian mango varieties, meticulously collated from various vegetable markets across India. This dataset is accessible at "Sethy, Prabira Kumar; Behera, Santi; Pandey, Chanki (2023), 'Mango Variety', Mendeley Data, V2, doi: 10.17632/tk6d98f87d.2". A comprehensive comparison of various machine learning classifiers highlighted the dominance of the Cubic Support Vector Machine (SVM) when integrated with deep features extracted from MobileNet-v2. This pairing resulted in an outstanding classification accuracy of 99.5% and an Area Under the Curve (AUC) of 1, demonstrating exceptional performance in identifying fruit varieties. The significance of this research lies in its potential to revolutionize fruit classification processes in supermarkets and related sectors. By demonstrating the feasibility of applying advanced computer vision technology for the accurate classification of fruits, this study lays the groundwork for future exploration into the scalability, robustness, and wider applicability of these methods, potentially extending beyond mangoes to other fruit varieties. Such advancements could substantially benefit the agricultural industry, enhancing efficiency in both production and retail sectors.
Feature compression has attracted much attention in recent years due to its promising applications in scenarios where features are transmitted and analyzed by machine vision. However, existing research mainly focuses on coarse-grained features extracted from recognition tasks such as classification and detection, neglecting fine-grained features extracted from identification tasks. In this paper, we make a pioneering attempt to study fine-grained feature compression in the context of identification tasks. Our main focus is on the distortion metric, given its critical importance in optimizing the performance of a compression network. We begin by reviewing the instance-level metrics in the existing literature, highlighting their oversight of inter-feature relationships. These relationships are especially important for identification tasks, which involve similarity comparison among different identities. To address this problem, we propose to consider inter-feature relationships from the perspective of identity information. Specifically, we propose an identity-level metric that incorporates both intra-identity similarity and inter-identity discriminability. The intra-identity similarity constraint aims to cluster features from the same identity, while the inter-identity discriminability constraint ensures that features from different identities deviate from each other. We implement the identity-level metric on four different feature compression networks designed based on feature characteristics. Experimental results show the effectiveness of the proposed identity-level metric on person re-identification and face verification tasks.
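The two constraints described above can be sketched as a simple pairwise score. The abstract does not give the actual formula, so the following is an illustrative assumption rather than the authors' metric: same-identity pairs are penalized by their cosine distance, and different-identity pairs are penalized when their cosine similarity exceeds a hypothetical `margin`. Feature vectors are assumed to be non-zero.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identity_level_distortion(features, identities, margin=0.5):
    """Illustrative identity-level score for a batch of decoded features.

    intra term: mean cosine distance between features of the same identity.
    inter term: mean hinge penalty when features of different identities
                are more similar than `margin` (an assumed hyperparameter).
    """
    intra, inter = [], []
    pairs = combinations(zip(features, identities), 2)
    for (feat_a, id_a), (feat_b, id_b) in pairs:
        sim = cosine(feat_a, feat_b)
        if id_a == id_b:
            intra.append(1.0 - sim)
        else:
            inter.append(max(0.0, sim - margin))
    intra_term = sum(intra) / len(intra) if intra else 0.0
    inter_term = sum(inter) / len(inter) if inter else 0.0
    return intra_term + inter_term
```

Under this sketch, features that stay clustered by identity score zero, while features that collapse onto a single direction after compression are penalized through the inter-identity term. An instance-level metric that compares each feature only with its own original would not capture this distinction.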
ISBN: (print) 9781510674219; 9781510674202
Deep learning (DL)-based approaches, such as Convolutional Neural Nets (CNNs), have shown high performance in classifying the contents of images. CNNs, however, have the notable drawbacks of potentially high computing costs, poor explainability, and wide performance variance if the underlying imagery data deviates from the training baseline. As advanced image processing capabilities mature, the on-board detection of objects in space-based imagery is increasingly proposed. On-board satellite processing applications, which may be resource-limited, can drive the need for simpler models that reduce the computing burden for edge computing applications. This raises the question of how well classic computer vision techniques can compete with more modern approaches. This paper characterizes and compares the performance of multiple computer vision models for the application of distinguishing maritime vessels from typical clutter in commercial electro-optical (EO) satellite imagery. A Support Vector Machine (SVM) model using manually curated features is compared to multiple DL-based models spanning a range of model sizes, with the goal of determining whether classical approaches can compete favorably with DL when computational resources are taken into consideration. Differences in performance and processing resources are characterized between the approaches. Findings include that the SVM-based model may approach the accuracy of some of the smaller CNN-based models when classifying images of clouds in satellite EO imagery. However, even the smallest DL-based models, which take about the same computational resources as the SVM-based model, generally outperform the SVM-based model. This finding may have implications for the operational use of on-board processing techniques for satellite payloads.
Approximate computing (AC) leverages inherent error resilience and is used in many big-data applications from various domains, such as multimedia, computer vision, signal processing, and machine learning, to improve system performance and power consumption. Like many other approximate circuits and algorithms, the memory subsystem can also be used to enhance performance and save power significantly. This paper proposes an efficient and effective systematic methodology to construct an approximate nonvolatile magneto-resistive RAM (MRAM) framework using consumer-off-the-shelf (COTS) MRAM chips. In the proposed scheme, an extensive experimental characterization of memory errors is performed by manipulating the write latency of MRAM chips, which exploits the inherent (intrinsic/extrinsic process variation) stochastic switching behavior of magnetic tunnel junctions (MTJs). The experimental results, involving error-resilient image compression and machine learning applications, reveal that the proposed AC framework provides a significant performance improvement and reduces MRAM write energy by approximately 47.5% on average with negligible or no loss in output quality.
Complexity intensifies when gesticulations span various scales. Traditional scale-invariant object recognition methods often falter when confronted with case-sensitive characters in the English alphabet. The literature underscores a notable gap: the absence of an open-source multi-scale un-instructional gesture database featuring a comprehensive dictionary. In response, we have created the NITS (gesture scale) database, which encompasses isolated mid-air gesticulations of ninety-five alphanumeric characters. In this research, we present a scale-centric framework that addresses three critical aspects: (1) Detection of smaller gesture objects: our framework excels at detecting smaller gesture objects, such as a red color marker. (2) Removal of redundant self co-articulated strokes: we propose an effective approach to eliminate redundant self co-articulated strokes often present in gesture trajectories. (3) Scale-variant approach for recognition: to tackle the scale vs. size ambiguity in recognition, we introduce a novel scale-variant methodology. Our experimental results reveal a substantial improvement of approximately 16% compared to existing state-of-the-art recognition models for mid-air gesture recognition. These outcomes demonstrate that our proposed approach successfully emulates the perceptibility found in the human visual system, even when utilizing data from monophthalmic vision. Furthermore, our findings underscore the imperative need for comprehensive studies encompassing scale variations in gesture recognition.
With the development of vision technology, image set classification (ISC) has flourished in the image processing field. Unlike one-shot image classification, ISC focuses on the set rather than a single image. Hence, ISC can synthesize the abundant set information to alleviate various appearance variations. Despite the great success of existing ISC methods, some problems remain: (1) they usually incur high time complexity, which directly limits practical application; (2) they largely ignore the intrinsic relationships between different sets. In light of this, we propose a novel Discrete Aggregation Hashing (DAH) method for fast ISC. Specifically, to extract more semantic information from each set and each sample, we adopt the same projection standard to embed dual semantic labels (i.e., sample label and set label) into instance and set hash codes. We then regard set hash codes as set-specific centers. A hashing aggregation strategy is proposed to learn compact discriminative instance hash codes by iteratively aggregating intrinsic neighborhood representations around each central node. As a result, instance hash codes obtain greater intra-set compactness and inter-set separability. Extensive experiments demonstrate that our DAH obtains promising performance and outperforms state-of-the-art ISC methods on four image set datasets.
Visual small target motion detection finds successful applications in varied scenarios. However, dim-light conditions, such as tunnel scenes and nighttime environments, present significant challenges to existing detection methods, which mainly operate within the spatiotemporal domain. This is because the transmission of small target motion information suffers from the inevitable interference of image noise caused by dim light in the spatiotemporal domain, which hampers the extraction of essential spatiotemporal features of the target motion. Given the significant obstacles posed by dim-light imaging to small target motion detection within the spatiotemporal domain, the exploration of an alternative observation domain for small target motion, alongside the development of a corresponding detection method, emerges as a viable solution. To address this, in this paper, we discovered the remarkable potential of the Haar frequency domain in characterizing small target motion in dim light. To investigate the advantages of integrating Haar frequency processing in small target motion detection, we introduce a Haar-windowed summation mechanism into an existing bio-inspired small target motion detection model. The proposed mechanism integrates visual information in spatiotemporal windows regulated by frequency parameters of Haar wavelets and effectively discriminates small target motion from the disturbance of random noise caused by dim light. Theoretical analysis and numerical experiments confirm the superior performance of integrating Haar frequency processing. This study provides a new view of small target motion detection through the lens of the frequency domain and extends the limits of existing bio-inspired models for practical applications in dim light.