This paper addresses two key limitations of existing image signal processing (ISP) approaches: suboptimal performance in low-light conditions and the lack of trainability in traditional ISP pipelines. To tackle these issues, we propose a novel, trainable ISP framework that combines the strengths of traditional ISP techniques with the Multi-Scale Retinex (MSR) algorithm for night-time enhancement. Our method consists of three primary components: an ISP-based Luminance Harmonization layer that initially optimizes luminance levels in RAW data, a deep-learning-based MSR layer for nuanced decomposition of image components, and a specialized enhancement layer for precise, region-specific luminance enhancement and color denoising. The proposed approach is validated through rigorous experiments on machine vision benchmarks and objective visual quality indicators. Our results demonstrate not only a significant improvement over existing methods but also robust adaptability under diverse lighting conditions. This work offers a versatile ISP framework with promising applications beyond its immediate scope.
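As a rough illustration of the MSR component mentioned in this abstract, the classic (non-learned) Multi-Scale Retinex can be sketched as follows. This is the standard Gaussian-surround formulation, not the paper's trainable layer; the scales, the epsilon, and the use of `scipy.ndimage.gaussian_filter` are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_retinex(img, sigmas=(15, 80, 250), eps=1e-6):
    """Classic Multi-Scale Retinex: average, over several Gaussian scales,
    of log(image) - log(blurred image). The blurred image estimates the
    illumination, so the result approximates the reflectance component."""
    img = img.astype(np.float64) + eps          # avoid log(0)
    out = np.zeros_like(img)
    for sigma in sigmas:
        surround = gaussian_filter(img, sigma=sigma) + eps
        out += np.log(img) - np.log(surround)
    return out / len(sigmas)
```

On a perfectly uniform image the surround equals the image, so the output is near zero; structure in the output therefore reflects local contrast rather than absolute brightness.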
ISBN (digital): 9781510645974
ISBN (print): 9781510645974; 9781510645967
Machine vision systems used in modern industrial complexes are based on the analysis of multi- and hyperspectral imaging. The transition to the "Industry 4.0" program is not possible using only one type of data. The first control systems used only visible-range images; they made it possible to analyze the movement trajectories of objects, control product quality, carry out security functions (control of perimeter crossing), etc. The development of new industrial robotic cells and processing complexes with cognitive functions implies the receipt, analysis, and processing of heterogeneous data. The construction of a unified information field, which allows performing multidimensional operations with data, increases the speed of decision-making and enables automated robot-human systems at the level of an assistant working in a unified workspace. Machine vision systems analyze information received in the visible range (shape, trajectory of movement, position of objects, etc.); the near-infrared range (data similar to the visible range, allowing operation in dusty, foggy, and low-light conditions); the far-infrared (thermal) range (plotting temperature gradients, identifying areas of overheating); the ultraviolet range (analysis of ionization sources, corona discharges, static charges, tags); the X-ray and microwave ranges (analysis of the surface and internal structure of objects, allowing the identification of defects); range and 3D sensors (construction of volumetric figures, analysis of the relative position of objects and their interaction); etc. Data analysis is often performed not by a single camera but by a group of sensors not located in a single housing. Primary data integration reduces the number of information channels while maintaining the functionality and accuracy of the analysis. The article discusses fusing images obtained by industrial sensors into a combined image containing joint data. Combining multi- and hyperspectral imaging makes i...
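A minimal sketch of the pixel-level fusion step this abstract describes, assuming the sensor images are already co-registered and the same size; real industrial pipelines would add registration and may fuse per-region or at the feature level:

```python
import numpy as np

def fuse_weighted(images, weights=None):
    """Pixel-level fusion of co-registered sensor images by weighted
    averaging -- the simplest primary-data-integration scheme. Weights
    default to uniform and are normalized to sum to one."""
    stack = np.stack([im.astype(np.float64) for im in images])
    if weights is None:
        weights = np.full(len(images), 1.0 / len(images))
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    # Contract the weight vector against the image stack axis.
    return np.tensordot(w, stack, axes=1)
```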
The traditional drop-weight impact velocity measurement is affected by the sensor's measurement distance and the measurement environment, making accuracy difficult to guarantee. To address this problem, this p...
ISBN (digital): 9798350362312
ISBN (print): 9798350362329
Local descriptor algorithms are foundational in computer vision applications such as image matching and image retrieval. Some local descriptor algorithms extract features containing similar information from images, while others extract complementary information. In this work, we investigate the advantages of combining a binary and a non-binary local descriptor algorithm. We propose and compare three methods to combine the SIFT and BRISK descriptor algorithms, selected because they produce complementary descriptor vectors. First, we propose combining SIFT and BRISK descriptors using a weighted summation of their individual descriptor distances with learned weights. Our second method converts SIFT into a binary descriptor and concatenates the binary SIFT vector with the BRISK descriptor. The third method scales the binary BRISK descriptor vector and concatenates it with the SIFT descriptor. Parameters for combining the descriptors are learned on the HPatches data set for each of the three methods. Our proposed methods increase the mean Average Precision by 3% to 15.8% over the original BRISK and by 5.9% to 21.8% over the original SIFT algorithm in various evaluation conditions.
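The first combination method (a weighted sum of the two descriptor distances) can be sketched as below; the weight `w` is learned on HPatches in the paper, whereas here it is a placeholder constant, and the descriptors are plain NumPy arrays rather than actual SIFT/BRISK outputs:

```python
import numpy as np

def combined_distance(sift_a, sift_b, brisk_a, brisk_b, w=0.6):
    """Weighted sum of a float-descriptor (SIFT-style, Euclidean) distance
    and a binary-descriptor (BRISK-style, Hamming) distance. w trades off
    the two contributions; in the paper it is learned, here illustrative."""
    d_sift = np.linalg.norm(np.asarray(sift_a, dtype=np.float64)
                            - np.asarray(sift_b, dtype=np.float64))
    d_brisk = np.count_nonzero(np.asarray(brisk_a) != np.asarray(brisk_b))
    return w * d_sift + (1.0 - w) * d_brisk
```

Matching then picks, for each query keypoint, the candidate minimizing this combined distance instead of either individual distance.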
Chromosome analysis and classification are essential in clinical applications to diagnose various structural and numerical abnormalities. Recently, karyotype analysis using intelligent image processing methods, especi...
Egocentric vision data captures the first-person perspective of a visual stimulus and helps study gaze behavior in more natural contexts. In this work, we propose a new dataset collected in a free-viewing style with an end-to-end data processing pipeline. A group of 25 participants provided their gaze information wearing Tobii Pro Glasses 2 at a museum. The gaze stream is post-processed to handle missing or incoherent information. The corresponding video stream is clipped into 20 videos corresponding to 20 museum exhibits and compensated for users' unwanted head movements. Based on the velocity of directional shifts of the eye, the I-VT algorithm classifies eye movements into either fixations or saccades. Representative scanpaths are built by generalizing multiple viewers' gazing styles for all exhibits. The dataset therefore contains both the individual gazing styles of many viewers and the generic trend they follow towards a museum exhibit. The application of our dataset is demonstrated by characterizing the inherent gaze dynamics using a state trajectory estimator based on ancestor sampling (STEAS) model to solve gaze data classification and retrieval problems. This dataset can also be used to address problems like segmentation and summarization using both conventional machine learning and deep learning approaches.
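The I-VT (velocity-threshold identification) step mentioned in this abstract can be sketched with a simple per-interval velocity test; the threshold and units below are illustrative, not the values used for this dataset:

```python
import numpy as np

def ivt_classify(x, y, t, velocity_threshold=100.0):
    """I-VT: compute point-to-point gaze velocity and label each interval
    as a fixation (below threshold) or a saccade (at/above threshold).
    x, y are gaze coordinates (e.g. degrees), t timestamps in seconds."""
    vx = np.diff(x) / np.diff(t)
    vy = np.diff(y) / np.diff(t)
    speed = np.hypot(vx, vy)          # magnitude of the gaze velocity
    return np.where(speed < velocity_threshold, "fixation", "saccade")
```

In a full pipeline, consecutive "fixation" intervals are then merged into fixation events with a centroid position and a duration.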
In this proposed approach to unobtrusive human activity classification, a two-stage machine learning-based algorithm was applied to backscattered ultrawideband radar signals. First, a preprocessing step was applied for noise and clutter suppression. Next, feature extraction over a combination of the time-frequency (TF) and time-range (TR) domains was used to extract features of human activities. Feature analysis was then performed to determine features robust for this kind of classification and to reduce the dimensionality of the feature vector. Subsequently, different recognition algorithms were applied to group activities as fall or non-fall and to categorise their types. Finally, a performance study was used to select the most accurate algorithm. The ensemble bagged tree and fine K-nearest neighbour methods showed the best performance. The results show that the two-stage classification was more accurate than the one-stage. It was also observed that the proposed approach, using a combination of the TR and TF domains with two-stage recognition, outperformed reference approaches in the literature, with average accuracies of 95.8% for eight-activity classification and 96.9% for distinguishing between fall and non-fall activities, with efficient computational complexity.
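The two-stage scheme described here (fall vs. non-fall first, then activity type within the predicted coarse class) can be sketched with a toy nearest-centroid classifier; the paper's actual stages use ensemble bagged trees and fine KNN on TF/TR features, so everything below is a simplified stand-in:

```python
import numpy as np

def nearest_centroid(X_train, y_train, X_test):
    """Assign each test sample the label of the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]

def two_stage_predict(X_train, y_fall, y_type, X_test):
    """Stage 1: fall vs. non-fall. Stage 2: activity type, predicted by a
    classifier trained only on training samples of the coarse class that
    stage 1 assigned to each test sample."""
    coarse = nearest_centroid(X_train, y_fall, X_test)
    fine = np.empty(len(X_test), dtype=object)
    for c in np.unique(y_fall):
        te = coarse == c
        if te.any():
            tr = y_fall == c
            fine[te] = nearest_centroid(X_train[tr], y_type[tr], X_test[te])
    return coarse, fine
```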
Hardwood flooring products are popular construction materials because of their aesthetics, durability, low maintenance requirements, and affordability. To ensure product quality during manufacturing, common defects such as cracks, chips, or stains are typically detected and classified manually, but this process can decrease productivity. The aim of this study was to develop an automatic machine-vision-based inspection system with a robust algorithm for inspecting small hardwood flooring defects on a production line. The defect-inspection algorithm is based on image-processing techniques, including background elimination, boundary approximation, and defect inspection of photographs. The YOLOv5 deep-learning object-detection algorithm was applied to detect surface defects. The resulting algorithm identified the quality of each specimen (i.e., either good or defective). The influence of colour and surface patterns on defect inspection was experimentally investigated under various light conditions. The algorithm was adaptable to specimens with different colours and patterns under various conditions, demonstrating the potential of this approach in practical situations.
Flat-field correction (FFC) is commonly used in image signal processing (ISP) to improve the uniformity of image sensor pixels. Image sensor nonuniformity and lens system characteristics are known to be temperature-dependent. Some machine vision applications, such as visual odometry and single-pixel airborne object tracking, are extremely sensitive to pixel-to-pixel sensitivity variations. Numerous cameras, especially in the fields of infrared imaging and staring cameras, use multiple calibration images to correct for nonuniformities. This paper characterizes the temperature and analog gain dependence of the dark signal nonuniformity (DSNU) and photoresponse nonuniformity (PRNU) of two contemporary global-shutter CMOS image sensors for machine vision applications. An optimized hardware architecture is proposed to compensate for nonuniformities, with optional parametric lens shading correction (LSC). Three different performance configurations are outlined for different application areas, costs, and power requirements. For most commercial applications, LSC alone suffices. For both DSNU and PRNU, compensation with one or multiple calibration images, captured at different gain and temperature settings, is considered. For more demanding applications, the effectiveness, external memory bandwidth, power consumption, implementation and calibration complexity, and camera manufacturability of different nonuniformity correction approaches were compared.
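The classic two-point flat-field correction underlying the DSNU/PRNU compensation this abstract discusses can be sketched as follows (a textbook formulation, not this paper's hardware architecture; the frame names are illustrative):

```python
import numpy as np

def flat_field_correct(raw, dark, flat):
    """Two-point FFC: subtract the dark frame (removes DSNU/offset), divide
    by the dark-subtracted flat frame (removes PRNU and lens shading), and
    rescale by the flat's mean level to preserve overall brightness."""
    dark = dark.astype(np.float64)
    gain = flat.astype(np.float64) - dark
    gain = np.maximum(gain, 1e-6)       # guard against dead/zero-gain pixels
    corrected = (raw.astype(np.float64) - dark) / gain
    return corrected * gain.mean()
```

In the paper's setting the dark and flat calibration frames would themselves be selected or interpolated over temperature and analog gain rather than fixed.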
Machine learning (ML) in general, and deep learning (DL) in particular, has become an extremely popular tool in several vision applications (such as object detection, super resolution, segmentation, and object tracking). Almost in parallel, the issue of explainability in ML (i.e. the ability to explain/elaborate how a trained ML model arrived at its decision) in vision has also received fairly significant attention from various quarters. However, we argue that the current philosophy behind explainable ML suffers from certain limitations, and the resulting explanations may not meaningfully uncover black-box ML models. To elaborate our assertion, we first raise a few fundamental questions which have not been adequately discussed in the corresponding literature. We also provide perspectives on how explainability in ML can benefit from relying on more rigorous principles in the related areas. © 2021 Elsevier B.V. All rights reserved.