The proceedings contain 37 papers. The topics discussed include: CANDU in-reactor quantitative visual-based inspection techniques; fast hand recognition method using limited area of IR projection pattern; high-resolution inline video-AOI for printed circuit assemblies; localized contourlet features in vehicle make and model recognition; discriminating poultry feeds by image analysis for the purpose of avoiding importunate poultry behaviors; 3D reconstruction of hot metallic surfaces for industrial part characterization; assessing fabric stain release using a GPU implementation of statistical snakes; fingerprint verification using direction images and local features; introduction of a wavelet transform based on 2D matched filter in a Markov random field for fine structure extraction: application on road crack detection; and anomaly based vessel detection in visible and infrared images.
Lateral flow assays (LFAs) are important diagnostic tools with numerous applications in various scientific fields, including diagnostics, medicine, analytical chemistry, biochemistry, environmental and food science. Artificial Intelligence (AI) and image processing tools are the state-of-the-art technology in analytical tools, especially in Point-of-Care (POC) devices, where they improve detection efficiency without the need for highly qualified personnel. In this context, we have developed novel multicolor LFAs exploiting machine vision and image analysis tools for the automated "reading" of the visual result of LFAs, using beads of different colors as reporters to distinguish between multiple targets. The system consists of a multicolor test integrated with a mobile/smartphone and a web application for the automatic interpretation of the results. The use of multicolor beads, relating each color to a specific target, enhanced image analysis-based discrimination of the tests between different targets. The developed diagnostic tool has been applied to cutting-edge liquid biopsy applications, which include the detection of three different microRNA molecules spiked in urine samples. The developed integrated system has been successfully applied to a series of real samples, advancing the field of LFA diagnostics. The system showed 99.3% accuracy, 99.1% sensitivity and 100% specificity.
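The color-to-target discrimination described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the reference colors, the label names (`target_A` and so on), and the distance threshold are all hypothetical values chosen for illustration. It assumes the test-line region has already been cropped and sampled as RGB pixels.

```python
import math

# Hypothetical reference colors (RGB) for three bead reporters.
# These values are assumptions for illustration, not taken from the paper.
REFERENCE_COLORS = {
    "target_A": (200, 30, 40),   # reddish beads
    "target_B": (30, 60, 200),   # bluish beads
    "target_C": (40, 170, 60),   # greenish beads
}


def mean_color(pixels):
    """Average the RGB pixels sampled from one test-line region."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def classify_line(pixels, max_distance=120.0):
    """Return the label of the nearest reference color.

    Returns None when no reference color lies within max_distance
    (Euclidean distance in RGB space), i.e. no colored line is detected.
    """
    color = mean_color(pixels)
    label, dist = min(
        ((name, math.dist(color, ref)) for name, ref in REFERENCE_COLORS.items()),
        key=lambda item: item[1],
    )
    return label if dist <= max_distance else None
```

A real reader would likely need white balancing and a perceptual color space before a nearest-color rule like this becomes reliable across phone cameras.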
In low-light environments, machine vision tasks often suffer from performance degradation because traditional image signal processing (ISP) pipelines are primarily optimized for image quality metrics such as Peak Signal-to-Noise Ratio and Structural Similarity Index, which do not adequately address the specific needs of these applications. Existing methods fall short in enhancing the critical image features required for computer vision tasks under challenging lighting conditions. To address this, we introduce PhyDiiSP, a physics-guided, differentiable ISP pipeline designed to improve machine vision performance in low-light scenarios. PhyDiiSP integrates traditional ISP design principles with physical insights, including demosaicing for RAW-to-RGB conversion, global tone mapping to adjust overall brightness, and Multiscale Retinex-based enhancement to tackle low-light challenges. Experimental results show that PhyDiiSP outperforms existing ISP methods in object detection accuracy across standard benchmarks by effectively enhancing key image features. Furthermore, when trained with L1 loss and aligned with ground truth on datasets of dark-light environments and real RAW-to-RGB conversions, it demonstrates competitive image quality. These results confirm that PhyDiiSP is a viable and effective solution for real-world low-light machine vision applications.
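As a rough illustration of the Multiscale Retinex idea mentioned above, the sketch below averages, over several scales, the log-ratio between each pixel and a blurred surround. It is not the PhyDiiSP implementation: the box blur stands in for the Gaussian surround usually used, and the radii, the `eps` offset, and the equal scale weights are assumptions.

```python
import math


def box_blur(img, radius):
    """Mean filter with edge clamping (assumed stand-in for a Gaussian surround)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def multiscale_retinex(img, radii=(1, 2, 4), eps=1.0):
    """Average of log(I) - log(surround) over several surround sizes.

    img is a 2D list of non-negative intensities (one channel).
    eps avoids log(0); equal weights per scale are an assumption.
    """
    h, w = len(img), len(img[0])
    result = [[0.0] * w for _ in range(h)]
    for r in radii:
        blurred = box_blur(img, r)
        for y in range(h):
            for x in range(w):
                diff = math.log(img[y][x] + eps) - math.log(blurred[y][x] + eps)
                result[y][x] += diff / len(radii)
    return result
```

On a uniform image the response is zero everywhere, while a pixel brighter than its surround gets a positive response, which is the local-contrast behavior that makes Retinex useful in dark scenes.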
(1) Computer vision: The field of computer vision is making significant strides in dynamic reasoning capability through test-time scaling (TTS) [1] technology. TTS optimizes the robustness and interpretability of models in complex tasks by flexibly allocating computational resources. Multimodal base models, such as CLIP (contrastive language-image pre-training) [2] and Florence, facilitate the deep fusion of vision and language through cross-modal alignment techniques. These advancements have significantly improved the accuracy of visual question answering (VQA) and cross-modal retrieval. Generative AI technologies, such as Stable Diffusion, have also broken through the limitations of 2D image generation, enabling the transition to semantics-driven 3D scene models, like neural radiance fields (NeRF) [3]. This shift supports the generation of spatial models with physically interactive attributes from a single sheet of input, providing a new paradigm for virtual reality and industrial design. In addition, the introduction of the spatial intelligence [4] concept allows computer vision systems to simulate physical interactions in 3D space, driving the development of embodied intelligence and robot *** articles included in this Special Issue cover advancements in ten research directions: computer vision, feature extraction and image selection, pattern recognition for image processing techniques, image processing in intelligent transportation, neural networks, machine learning and deep learning, biomedical image processing and recognition, image processing for intelligent surveillance, deep learning for image processing, robotics and unmanned systems, and AI-based image processing, understanding, recognition, compression, and reconstruction. I have categorized the 33 articles included in this Special Issue based on these research directions, with the classification system not only demonstrating the vertical extension of the technological depth but also embodyin
Optimal Transport (OT) theory has seen increasing attention from the computer science community due to its potency and relevance in modeling and machine learning (ML). OT provides powerful tools for comparing probability distributions and producing optimal mappings that minimize cost functions. Consequently, OT has been widely implemented in computer vision tasks such as image retrieval, image interpolation, and semantic correspondence, as well as in broader applications spanning domain adaptation, natural language processing, and variational inference. In this survey, we aim to convey the emerging prominence and widespread applications of OT methods across various ML areas and outline future research directions. We first provide a history of OT. We then introduce a mathematical formulation and the prerequisites to understand OT, including Kantorovich duality, entropic regularization, KL divergence, and Wasserstein barycenters. Given the computational complexity of OT, we discuss an entropy-regularized approach to computing optimal mappings, which facilitates practical applications of OT across diverse ML domains. Further, we review prior studies on OT applications in ML. To this end, we cover the following: computer vision, graph learning, neural architecture search, document representation, domain adaptation, model fusion, medicine, natural language processing, and reinforcement learning. Finally, we outline future research directions and key challenges that could drive the broader integration of OT in ML.
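The entropy-regularized formulation mentioned above is commonly solved with Sinkhorn iterations, which alternately rescale the rows and columns of a Gibbs kernel until the transport plan matches both marginals. The following is a minimal sketch of that standard scheme, not code from the survey; the function name, the regularization strength `reg`, and the fixed iteration count are assumptions.

```python
import math


def sinkhorn(a, b, cost, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between histograms a and b.

    a, b  : lists of non-negative weights, each summing to 1
    cost  : len(a) x len(b) cost matrix (list of lists)
    reg   : entropic regularization strength (assumed default)
    Returns the transport plan as a list of lists.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / reg)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

Smaller values of `reg` bring the plan closer to the unregularized optimum but can underflow the kernel; log-domain variants are typically used in that regime.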
This study presents a vision-based closed-loop tracking system designed specifically for robotic laser beam welding of curved and closed square butt joints. The proposed system is compared against 11 existing solutions reported in the literature, which employ various sensor principles for the same application. The system employs a non-contact, non-intrusive machine vision approach, seamlessly integrated into the laser beam welding head to mitigate challenges associated with sensor forerun. Key features include an off-axis LED illumination, an optical filter, and a movable actuator, facilitating real-time image processing and closed-loop control during the welding process. Experimental validation was conducted on stainless-steel plates with complex closed square butt joints. The system achieved a mean absolute joint-to-beam offset of 0.14 mm across four test cases, with a maximum offset of 0.85 mm, demonstrating its robustness and precision. Comparative analysis underscores the proposed method's advantages, showcasing its potential for industrial applications in laser beam welding of geometrically challenging joints.
Soil erosion, primarily driven by water and wind, poses a significant environmental challenge globally, leading to land degradation and geo-hazards. Despite various empirical methods, image analysis, and machine learning techniques employed to address this issue, effective mitigation tools remain lacking. This study presents an innovative framework integrating image processing (IP) and machine learning (ML) to enhance the understanding, quantification, and prediction of soil erosion processes. Laboratory flume experiments were conducted to capture erosion images, which were pre-processed using techniques such as Contrast Limited Adaptive Histogram Equalization (CLAHE) to improve image quality. Supervised ML models, including Logistic Regression (LR), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF), were applied to classify eroded and non-eroded soil areas. The models' performance was rigorously evaluated using metrics such as precision, recall, and F1-score. The results demonstrated that KNN and RF outperformed other models in predicting soil erosion, with KNN exhibiting the least variation (2.39%) compared to the reference erosion profile. This study underscores the potential of an IP and ML ensemble framework for precise soil erosion quantification and prediction, offering practical applications for erosion mitigation. The open-source code and dataset are available at https://***/mlgeotech/***.
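For reference, the evaluation metrics named above can be computed from per-pixel (or per-patch) labels as in the sketch below. It assumes a binary labeling in which 1 marks eroded areas and 0 marks non-eroded areas; the label encoding and function name are illustrative and not taken from the study's code.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class.

    y_true, y_pred : equal-length sequences of labels
    positive       : label treated as the positive class (assumed: 1 = eroded)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Precision penalizes non-eroded areas flagged as eroded, recall penalizes missed erosion, and F1 is their harmonic mean.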
The specular reflection of objects is an important factor affecting image display quality, which poses challenges to tasks such as pattern recognition and machine vision detection. At present, specular removal for a single real image is a crucial pre-processing step to improve the performance of computer vision algorithms. Despite notable approaches tailored for handling synthesized and pre-simplified images with dark backgrounds, real-time separation of specular reflection for a single real image remains a challenging problem. This paper proposes a novel specular removal method to separate the specular reflection for a single real image accurately and efficiently based on the dark channel prior. Initially, a modified-specular-free (MSF) image is developed using the dark channel prior, which can derive a direct estimation of specular reflection. Next, the image chromaticity spaces are established to represent the pixel intensity. Then, the maximum chromaticity value of the modified MSF image is extracted to guide the filtering of the specular reflection, treating the specular pixels as noise in the chromaticity space. Finally, the image without specular reflection can be obtained using the restored maximum chromaticity value based on the dichromatic reflection model. The advantage of this method is that it achieves high-quality specular reflection separation quickly without destroying the geometric features of the real image. Compared with the state-of-the-art methods, experimental results show that the proposed algorithm can achieve the best subjective visual effect and satisfactory quantitative performance. In addition, this approach can be implemented efficiently to meet real-time requirements, making it promising for computer vision measurement and inspection applications.
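The first step above can be illustrated with a simplified sketch. It is not the paper's MSF construction: it uses a per-pixel dark channel (the minimum over R, G and B, without the local patch minimum of the original dark channel prior) as a crude estimate of the specular component and subtracts it from every channel, which yields the classic specular-free image. The function names are illustrative.

```python
def dark_channel(img):
    """Per-pixel minimum over the R, G, B channels.

    Simplification (assumed): a 1x1 patch, so no spatial minimum is taken.
    img is a 2D list of (r, g, b) tuples.
    """
    return [[min(px) for px in row] for row in img]


def specular_free(img):
    """Subtract the dark channel from every channel of every pixel.

    Under the dichromatic reflection model with white illumination, the
    specular term adds equally to all channels, so removing the per-pixel
    minimum suppresses it (at the cost of altering the diffuse color).
    """
    dark = dark_channel(img)
    return [
        [tuple(c - d for c in px) for px, d in zip(row, drow)]
        for row, drow in zip(img, dark)
    ]
```

Because the subtraction also removes part of the diffuse component, methods like the one described above go on to restore chromaticity rather than using this image directly as the final result.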
Optimizers play important roles in enhancing the performance of a deep network. A study on different optimizers is necessary to understand the effect of optimizers on the performance of the deep network for a given target task, such as image classification. Several attempts were made to investigate the effect of optimizers on the performance of CNNs. However, such experiments have not been carried out on vision transformers (ViT), despite the recent success of ViT in various image processing tasks. In this paper, we conduct exhaustive experiments with ViT using different optimizers. In our experiments, we found that weight decoupling and weight decay in optimizers play important roles in training ViT. We focused on the concept of weight decoupling and tried different variations of it to investigate to what extent weight decoupling is beneficial for a ViT. We propose two techniques that provide better results than weight-decoupled optimizers: (i) The weight decoupling step in optimizers involves a linear update of the parameter with weight decay as the scaling factor. We propose a quadratic update of the parameter which involves using a linear as well as squared parameter update using the weight decay as the scaling factor. (ii) We propose using different weight decay values for different parameters depending on the gradient value of the loss function with respect to that parameter. A smaller weight decay is used for parameters with a higher gradient value and vice versa. Image classification experiments are conducted over CIFAR-100 and TinyImageNet datasets to observe the performance of these proposed methods with respect to state-of-the-art optimizers such as Adam, RAdam, and AdaBelief. The code is available at https://***/Hemanth-Boyapati/Adaptive-weight-decay-optimizers.
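To make the distinction between the variants concrete, the sketch below applies one plain SGD step with decoupled weight decay to a list of scalar parameters. Only the `linear` mode is the standard decoupled decay. The `quadratic` and `adaptive` modes are guesses at the general shape of the two proposals; the abstract does not give the exact formulas, so both are labeled as assumptions and should not be read as the authors' method.

```python
def decoupled_step(theta, grad, lr=0.1, wd=0.01, mode="linear"):
    """One SGD step with decoupled weight decay.

    theta, grad : lists of scalar parameters and their gradients
    mode        : "linear"    standard decoupled decay, wd * p
                  "quadratic" ASSUMED form: wd * p + wd * p * |p|
                  "adaptive"  ASSUMED form: decay scaled by 1 / (1 + |g|)
    """
    new_theta = []
    for p, g in zip(theta, grad):
        if mode == "linear":
            decay = wd * p
        elif mode == "quadratic":
            # Assumption: linear term plus a sign-preserving squared term.
            decay = wd * p + wd * p * abs(p)
        elif mode == "adaptive":
            # Assumption: larger gradient magnitude gives smaller decay.
            decay = (wd / (1.0 + abs(g))) * p
        else:
            raise ValueError(f"unknown mode: {mode}")
        # The decay is applied separately from the gradient step (decoupled).
        new_theta.append(p - lr * g - lr * decay)
    return new_theta
```

In an Adam-style optimizer, the same decay term would be added to the parameter update after the adaptive gradient step rather than folded into the gradient.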
The AdaMax algorithm provides enhanced convergence properties for stochastic optimization problems. In this paper, we present a regret bound for the AdaMax algorithm, offering a tighter and more refined analysis compared to existing bounds. This theoretical advancement provides deeper insights into the optimization landscape of machine learning algorithms. Specifically, the You Only Look Once (YOLO) framework has become well-known as an extremely effective object segmentation tool, mostly because of its extraordinary accuracy in real-time processing, which makes it a preferred option for many computer vision applications. Finally, we used this algorithm for image segmentation.
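For readers unfamiliar with AdaMax, the sketch below follows the update rule from the original Adam paper: an exponential moving average of the gradient is divided by an exponentially weighted infinity norm of past gradients. It is a generic illustration rather than the code analyzed in this paper; the `grad_fn` interface and the small `eps` added to the denominator are assumptions.

```python
def adamax_minimize(grad_fn, theta, lr=0.002, beta1=0.9, beta2=0.999,
                    steps=1000, eps=1e-8):
    """Minimize a function with AdaMax given its gradient.

    grad_fn : callable mapping a parameter list to a gradient list (assumed API)
    theta   : initial parameters (list of floats)
    eps     : small constant guarding against division by zero (an addition)
    """
    m = [0.0] * len(theta)  # first-moment estimate
    u = [0.0] * len(theta)  # exponentially weighted infinity norm
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
        u = [max(beta2 * ui, abs(gi)) for ui, gi in zip(u, g)]
        step = lr / (1 - beta1 ** t)  # bias correction for the first moment
        theta = [p - step * mi / (ui + eps) for p, mi, ui in zip(theta, m, u)]
    return theta
```

For example, `adamax_minimize(lambda th: [2 * (th[0] - 3)], [0.0], lr=0.05, steps=2000)` drives the parameter toward the minimizer of (x - 3)^2.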