Quality inspection in the pharmaceutical and food industries is crucial to ensure that products are safe for customers. Among the properties controlled in the production process are chemical composition, the content of active substances, and visual appearance. Although the latter may not influence the product's properties, a poor appearance lowers customers' confidence in drugs or food and affects brand perception. The visual appearance of consumer goods is typically inspected during the packaging process using machine vision quality inspection systems. In line with current trends, image processing is often supported by deep neural networks, which increases the accuracy of fault detection and classification. AI-based solutions are best suited to production lines with a limited number of formats or highly repeatable production. Where formats differ significantly from each other and change often, a quality inspection system has to enable fast training. In this paper, we present a fast method for image anomaly detection that is used in high-speed production lines. The proposed method meets these requirements: it is easy and fast to train, even on devices with limited computing power, and the inference time for each production sample is short enough for real-time scenarios. Additionally, the ultra-lightweight algorithm can be easily adapted to different products and different market segments. In this work, we present the results of our algorithm on three real production datasets gathered from the food and pharmaceutical industries.
ISBN: 9784901122207 (print)
Learning-based multi-view stereo regularizes cost volumes containing spatial information to reduce noise and improve the quality of a depth map. Cost volume regularization using 3D CNNs consumes a large amount of memory, making it difficult to scale up the network architecture. Recent work proposed a cost-volume regularization method that applies 2D convolutional GRUs and significantly reduces memory consumption. However, this uni-directional recurrent processing has a narrower receptive field than 3D CNNs because the regularized cost at a time step does not contain information about future time steps. In this paper, we propose a cost volume regularization method using bi-directional GRUs that expands the receptive field in the depth direction. In our experiments, our proposed method significantly outperforms the conventional methods in several benchmarks while maintaining low memory consumption.
A generic fundus foreground extractor is required for the standardization of fundus datasets in machine-learning applications because retinal fundus images vary widely. Some fundus images have a large amount of non-essential background data and others have missing data because of clipping. To standardize these varied images for machine-learning applications while preserving the aspect resolution, a generalized threshold algorithm is needed to separate the foreground and background. Existing threshold algorithms fail to segment images with low contrast, so a generalized algorithm is needed that handles varied image conditions dynamically. The proposed segmentation algorithm uses shifts in histogram frequency using intensity extrema to find the ideal threshold value. The proposed post-processing algorithm crops, pads, and resizes the image to a standardized size of 512x512 pixels using the segmentation map output. To demonstrate the effectiveness of the proposed standardization approach on downstream tasks, an ablation study of popular standardization strategies is conducted on a newly proposed benchmark dataset, EyePACS-light. The experimental results demonstrate the benefits of using this standardization approach for resizing fundus images.
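The crop, pad, and resize post-processing described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `standardize`, the zero-valued centred padding, and the nearest-neighbour resize are all assumptions made for illustration, and the segmentation mask is taken as given rather than computed with the paper's threshold algorithm.

```python
def standardize(image, mask, size=512):
    """Crop to the mask's bounding box, pad to a square, resize to size x size.

    `image` and `mask` are equally sized lists of rows (grayscale values and
    0/1 foreground flags). Hypothetical helper, not the paper's code.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows or not cols:
        raise ValueError("mask contains no foreground pixels")
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]

    # Crop the image to the foreground bounding box.
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]
    h, w = len(crop), len(crop[0])

    # Pad with zeros (assumed background value) to a centred square,
    # so the later resize does not distort the foreground.
    side = max(h, w)
    pad_top, pad_left = (side - h) // 2, (side - w) // 2
    square = [[0] * side for _ in range(side)]
    for r in range(h):
        square[pad_top + r][pad_left:pad_left + w] = crop[r]

    # Nearest-neighbour resize to the target size.
    return [[square[(y * side) // size][(x * side) // size]
             for x in range(size)]
            for y in range(size)]
```

Padding before resizing is the design choice that keeps the foreground's proportions intact; a production pipeline would more likely use an interpolating resize from an imaging library.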
Obtaining a complete description of a scene with all the relevant objects in focus is an active research area in surveillance, medicine, and machine vision applications. In this work, a transform-based fusion method called NSCT-FMO is introduced to integrate image pairs with different focus features. The NSCT-FMO approach contains four main steps. First, the NSCT is applied to the input images to acquire the approximation and detailed structural information. Then, the approximation sub-band coefficients are merged using the novel Focus Measure Optimization (FMO) approach. Next, the detailed sub-images are combined using Phase Congruency (PC). Finally, an inverse NSCT operation is performed on the synthesized sub-images to obtain the initial synthesized image. To optimize the initial fused image, an initial decision map is first constructed and a morphological post-processing technique is applied to obtain the final map. Using the resulting map, the final synthesized output is produced by selecting the focused pixels from the input images. Simulation analysis shows that the NSCT-FMO approach achieves fair results compared to traditional MST-based methods in both qualitative and quantitative assessments.
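The last stage described in this abstract (clean a binary decision map morphologically, then pick focused pixels from the inputs) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which morphological operations are used, so a 3x3 opening followed by a closing is assumed here, the initial decision map is taken as given, and the names `clean_map` and `fuse` are hypothetical.

```python
def _morph(mask, op):
    """3x3 binary erosion (op=all) or dilation (op=any).

    Neighbours outside the image are ignored rather than treated as 0 or 1.
    """
    h, w = len(mask), len(mask[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            neigh = [mask[rr][cc]
                     for rr in range(max(0, r - 1), min(h, r + 2))
                     for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(1 if op(neigh) else 0)
        out.append(row)
    return out


def clean_map(mask):
    """Assumed post-processing: opening, then closing, with a 3x3 element."""
    opened = _morph(_morph(mask, all), any)   # erosion then dilation: drops small specks
    return _morph(_morph(opened, any), all)   # dilation then erosion: fills small holes


def fuse(img_a, img_b, decision):
    """Take the pixel from img_a where the map is 1, otherwise from img_b."""
    return [[a if d else b for a, b, d in zip(ra, rb, rd)]
            for ra, rb, rd in zip(img_a, img_b, decision)]
```

For example, a map whose left three columns mark image A as focused, with one stray hole and one stray speck, is restored to a clean left/right split before the pixels are selected.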
ISBN: 9781665456708 (print)
Differential rendering has recently emerged as a powerful tool for very high-quality image-based rendering and geometric reconstruction from multiple views. Up to now, such methods have been benchmarked on generic object databases and promisingly applied to some real data, but have yet to be applied to specific applications that may benefit. In this paper, we investigate how a differential rendering system can be crafted for raw multi-camera performance capture. We address several key issues that stand in the way of practical usability and reproducibility, such as processing speed, explainability of the model, and general output model quality. This leads us to several contributions to the differential rendering framework. In particular, we show that a unified view of differential rendering and classic optimization is possible, leading to a formulation where complete non-stochastic gradient steps can be analytically computed and the full per-frame data stored in video memory, yielding a straightforward and efficient implementation. We also use sparse storage and a coarse-to-fine scheme to achieve extremely high resolution with contained memory and computation time. We show that results rivaling or exceeding the quality of state-of-the-art multi-view human surface capture methods are achievable in a fraction of the time, typically around a minute per frame.
In robotic applications, good perception can be computationally costly and create undesirable latency before a control decision is initiated. Most of the available deep learning methods for object detection are either fast with low accuracy or slow with high accuracy. Fast and accurate methods are necessary to track and localize objects such as cotton bolls, which may be visible, occluded by each other, or poorly illuminated. In this study, an ensemble of a deep learning method and other image processing techniques was used to detect cotton bolls in-field on defoliated plants. In each image, a trained deep learning method, the YOLOv2 model, was used to detect open cotton bolls, and color segmentation was applied to confirm that the bolls detected by the YOLOv2 model were actually white, to avoid false positives. Boll tracking was performed by following the spatial movement of good features on the edges of the bolls using the Lucas-Kanade algorithm. If a previously detected boll was lost, an image transformation algorithm was applied to the next image to retrieve the information of the missing boll. Each tracked and localized boll was stored and counted to give the total number of bolls detected. In this study, detection accuracy was sacrificed for image processing speed by using the YOLOv2 model. Detection accuracy was then improved by using an ensemble method that combined image color segmentation, optical flow, and image transformation. This method was compared to eight other open-source methods implemented in OpenCV. The ensemble method detected and counted bolls at a speed of 7.6 fps with an accuracy of 94.4% using the Jetson TX2 embedded system to process 1K resolution images, outperforming the other OpenCV methods in various measurements.
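The color check that confirms whether a detected boll is actually white can be sketched as a simple filter over detector boxes. This is a hedged illustration, not the study's code: the abstract gives no thresholds or color space, so the RGB brightness and spread limits (`min_value`, `max_spread`), the acceptance ratio (`min_fraction`), and both function names are assumptions, and the boxes stand in for detector output.

```python
def white_fraction(image, box, min_value=200, max_spread=40):
    """Fraction of pixels inside `box` that look white.

    `image` is a list of rows of (r, g, b) tuples; `box` is (x0, y0, x1, y1)
    with exclusive upper bounds. A pixel counts as white when all channels
    are bright and close to each other (assumed criterion).
    """
    x0, y0, x1, y1 = box
    total = white = 0
    for row in image[y0:y1]:
        for r, g, b in row[x0:x1]:
            total += 1
            lo, hi = min(r, g, b), max(r, g, b)
            if lo >= min_value and hi - lo <= max_spread:
                white += 1
    return white / total if total else 0.0


def confirm_detections(image, boxes, min_fraction=0.5):
    """Keep only the boxes whose content is mostly white."""
    return [box for box in boxes
            if white_fraction(image, box) >= min_fraction]
```

A box over foliage or soil would be dropped by this filter even if the detector scored it highly, which is the false-positive suppression the abstract describes.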
ISBN: 9798331314385 (print)
A major obstacle to the advancement of machine learning models in marine science, particularly in sonar imagery analysis, is the scarcity of AI-ready datasets. While there have been efforts to make AI-ready sonar image datasets publicly available, they suffer from limitations in terms of environment setting and scale. To bridge this gap, we introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across 5 geological layers, curated in collaboration with marine scientists. We further extend the dataset to SeafloorGenAI by incorporating a language component in order to facilitate the development of both vision- and language-capable machine learning models for sonar imagery. The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, 696K detailed language descriptions, and approximately 7M question-answer pairs. By making our data processing source code publicly available, we aim to engage the marine science community to enrich the data pool and inspire the machine learning community to develop more robust models. This collaborative approach will enhance the capabilities and applications of our datasets within both fields. Our code repository is available at https://***/deep-real/SeafloorAI under the CC-BY-4.0 license.
The field of machine vision is continuously evolving. New products coming onto the market have very severe size, weight, and power constraints while handling very high computational loads. Existing architectures and digital image processing solutions will not be able to meet these ever-increasing demands, so novel architectures and image processing solutions are needed to address these requirements. The major contribution of this work is to show that analog signal processing is a solution to this problem. The analog processor is used as an augmentation device that works in parallel with the digital processor, making the system faster and more efficient. We have developed a prototype of an analog processing board using commercially available off-the-shelf components and demonstrated that prototype development has several advantages over a direct integrated circuit design. We focus on providing experimental results that demonstrate the functionality of the analog processing board and show that the performance of the prototype board for low-level and mid-level image processing tasks is equivalent to a digital implementation. To demonstrate improvement in speed and power consumption over other systems, we propose an integrated circuit design of the analog processor and show that such an analog processor would be 100x faster than existing FPGAs and 5x faster than state-of-the-art GPUs. We also compare the performance of the proposed integrated circuit design against other analog processors reported in the literature. We report a case study in which we use the processor for an object detection and recognition application and show that the processor has excellent performance.
Retrieving relevant photos from a database by analysing their differences and similarities is an essential part of machine vision. Numerous applications exist in areas such as storage, object recognition, localisation...
Energy efficiency, particularly in Heating, Ventilation, and Air Conditioning (HVAC) systems, is a critical challenge in modern building management due to the increasing energy demands and environmental impacts. This ...