Search Results - Inner Mongolia University Library

Ultra-Lightweight Fast Anomaly Detectors for Industrial Applications

SENSORS, 2024, Vol. 24, No. 1, p. 161

Authors: Kocon, Michal; Malesa, Marcin; Rapcewicz, Jerzy. Affiliations: KSM Vis Sp Zoo, PL-01142 Warsaw, Poland; Warsaw Univ Technol, Inst Automat Control & Robot, PL-02525 Warsaw, Poland

Quality inspection in the pharmaceutical and food industry is crucial to ensure that products are safe for customers. Among the properties that are controlled in the production process are chemical composition, the content of the active substances, and visual appearance. Although the latter may not influence the product's properties, it lowers customers' confidence in drugs or food and affects brand perception. The visual appearance of consumer goods is typically inspected during the packaging process using machine vision quality inspection systems. In line with current trends, the processing of the images is often supported with deep neural networks, which increases the accuracy of detection and classification of faults. Solutions based on AI are best suited to production lines with a limited number of formats or highly repeatable production. Where formats differ significantly from each other and are changed often, a quality inspection system has to enable fast training. In this paper, we present a fast method for image anomaly detection that is used in high-speed production lines. The proposed method meets these requirements: it is easy and fast to train, even on devices with limited computing power. The inference time for each production sample is sufficient for real-time scenarios. Additionally, the ultra-lightweight algorithm can be easily adapted to different products and different market segments. In this work, we present the results of our algorithm on three real production datasets gathered from the food and pharmaceutical industries.

Keywords: anomaly detection quality control X-ray image processing

Bi-directional Recurrent MVSNet for High-resolution Multi-view Stereo

17th International Conference on Machine Vision Applications (MVA)

Authors: Fujitomi, Taku; Ito, Seiya; Kaneko, Naoshi; Sumi, Kazuhiko. Affiliations: Aoyama Gakuin Univ, Chuo Ku, 5-10-1 Fuchino, Sagamihara, Kanagawa, Japan

ISBN: (print) 9784901122207

Learning-based multi-view stereo regularizes cost volumes containing spatial information to reduce noise and improve the quality of a depth map. Cost volume regularization using 3D CNNs consumes a large amount of memory, making it difficult to scale up the network architecture. Recent work proposed a cost-volume regularization method that applies 2D convolutional GRUs and significantly reduces memory consumption. However, this uni-directional recurrent processing has a narrower receptive field than 3D CNNs because the regularized cost at a time step does not contain information about future time steps. In this paper, we propose a cost volume regularization method using bi-directional GRUs that expands the receptive field in the depth direction. In our experiments, our proposed method significantly outperforms the conventional methods in several benchmarks while maintaining low memory consumption.

Keywords: Three-dimensional displays machine vision Aggregates Memory management Bidirectional control Benchmark testing Network architecture

Automated Fundus Image Standardization Using a Dynamic Global Foreground Threshold Algorithm

International Conference on Image, Vision and Computing (ICIVC)

Authors: Riley Kiefer; Muhammad Abid; Mahsa Raeisi Ardali; Jessica Steen; Ehsan Amjadian. Affiliations: Department of Computer Science, Florida Polytechnic University, Lakeland, FL, US; College of Optometry, Nova Southeastern University, Fort Lauderdale, FL, USA; Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada

A generic fundus foreground extractor is required for the standardization of fundus datasets in machine-learning applications due to the vast range of retinal fundus images. Some fundus images have a large amount of non-essential background data and others have missing data because of clipping. To standardize these varied images for machine learning applications while preserving the aspect resolution, a generalized threshold algorithm is needed to separate the foreground and background. Existing threshold algorithms fail to segment images with low contrast. There is a need for a generalized algorithm to handle varied image conditions in a dynamic manner. The proposed segmentation algorithm uses shifts in histogram frequency using intensity extrema to find the ideal threshold value. The proposed post-processing algorithm crops, pads, and resizes the image to a standardized size of 512x512 pixels using the segmentation map output. To demonstrate the effectiveness of this proposed standardization approach on downstream tasks, an ablation experiment of popular standardization strategies is evaluated on a newly proposed benchmark dataset, EyePACS-light. The experimental results demonstrate the benefits of using this standardization approach for resizing fundus images.
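
The crop, pad, and resize stage described in this abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function name `standardize`, the fixed `threshold` argument (the paper derives its threshold dynamically from the histogram), the zero padding, and the nearest-neighbour resize are all assumptions made for the example.

```python
def standardize(image, threshold, size):
    """Crop to the foreground bounding box, pad to a square, then resize.

    image: 2D list of grayscale intensities (rows of ints).
    threshold: fixed cut-off (an assumption; the paper computes it dynamically).
    size: side length of the square output, e.g. 512.
    """
    # Rows and columns that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(image) if any(v > threshold for v in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] > threshold for row in image)]
    if not rows or not cols:
        raise ValueError("no foreground pixels above threshold")

    # Crop to the tight bounding box of the foreground.
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    crop = [row[left:right + 1] for row in image[top:bottom + 1]]

    # Pad the shorter side with zeros so the crop becomes square;
    # this keeps the aspect ratio of the retinal content unchanged.
    h, w = len(crop), len(crop[0])
    side = max(h, w)
    pad_top, pad_left = (side - h) // 2, (side - w) // 2
    square = [[0] * side for _ in range(side)]
    for r in range(h):
        for c in range(w):
            square[pad_top + r][pad_left + c] = crop[r][c]

    # Nearest-neighbour resize to size x size.
    return [[square[r * side // size][c * side // size] for c in range(size)]
            for r in range(size)]
```

Calling `standardize(image, threshold, 512)` would give the 512x512 output size mentioned in the abstract. Padding before resizing is what prevents a wide or clipped fundus image from being stretched.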

NSCT and focus measure optimization based multi-focus image fusion

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, Vol. 41, No. 1, pp. 903-915

Authors: Aishwarya, N.; BennilaThangammal, C.; Praveena, N. G. Affiliations: Amrita Vishwa Vidyapeetam, Dept ECE, Amrita Sch Engn, Chennai, Tamil Nadu, India; Anna Univ, RMD Engn Coll, Dept ECE, Chennai, Tamil Nadu, India; Anna Univ, RMK Coll Engn & Technol, Dept ECE, Chennai, Tamil Nadu, India

Getting a complete description of a scene with all the relevant objects in focus is a hot research area in surveillance, medicine and machine vision applications. In this work, a transform-based fusion method called NSCT-FMO is introduced to integrate image pairs having different focus features. The NSCT-FMO approach contains four steps. Initially, the NSCT is applied to the input images to acquire the approximation and detailed structural information. Then, the approximation sub-band coefficients are merged by employing the novel Focus Measure Optimization (FMO) approach. Next, the detailed sub-images are combined using Phase Congruency (PC). Finally, an inverse NSCT operation is conducted on the synthesized sub-images to obtain the initial synthesized image. To optimize the initial fused image, an initial decision map is first constructed and a morphological post-processing technique is applied to get the final map. With the help of the resultant map, the final synthesized output is produced by selecting focused pixels from the input images. Simulation analysis shows that the NSCT-FMO approach achieves fair results compared to traditional MST-based methods in both qualitative and quantitative assessments.

Keywords: image fusion multi-focus NSCT focus measure decision map

Fast Gradient Descent for Surface Capture Via Differentiable Rendering

International Conference on 3D Vision (3DV)

Authors: Toussaint, Briac; Genisson, Maxime; Franco, Jean-Sebastien. Affiliations: Univ Grenoble Alpes, INRIA, CNRS, Grenoble INP, Inst Engn Univ Grenoble Alpes, LJK, Grenoble, France

ISBN: (print) 9781665456708

Differential rendering has recently emerged as a powerful tool for image-based rendering or geometric reconstruction from multiple views, with very high quality. Up to now, such methods have been benchmarked on generic object databases and promisingly applied to some real data, but have yet to be applied to specific applications that may benefit. In this paper, we investigate how a differential rendering system can be crafted for raw multi-camera performance capture. We address several key issues in the way of practical usability and reproducibility, such as processing speed, explainability of the model, and general output model quality. This leads us to several contributions to the differential rendering framework. In particular, we show that a unified view of differential rendering and classic optimization is possible, leading to a formulation and implementation where complete non-stochastic gradient steps can be analytically computed and the full per-frame data stored in video memory, yielding a straightforward and efficient implementation. We also use a sparse storage and coarse-to-fine scheme to achieve extremely high resolution with contained memory and computation time. We show that results rivaling or exceeding the quality of state-of-the-art multi-view human surface capture methods are achievable in a fraction of the time, typically around a minute per frame.

Keywords: 3D vision Computer vision Differentiable rendering Surface Capture

ENSEMBLE METHOD OF DEEP LEARNING, COLOR SEGMENTATION, AND IMAGE TRANSFORMATION TO TRACK, LOCALIZE, AND COUNT COTTON BOLLS USING A MOVING CAMERA IN REAL-TIME

TRANSACTIONS OF THE ASABE, 2021, Vol. 64, No. 1, pp. 341-352

Authors: Fue, K. G.; Porter, W. M.; Barnes, E. M.; Rains, G. C. Affiliations: Univ Georgia, Coll Engn, Tifton, GA, USA; Univ Georgia, Dept Crop & Soil Sci, Tifton, GA, USA; Cotton Inc, Agr Engn, Cary, NC, USA; Univ Georgia, Dept Entomol, Tifton, GA, USA

In robotic applications, good perception can be computationally costly and create undesirable latency before a control decision is initiated. Most of the deep learning methods available for object detection are either fast with low accuracy or slow with high accuracy. Fast and accurate methods are necessary to track and localize objects such as cotton bolls that may be visible or occluded by each other or not well illuminated. In this study, an ensemble of a deep learning method and other image processing techniques was used to detect cotton bolls in-field on defoliated plants. In each image, a trained deep learning method, the YOLOv2 model, was used to detect open cotton bolls, and color segmentation was applied to confirm whether the bolls detected by the YOLOv2 model were actually white to avoid false positives. Boll tracking was performed by following the spatial movement of good features on the edges of the bolls using the Lucas-Kanade algorithm. An image transformation algorithm was applied to the next image in case the previously detected boll was lost, to retrieve the information of the missing boll. Each tracked and localized boll was stored and counted to give the total number of bolls detected. In this study, detection accuracy was sacrificed for image processing speed by using the YOLOv2 model. Detection accuracy was improved by using an ensemble method that combined image color segmentation, optical flow, and image transformation. This method was compared to eight other open-source methods implemented in OpenCV. The ensemble method detected and counted bolls at a speed of 7.6 fps with an accuracy of 94.4% using the Jetson TX2 embedded system to process 1K resolution images, outperforming the other OpenCV methods in various measurements.
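
The color-confirmation step in this abstract (checking that a detected boll is actually white before counting it) might look something like the sketch below. It is an illustration only: the helper names `is_white_boll` and `confirm_detections` are hypothetical, and the three thresholds are assumed values, not figures from the study.

```python
def is_white_boll(image, box, min_value=200, max_spread=30, min_fraction=0.5):
    """Return True if enough pixels inside the box look white.

    image: 2D list of (r, g, b) tuples with values 0-255.
    box: (x0, y0, x1, y1), with x1 and y1 exclusive.
    The three thresholds are illustrative assumptions.
    """
    x0, y0, x1, y1 = box
    total = white = 0
    for row in image[y0:y1]:
        for r, g, b in row[x0:x1]:
            total += 1
            # White means bright in every channel and nearly colourless.
            if min(r, g, b) >= min_value and max(r, g, b) - min(r, g, b) <= max_spread:
                white += 1
    # An empty box can never be confirmed.
    return total > 0 and white / total >= min_fraction


def confirm_detections(image, boxes):
    """Keep only the detector boxes that pass the whiteness check."""
    return [box for box in boxes if is_white_boll(image, box)]
```

In a pipeline like the one described, `boxes` would come from the object detector, and only the confirmed boxes would be handed on to tracking and counting.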

Keywords: Boll counting Cotton Cotton harvesting DarkFlow Darknet Deep learning machine vision YOLOv2

SeafloorAI: a large-scale vision-language dataset for seafloor geological survey

Proceedings of the 38th International Conference on Neural Information Processing Systems

Authors: Kien X. Nguyen; Fengchun Qiao; Arthur Trembanis; Xi Peng. Affiliations: Deep-REAL Lab, Department of Computer and Information Sciences, University of Delaware; School of Marine Science and Policy, University of Delaware

ISBN: (print) 9798331314385

A major obstacle to the advancement of machine learning models in marine science, particularly in sonar imagery analysis, is the scarcity of AI-ready datasets. While there have been efforts to make AI-ready sonar image datasets publicly available, they suffer from limitations in terms of environment setting and scale. To bridge this gap, we introduce SeafloorAI, the first extensive AI-ready dataset for seafloor mapping across 5 geological layers that is curated in collaboration with marine scientists. We further extend the dataset to SeafloorGenAI by incorporating the language component in order to facilitate the development of both vision- and language-capable machine learning models for sonar imagery. The dataset consists of 62 geo-distributed data surveys spanning 17,300 square kilometers, with 696K sonar images, 827K annotated segmentation masks, 696K detailed language descriptions and approximately 7M question-answer pairs. By making our data processing source code publicly available, we aim to engage the marine science community to enrich the data pool and inspire the machine learning community to develop more robust models. This collaborative approach will enhance the capabilities and applications of our datasets within both fields. Our code repository is available at https://***/deep-real/SeafloorAI under the CC-BY-4.0 license.

Analog signal processing solution for machine vision applications

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2019, Vol. 16, No. 5, pp. 1607-1628

Authors: Athreyas, Nihar; Gupta, Dev; Gupta, Jai. Affiliations: Univ Massachusetts, Dept Elect & Comp Engn, Amherst, MA 01003, USA; Spero Devices Inc, 43 Nagog Pk, Acton, MA, USA

The field of machine vision is continuously evolving. There are new products coming into the market that have very severe size, weight and power constraints and handle very high computational loads simultaneously. Existing architectures and digital image processing solutions will not be able to meet these ever-increasing demands. There is a need to develop novel architectures and image processing solutions to address these requirements. The major contribution of this work is to show that analog signal processing is a solution to this problem. The analog processor will be used as an augmentation device which works in parallel with the digital processor, making the system faster and more efficient. We have developed a prototype of an analog processing board using commercially available off-the-shelf components and demonstrated that a prototype development has several advantages over a direct integrated circuit design. We focus on providing experimental results that demonstrate functionality of the analog processing board and show that the performance of the prototype board for low-level and mid-level image processing tasks is equivalent to a digital implementation. To demonstrate improvement in speed and power consumption over other systems, we propose an integrated circuit design of the analog processor and show that such an analog processor would be 100x faster than existing FPGAs and 5x faster than state-of-the-art GPUs. We also compare the performance of the proposed integrated circuit design against other analog processors reported in the literature. We report a case study in which we use the processor for an object detection and recognition application and show that the processor has excellent performance.

Keywords: Analog signal processing Analog processor Parallel architecture Prototype design Off-the-shelf components

Reliable and Efficient Image Processing and Deep Machine Learning for large-scale Digital Image Retrieval

1st IEEE International Interdisciplinary Humanitarian Conference for Sustainability, IIHC 2022

Authors: Selvi, G Thamarai; Hamdy, Amal; Vijaya, K Sri; Kumar, Nellore Manoj; Pallavi, L.; Rao, Kanusu Srinivasa. Affiliations: Sri Sai Ram Institute of Technology, Department of ECE, Tamilnadu, Chennai, India; College of Women, Physics Department, Ain Shams University, Egypt; P v P Siddhartha Institute of Technology, Department of IT, Andhra Pradesh, Vijayawada, India; Chennai, Thandalam, India; B v Raju Institute of Technology, Department of CSE, Telangana, Narsapur, India; Yogi Vemana University, Department of Computer Science and Technology, Andhra Pradesh, India

ISBN: (print) 9781665456876

Retrieving relevant photos from a database by analysing their differences and similarities is an essential part of machine vision. Numerous applications exist in areas such as storage, object recognition, localisation, and recognition with limited data. In this study, we look at the issue of image retrieval, in which a group of pictures is retrieved from a huge database in order to find the ones most like the one the user has submitted (the 'query image'). This thesis investigates the origins of the issue, as well as its present-day manifestations and potential solutions. To the best of our knowledge, however, there is currently no accessible generic text removal method that can be applied to any font, script, language, or shape to remove all or user-specified text sections. The difficulties of multi-lingual and curved text identification and inpainting are carried over into the development of a general text eraser for real-world settings. Creating a general text eraser for use in real-world scenarios is difficult since it takes on the difficulties of both multi-lingual text detection and curved text inpainting. By adhering to these standards, our method significantly outperforms the state of the art on four benchmark datasets. This includes more complicated algorithms with auxiliary components. Furthermore, we offer a qualitative evaluation of our trained representation, which shows that despite its small size, it is able to gather information from highly localised and discriminative regions, much like an implicit attention mechanism. © 2022 IEEE.

Keywords: Computer vision

Occupancy Detection for HVAC Systems Using IoT Edge Computing and Vision-Based Image Processing

17th IEEE/ACM International Conference on Utility and Cloud Computing, UCC 2024

Authors: Akhtar, Tariq; Khatoon, Shaheen; Mahmood, Azhar. Affiliations: School of Architecture Computing and Engineering, University of East London, London, United Kingdom

ISBN: (print) 9798350367201

Energy efficiency, particularly in Heating, Ventilation, and Air Conditioning (HVAC) systems, is a critical challenge in modern building management due to the increasing energy demands and environmental impacts. This paper focuses on developing optimized object detection models using machine vision for occupancy detection in office environments, aiming to improve HVAC efficiency. The primary objective is to compare three models—YOLOv8n, YOLOv9c, and YOLOv10n—against the Faster R-CNN baseline, emphasizing detection speed, computational efficiency, and small object detection. Data collection involved creating a custom dataset of 1,728 images from office environments, annotated with eight object classes, including persons and office devices. Preprocessing techniques such as grayscale conversion, image resizing, and augmentation improved the model’s ability to detect objects under various conditions, including occlusion and varied camera angles. The models were evaluated based on mAP@50, mAP@50-95, and detection speed. YOLOv9c outperformed Faster R-CNN in speed and accuracy, achieving a mAP@50 of 88.0% and mAP@50-95 of 59.8%, making it the most balanced model. YOLOv8n demonstrated the fastest detection speed, ideal for real-time applications, while YOLOv10n, though less accurate, provided a strong trade-off between speed and precision. Despite these successes, challenges remain, particularly in small object detection and dataset size. Future work includes expanding the dataset to 100,000 images, improving detection of smaller objects, and integrating the object detection models into real-time HVAC control systems. Moreover, deployment on edge devices, transfer learning, and integration with Building Management Systems (BMS) for dynamic HVAC control represent promising areas for future research. © 2024 IEEE.
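
The mAP@50 and mAP@50-95 figures quoted in this abstract rest on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The sketch below shows only that matching criterion, not a full mAP computation and not the authors' evaluation code. The function names are hypothetical, and the threshold list follows the common COCO-style convention of IoU 0.50 to 0.95 in steps of 0.05.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    # Corners of the overlapping rectangle, if there is one.
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """A prediction counts as a hit at this threshold if IoU >= threshold."""
    return iou(pred, truth) >= threshold


# mAP@50 uses the single threshold 0.5; mAP@50-95 averages the
# average precision over these ten thresholds.
IOU_THRESHOLDS_50_95 = [round(0.50 + 0.05 * i, 2) for i in range(10)]
```

A box that passes at 0.5 can still fail at the stricter thresholds, which is why mAP@50-95 is usually lower than mAP@50 for the same model.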

Keywords: Computer vision Energy Efficiency HVAC Occupancy Detection YOLO
