Search Results - Inner Mongolia University Library

Machine vision inspection systems


2020

Authors: edited by Muthukumaran Malarvel, Soumya Ranjan Nayak, Sury Narayan Panda, Prasant Kumar Pattnaik...

Source: Inner Mongolia University Library


Urban traffic monitoring based on deep learning on an embedded GPU


EXPERT SYSTEMS WITH APPLICATIONS, 2025, vol. 273

Authors: Nocua, M. Fredy; Perez-Holguin, Wilson-Javier; Pardo-Beainy, Camilo. Affiliations: Univ Pedag & Tecnol Colombia UPTC Grp GIRA Sogamoso Colombia Tunja Colombia; Univ Santo Tomas Grp GIDINT Tunja Colombia Bogota Colombia; Fdn Univ San Gil Unisangil COMUNIT Yopal Colombia Santander Colombia

This paper presents a deep learning-based system for urban traffic monitoring, focusing on the detection and tracking of motorcycles using embedded hardware, due to the high accident rates of this type of vehicle. Different convolutional neural network (CNN) models were evaluated, including MobileNet-v1-SSD, YOLOv5, and Faster R-CNN, implemented on an NVIDIA graphics processing unit (GPU) board, the Jetson Xavier NX. The MobileNet-v1-SSD model stands out for its balance between precision (90%), recall (66%), and latency (approximately 10 ms), making it ideal for real-time applications. Additionally, a tracking algorithm based on optical flow using the Lucas-Kanade method was developed, complemented with logic for creating and deleting identities (IDs), enabling object tracking in dynamic scenarios with partial occlusions. The system includes a methodology for calculating key traffic variables such as speed and direction by correlating pixels with real-world distances through camera calibration. This approach demonstrates the feasibility of developing complex image-processing applications on resource-constrained platforms by leveraging the features of efficient embedded systems such as general-purpose GPUs.

Keywords: Deep learning; Computer vision; Object detection; Object tracking; Embedded system
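The abstract above mentions deriving speed and direction by correlating pixels with real-world distances through camera calibration. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the calibration is available as a 3x3 ground-plane homography `H`, and the function names, the sample matrix, and the frame interval are all hypothetical.

```python
import numpy as np

def pixels_to_ground(H, pts_px):
    """Map Nx2 pixel coordinates to ground-plane metres with a 3x3 homography H."""
    pts = np.asarray(pts_px, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]             # perspective divide

def speed_and_heading(H, p_prev, p_curr, dt):
    """Return (speed in m/s, heading in degrees) for one tracked point over dt seconds."""
    g_prev, g_curr = pixels_to_ground(H, [p_prev, p_curr])
    d = g_curr - g_prev
    speed = float(np.linalg.norm(d)) / dt
    heading = float(np.degrees(np.arctan2(d[1], d[0])))
    return speed, heading

# Hypothetical calibration: 0.05 m per pixel, no perspective distortion.
H = np.diag([0.05, 0.05, 1.0])
speed, heading = speed_and_heading(H, (100, 200), (110, 200), dt=1 / 30)
```

In practice the two pixel positions would come from the tracker (for example, Lucas-Kanade point tracks) and `H` from a calibration step; both are outside the scope of this sketch.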


Deep Learning-Based Compressed Domain Multimedia for Man and Machine: A Taxonomy and Application to Point Cloud Classification


IEEE ACCESS, 2023, vol. 11, pp. 128979-128997

Authors: Seleem, Abdelrahman; Guarda, Andre F. R.; Rodrigues, Nuno M. M.; Pereira, Fernando. Affiliations: Univ Lisbon, Inst Super Tecn, P-1049001 Lisbon, Portugal; Inst Telecomunicacoes, P-1049001 Lisbon, Portugal; South Valley Univ, Fac Comp & Informat, Qena 83523, Egypt; Politecn Leiria, ESTG, P-2411901 Leiria, Portugal

In the current golden age of multimedia, human visualization is no longer the single main target, with the final consumer often being a machine which performs some processing or computer vision tasks. In both cases, deep learning plays a fundamental role in extracting features from the multimedia representation data, usually producing a compressed representation referred to as latent representation. The increasing development and adoption of deep learning-based solutions in a wide range of multimedia applications have opened an exciting new vision where a common compressed multimedia representation is used for both man and machine. The main benefits of this vision are two-fold: i) improved performance for the computer vision tasks, since the effects of coding artifacts are mitigated; and ii) reduced computational complexity, since prior decoding is not required. This paper proposes the first taxonomy for designing compressed domain computer vision solutions driven by the architecture and weights compatibility with an available spatio-temporal computer vision processor. The potential of the proposed taxonomy is demonstrated for the specific case of point cloud classification by designing novel compressed domain processors using the JPEG Pleno Point Cloud Coding standard under development and adaptations of the PointGrid classifier. Experimental results show that the designed compressed domain point cloud classification solutions can significantly outperform the spatial-temporal domain classification benchmarks when applied to the decompressed data, containing coding artifacts, and even surpass their performance when applied to the original uncompressed data.

Keywords: Task analysis; Encoding; Transform coding; Image coding; Standards; Streaming media; Point cloud compression; Classification algorithms; Representation learning; Computer vision; Deep learning; Taxonomy; Classification coding compressed representation processing computer vision deep learning man and machine consumption point cloud visualization taxonomy


An efficient nano-design of image processor circuits for morphology operations based on quantum dots


AIP ADVANCES, 2024, vol. 14, no. 9

Authors: Yang, Li; Lianjun, Wang; Anbar, Mohammad; Mohammed, Amin Salih. Affiliations: Beijing Jiaotong Univ, Sch Civil Engn, Beijing, Peoples R China; Beijing Jiaotong Univ, Beijing Key Lab Track Engn, Beijing, Peoples R China; Tartous Univ, Commun Technol Engn Dept, Tartus, Syria; Salahaddin Univ Erbil, Coll Engn, Dept Software & Informat Engn, Erbil, Kurdistan Regio, Iraq; Lebanese French Univ, Coll Engn & Comp Sci, Dept Comp Engn, Erbil, Kurdistan Regio, Iraq

Quantum-dot cellular automata (QCA) are one of the most promising alternatives to traditional VLSI technology despite significant current obstacles. QCA has the advantages of very low power dissipation, faster switching speed, and extremely low circuit area, which can be used in designing nano-scale image processing circuits. Morphological processing of digital images is a significant topic for researchers because it is widely used for analyzing, enhancing, and modifying images to extract meaningful information or improve their visual quality. Image processing is also used for image retrieval and enhancement, image compression, object recognition, machine vision, and medical applications. QCA technology, as a new and leading technology with great potential, can play a fundamental role in morphological operations, processing digital images, image editing, medical imaging, facial recognition, and autonomous vehicles. In recent years, researchers in this field have presented many circuits, but they have many flaws in terms of speed, accuracy, and area consumption, and the need for more efficient circuits is felt more than ever. Therefore, in this article, a new design for morphological operations and processing digital images is presented using QCA technology. This paper presents a new efficient QCA-based implementation of image processing based on the direct interactions between the QCA cells. This circuit uses two new five-input majority gates to produce the desired output. In addition, a comparison and analysis of the area and clocking complexity, design cost, and energy dissipation are carried out through simulation using QCADesigner and QCADesigner-E. The results show that the presented circuit produces the expected and correct output in 0.75 clock phases, and that it achieves high speed and low area consumption. In addition, the presented circuit performs better

Keywords: Semiconductor quantum dots
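The abstract above builds morphological operators from five-input majority gates. As a purely logical illustration, and not the circuit proposed in the paper, the sketch below shows how a five-input majority function reduces to a three-input AND or OR when two inputs are tied to constants, and how those gates realise binary erosion and dilation with a three-pixel structuring element. All function names and the zero-padded border are assumptions made for this example.

```python
def maj5(a, b, c, d, e):
    """Five-input majority vote on bits: 1 when at least three inputs are 1."""
    return int(a + b + c + d + e >= 3)

def and3(a, b, c):
    # Two inputs tied to 0: the output is 1 only if all three remaining inputs are 1.
    return maj5(a, b, c, 0, 0)

def or3(a, b, c):
    # Two inputs tied to 1: the output is 1 if any remaining input is 1.
    return maj5(a, b, c, 1, 1)

def morph_1d(bits, gate):
    """Slide a 3-pixel structuring element along a row; the border is zero-padded."""
    padded = [0] + list(bits) + [0]
    return [gate(padded[i], padded[i + 1], padded[i + 2]) for i in range(len(bits))]

row = [0, 1, 1, 1, 0, 0, 1, 0]
eroded = morph_1d(row, and3)   # erosion keeps a pixel only if its whole neighbourhood is set
dilated = morph_1d(row, or3)   # dilation sets a pixel if any neighbour is set
```

Erosion removes the isolated pixel and shrinks the run of ones, while dilation grows both; the same gate-level view extends to 2D structuring elements by cascading gates.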


Landslide prediction with severity analysis using efficient computer vision and soft computing algorithms


Multimedia Tools and Applications, 2024, vol. 83, no. 37, pp. 85079-85101

Authors: Varangaonkar, Payal; Rode, S.V. Affiliations: Sipna College of Engineering and Technology, Amravati, India; Electronics and Telecommunication Department, Sipna College of Engineering & Technology, Amravati, India

Over the past decade, there has been a great deal of interest in forecasting landslides using remote-sensing images. Early detection of possible landslide zones will help to save lives and money. However, this approach presents several obstacles. Computer vision systems must be carefully built since normal image processing does not apply to images obtained by remote sensing (RS). This research proposes a novel landslide prediction method with a severity analysis model based on real-time hyperspectral RS images. The proposed model consists of phases of pre-processing, dynamic segmentation, hybrid feature extraction, landslide prediction, and landslide severity detection. The pre-processing step performs the geometric correction of input RS images to suppress the built-up regions, water, and vegetation using the Normalized Difference Vegetation Index (NDVI). The pre-processing stage encompasses many steps, including atmospheric adjustments, geometric corrections, and the elimination of superfluous regions by denoising techniques such as 2D median filtering. Dynamic segmentation is employed to segment the pre-processed picture for Region of Interest (ROI) localization. The ROI image is utilized to extract manually designed features that accurately depict spatial and temporal variations within the input RS image. For each input RS image, the hybrid feature vector is normalized. We trained ANN and SVM to predict landslides. If the input image predicts a landslide, its severity is identified. For the performance analysis, we collected real-time RS images of the western region of India (Goa and Maharashtra). Simulation results show the efficiency of the proposed model. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Computer vision
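The pre-processing stage in the abstract above relies on the NDVI and on 2D median filtering. The sketch below illustrates both under stated assumptions: it uses the standard definition NDVI = (NIR - Red) / (NIR + Red), a hypothetical vegetation threshold of 0.3, and a 3x3 median window with edge replication. None of these parameters come from the paper.

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index per pixel; eps avoids division by zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

def median3x3(img):
    """2D median filter with a 3x3 window; borders are handled by edge replication."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = np.stack([padded[r:r + rows, c:c + cols]
                        for r in range(3) for c in range(3)])
    return np.median(windows, axis=0)

# Toy bands: one strongly vegetated pixel and one bare pixel.
nir = np.array([[0.8, 0.3]])
red = np.array([[0.2, 0.3]])
index = ndvi(nir, red)
vegetation_mask = index > 0.3   # hypothetical threshold for suppressing vegetation

# A single-pixel spike is removed by the median filter.
noisy = np.zeros((5, 5))
noisy[2, 2] = 1.0
denoised = median3x3(noisy)
```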


COMPUTER VISION ALGORITHM DESIGN IN IMAGE PROCESSING BASED ON PROJECTIVE GEOMETRY



Authors: Kang, Y.G.; Di Zhao. Affiliation: School of Electrical and Information Engineering, Hunan Institute of Engineering, Hunan, Xiangtan 411104, China

Image processing with computer vision, particularly in the realm of projective geometry, offers remarkable potential for various applications. Through the lens of projective geometry, images can be transformed, augmented, and reconstructed with precision, facilitating tasks such as image rectification, 3D reconstruction, and object tracking. Landmark estimation in computer vision is a vital task with broad applications across various domains. This process involves identifying key points or landmarks within images, enabling tasks such as facial recognition, object tracking, and gesture recognition. This paper proposes a novel approach for landmark estimation in computer vision using Projective Geometry Landmark Estimation (PGLM). The proposed model aims to estimate the landmark features with a projective geometry model. From the estimated geometry features, landmarks related to facial, object, and medical images are computed. The PGLM model uses point features to locate the landmark features. To compare PGLM's performance with that of more conventional classification methods such as Random Forest, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), a simulation analysis is carried out. PGLM consistently outperforms these alternatives in accuracy, precision, recall, and F1 score. The findings demonstrate the effectiveness of PGLM as a promising approach for landmark estimation in image processing tasks, paving the way for further advancements in this domain. ©2024: The Royal Institution of Naval Architects.

Keywords: Nearest neighbor search


A Real-Time Edge-Detection CMOS Image Sensor for Machine Vision Applications


IEEE SENSORS JOURNAL, 2023, vol. 23, no. 9, pp. 9254-9261

Authors: Park, Min-Jun; Kim, Hyeon-June. Affiliation: Seoul Natl Univ Sci & Technol, Dept Semicond Engn, Seoul 01811, South Korea

This article presents a real-time edge image extraction CMOS image sensor (CIS) with an edge-detection counter for machine vision applications. An examination of a conventional column-parallel (CP) CIS imaging structure with a single-slope analog-to-digital convertor (SS ADC) reveals an additional time slot, available during the normal imaging operation of two adjacent columns, in which information for an additional image can be extracted. While obtaining a normal image, the prototype CIS with the proposed edge-detection counter uses this spare time to extract an additional column edge image without an image signal processor (ISP) and without any computational latency. In addition, by applying a proposed variable edge thresholding function, the proposed CIS can adopt an optimum edge threshold value according to its imaging condition, alleviating an inherent limitation of a column edge image. This prototype CIS was fabricated using a 0.18-μm 1-poly 6-metal (1P6M) CMOS process with an effective pixel resolution of 320 (H) × 320 (V). The prototype consumes 17.72 mW of power at a frame rate of 240 frames/s. The prototype CIS demonstrated a figure of merit of 721 pW/frame·pixel.

Keywords: CMOS image sensor (CIS); column-parallel (CP) imaging structure; edge-detection counter; on-chip edge image extraction; single-slope analog-to-digital convertor (SS ADC); variable edge thresholding
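The figure of merit quoted in the abstract above can be checked from the other reported numbers. The short calculation below assumes the figure of merit is defined as power divided by the product of frame rate and pixel count, which is consistent with its unit (pW per frame per pixel) but is not spelled out in the abstract.

```python
# Values reported in the abstract.
power_w = 17.72e-3          # 17.72 mW
frame_rate = 240            # frames per second
pixels = 320 * 320          # effective resolution, 320 (H) x 320 (V)

# Assumed definition: energy per frame per pixel, expressed in picowatts.
fom_pw = power_w / (frame_rate * pixels) * 1e12
print(f"{fom_pw:.1f} pW/frame/pixel")
```

Under that definition the result rounds to the 721 pW/frame·pixel stated in the abstract.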


Amber: A 16-nm System-on-Chip With a Coarse-Grained Reconfigurable Array for Flexible Acceleration of Dense Linear Algebra


IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2024, vol. 59, no. 3, pp. 947-959

Authors: Feng, Kathleen; Kong, Taeyoung; Koul, Kalhan; Melchert, Jackson; Carsello, Alex; Liu, Qiaoyi; Nyengele, Gedeon; Strange, Maxwell; Zhang, Keyi; Nayak, Ankita; Setter, Jeff; Thomas, James; Sreedhar, Kavya; Chen, Po-Han; Bhagdikar, Nikhil; Myers, Zach A.; D'Agostino, Brandon; Joshi, Pranil; Richardson, Stephen; Torng, Christopher; Horowitz, Mark; Raina, Priyanka. Affiliations: Stanford Univ, Dept Elect Engn, Stanford, CA 94305, USA; Stanford Univ, Dept Comp Sci, Stanford, CA 94305, USA

Amber is a system-on-chip (SoC) with a coarse-grained reconfigurable array (CGRA) for acceleration of dense linear algebra applications, such as machine learning (ML), image processing, and computer vision. It is designed using an agile accelerator-compiler codesign flow; the compiler updates automatically with hardware changes, enabling continuous application-level evaluation of the hardware-software system. To increase hardware utilization and minimize reconfigurability overhead, Amber features the following: 1) dynamic partial reconfiguration (DPR) of the CGRA for higher resource utilization by allowing fast switching between applications and partitioning resources between simultaneous applications; 2) streaming memory controllers supporting affine access patterns for efficient mapping of dense linear algebra; and 3) low-overhead transcendental and complex arithmetic operations. The physical design of Amber features a unique clock distribution method and timing methodology to efficiently lay out its hierarchical and tile-based design. Amber achieves a peak energy efficiency of 538 INT16 GOPS/W and 483 BFloat16 GFLOPS/W. Compared with a CPU, a GPU, and a field-programmable gate array (FPGA), Amber has up to 3902x, 152x, and 107x better energy-delay product (EDP), respectively.

Keywords: Hardware; Field programmable gate arrays; Switches; Registers; Random access memory; Multiplexing; Linear algebra; Coarse-grained reconfigurable array (CGRA); computer architecture; computer vision; image processing; machine learning (ML); reconfigurable accelerators; system-on-chip (SoC)
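The abstract above mentions streaming memory controllers that support affine access patterns. As a generic illustration of what an affine pattern is, and not a model of Amber's controllers, the sketch below generates addresses of the form offset + sum(index_k * stride_k) over a nested loop. The function name and the example sizes are hypothetical.

```python
from itertools import product

def affine_addresses(offset, extents, strides):
    """Yield addresses offset + sum(i_k * stride_k) for a loop nest with the given extents."""
    for idx in product(*(range(n) for n in extents)):
        yield offset + sum(i * s for i, s in zip(idx, strides))

# Read a 2x3 tile from a row-major matrix that is 8 elements wide.
tile = list(affine_addresses(offset=0, extents=(2, 3), strides=(8, 1)))

# Swapping the strides walks the same tile in column-major order.
transposed = list(affine_addresses(offset=0, extents=(3, 2), strides=(1, 8)))
```

Because the address is a linear function of the loop indices, a controller only needs the offset, extents, and strides to stream a tile, which is why such patterns map dense linear algebra efficiently.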


Machine learning-based augmented vision for detecting driver drowsiness


Multimedia Tools and Applications, 2025, pp. 1-23

Authors: Ramarao, Gude; Soni, Palangthod; Vaishnavi, Agraharam Sri Venkatavarshitha. Affiliation: Department of Electronics and Communication Engineering, G.Pullaiah College of Engineering and Technology, Andhra Pradesh, Kurnool, India

Fatigued drivers often cause traffic accidents. This study introduces a novel method for detecting fatigue that combines machine learning and image processing techniques. We propose a unique approach that utilizes the Haar Cascade method, the CatBoost algorithm, and an Inception v3 for facial detection and eye classification, allowing for the quick identification and management of drowsiness-related issues. The proposed method uses Haar Cascade for facial detection, proving more reliable than CNN-based methods. A Python-based program captures real-time images of the driver's face using OpenCV, while deep learning models built on Keras analyze the facial data. The CNN is trained to distinguish between open and closed eyes, aiding in detecting fatigue. When prolonged eye closure is detected, drivers receive immediate advice to stop or take a break. This effort aims to create a fatigue detection system that is both reliable and robust, capable of swiftly identifying prolonged eye closure. Ultimately, our method has the potential to improve road safety significantly and contribute to global initiatives aimed at addressing this critical issue by providing early warnings to drivers. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

Keywords: Face recognition
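The abstract above triggers a warning when prolonged eye closure is detected. A minimal sketch of that decision step follows, assuming a per-frame open/closed label is already available from the eye classifier. The function name and the 1.5-second limit are hypothetical and do not come from the paper.

```python
def closure_alerts(eye_closed, fps, max_closed_s=1.5):
    """Return the frame indices at which a run of closed-eye frames reaches the limit."""
    limit = max(1, int(max_closed_s * fps))   # frames of continuous closure tolerated
    run = 0
    alerts = []
    for i, closed in enumerate(eye_closed):
        run = run + 1 if closed else 0        # reset the counter whenever the eyes open
        if run == limit:                      # fire once per prolonged closure
            alerts.append(i)
    return alerts

# 10 frames/s: a 2-second closure triggers an alert, a 0.4-second blink does not.
frames = [False] * 5 + [True] * 20 + [False] * 3 + [True] * 4
alerts = closure_alerts(frames, fps=10)
```

Counting consecutive frames rather than single detections keeps ordinary blinks from raising a warning.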


Machine vision system for automatic defect detection of ultrasound probes


INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2024, vol. 135, no. 7-8, pp. 3421-3435

Authors: Profili, Andrea; Magherini, Roberto; Servi, Michaela; Spezia, Fabrizio; Gemmiti, Daniele; Volpe, Yary. Affiliations: Univ Florence, Dept Ind Engn, Via Santa Marta 3, I-50139 Florence, Italy; Esaote SpA, Via Caciolle 15, I-50127 Florence, Italy

Industry 4.0 conceptualizes the automation of processes through the introduction of technologies such as artificial intelligence and advanced robotics, resulting in a significant production improvement. Detecting defects in the production process, predicting mechanical malfunctions in the assembly line, and identifying defects of the final product are just a few examples of applications of these technologies. In this context, this work focuses on the detection of ultrasound probes' surface defects, with a focus on Esaote S.p.A.'s production line probes. To date, this control is performed manually and is therefore biased by many factors such as surface morphology, color, size of the defect, and lighting conditions (which can cause reflections preventing detection). To overcome these shortfalls, this work proposes a fully automatic machine vision system for surface acquisition of ultrasound probes coupled with an automated defect detection system that leverages artificial intelligence. The paper addresses two crucial steps: (i) the development of the acquisition system (i.e., selection of the acquisition device, analysis of the illumination system, and design of the camera handling system); (ii) the analysis of neural network models for defect detection and classification by comparing three possible solutions (i.e., MMSD-Net, ResNet, EfficientNet). The results suggest that the developed system has the potential to be used as a defect detection tool in the production line (a full image acquisition cycle takes approximately 200 s), with the best detection accuracy obtained with the EfficientNet model being 98.63% and a classification accuracy of 81.90%.

Keywords: Product characterization; Inspection system; Image processing; Artificial intelligence; Virtual prototyping
