The Sobel edge detector is an algorithm commonly used in image processing and computer vision to extract edges from input images by taking derivatives of image pixels in the x and y directions against surrounding pixels. Most artificial intelligence and machine learning applications require image processing algorithms running in real time on hardware systems such as field-programmable gate arrays (FPGAs). They typically require high throughput to match real-time speeds, and since they run alongside other processing algorithms, they must also be area efficient. This article proposes a high-speed, low-area implementation of the Sobel edge detection algorithm. We created the design using a novel high-level synthesis (HLS) design method based on application-specific bit widths for intermediate data nodes. Register transfer level code was generated for HLS using the MATLAB hardware description language (HDL) Coder. The generated HDL code was implemented on a Xilinx Kintex-7 FPGA using Xilinx Vivado software. Our implementation results are superior, in terms of area and speed, to those obtained for similar implementations using the vendor library block sets, as well as to results recently reported by other researchers for similar implementations. We tested our algorithm on the Kintex-7 using real-time input video with a frame resolution of 1920 x 1080. We also verified the functional simulation results against a golden MATLAB implementation using the FPGA-in-the-loop feature of HDL Verifier. In addition, we propose a generic area, speed, and power improvement methodology for different HLS tools and application designs.
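As background to the algorithm this abstract describes, the Sobel operator can be sketched in a few lines of NumPy. This is a generic software illustration (cross-correlation with the two 3x3 Sobel kernels over the valid image region), not the hardware HLS/FPGA implementation the paper proposes:

```python
import numpy as np

# 3x3 Sobel kernels approximating horizontal (KX) and vertical (KY) derivatives.
# Applied via cross-correlation; kernel sign flips do not affect the magnitude.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_magnitude(img):
    """Gradient magnitude over the valid (border-free) region of a 2D image."""
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += KX[i, j] * patch
            gy += KY[i, j] * patch
    return np.hypot(gx, gy)
```

A vertical step edge produces a strong response along the edge column and zero response in flat regions, which is the behaviour an edge-map thresholding stage relies on.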
Medical Image Captioning (MIC) is a developing area of artificial intelligence that combines two main research areas: computer vision and natural language processing. To support clinical workflows and decision-making, MIC is used in a variety of applications pertaining to diagnosis, therapy, report production, and computer-aided diagnosis. Generating long, coherent reports that highlight the correct abnormalities is a challenging task. Therefore, this paper presents an efficient FDT - Dr(2)T framework for the generation of coherent radiology reports with efficient exploitation of medical content. In the first stage, the proposed framework fuses texture features and deep features by incorporating an ISCM-LBP + PCA-HOG feature extraction algorithm and a Convolutional Triple Attention-based Efficient XceptionNet (C-TaXNet). The fused features from the FDT module are then utilized by the Dense Radiology Report Generation Transformer (Dr(2)T) model, whose modified multi-head attention generates dense radiology reports highlighting specific crucial abnormalities. To evaluate the performance of the proposed FDT - Dr(2)T, extensive experiments are conducted on the publicly available IU Chest X-ray dataset, and the best observed performance is 0.531 BLEU@1, 0.398 BLEU@2, 0.322 BLEU@3, 0.251 BLEU@4, 0.384 CIDEr, 0.506 ROUGE-L, and 0.277 METEOR. An ablation study is carried out to support the experiments. Overall, the results demonstrate the efficiency and efficacy of the proposed framework.
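The ISCM-LBP variant named above is specific to this paper, but the plain local binary pattern (LBP) texture descriptor it builds on can be sketched as follows. The function names, the 3x3 neighbourhood, and the clockwise bit ordering are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def lbp_basic(img):
    """Basic 3x3 LBP: each interior pixel gets an 8-bit code built from
    sign comparisons with its neighbours (clockwise from the top-left)."""
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    center = img[1:-1, 1:-1]
    code = np.zeros_like(center, dtype=np.uint8)
    for bit, (di, dj) in enumerate(offs):
        nb = img[1 + di:h - 1 + di, 1 + dj:w - 1 + dj]
        code |= (nb >= center).astype(np.uint8) << bit
    return code

def lbp_histogram(img, bins=256):
    """Normalized histogram of LBP codes, usable as a texture feature vector."""
    h, _ = np.histogram(lbp_basic(img), bins=bins, range=(0, bins))
    return h / h.sum()
```

The normalized code histogram is what typically gets concatenated with other descriptors (e.g. HOG) in fusion pipelines of the kind the abstract describes.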
To meet the needs of teaching and practical applications in machine vision technology, a virtual reality-based machine vision experimental platform has been designed and developed. Unity3D was utilized as the development engine, and image processing technology was integrated to achieve the construction of virtual production line scenes, simulation of vision component parameter adjustments, and image acquisition. The platform features a graphical programming interface for visualizing image processing algorithms, which can be used to perform visual debugging of vision stations with a virtual robot system driven by a software PLC. This machine vision experimental platform ensures consistency between simulation and actual engineering processes, and enables students to explore different vision schemes on an industrial production line, thereby avoiding constraints on location, time, and equipment in related experiments.
ISBN: 9798350368833 (digital); 9798350368840 (print)
Utility-scale solar arrays require specialized inspection methods for detecting faulty panels. Photovoltaic (PV) panel faults caused by weather, ground leakage, circuit issues, temperature, environment, age, and other damage can take many forms but often symptomatically exhibit temperature differences. A mini survey is included to review these common faults and PV array fault detection approaches. Among these, infrared thermography cameras are a powerful tool for improving solar panel inspection in the field. They can be combined with other technologies, including image processing and machine learning. This position paper examines several computer vision algorithms that automate thermal anomaly detection in infrared imagery. We demonstrate our infrared thermography data collection approach, the PV thermal imagery benchmark dataset, and the measured performance of image processing transformations, including the Hough transform for PV segmentation. The results of this implementation are presented with a discussion of future work.
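As an illustration of the Hough transform mentioned for PV panel segmentation, a minimal line-detecting accumulator can be written in NumPy. The discretization choices below (1-degree theta bins, 1-pixel rho bins) are assumptions for the sketch, not the paper's implementation:

```python
import numpy as np

def hough_lines(edge_img, n_theta=180):
    """Standard Hough transform: each edge pixel votes for all (rho, theta)
    pairs of lines passing through it. Peaks in the accumulator correspond
    to straight lines such as PV panel borders in a binarized edge map."""
    h, w = edge_img.shape
    diag = int(np.ceil(np.hypot(h, w)))          # max possible |rho|
    thetas = np.deg2rad(np.arange(n_theta))      # 0..179 degrees
    rhos = np.arange(-diag, diag + 1)            # signed distance bins
    acc = np.zeros((len(rhos), n_theta), dtype=int)
    ys, xs = np.nonzero(edge_img)
    for x, y in zip(xs, ys):
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + diag, np.arange(n_theta)] += 1
    return acc, rhos, thetas
```

A perfectly vertical edge at column x = c yields an accumulator peak at theta = 0 with rho = c, which is how panel grid lines can be recovered from thermal edge maps.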
Automatic Visual Captioning (AVC) generates syntactically and semantically correct sentences by describing important objects, attributes, and their relationships with each other. It is classified into two categories: image captioning and video captioning. It is widely used in applications such as assistance for the visually impaired, human-robot interaction, video surveillance systems, and scene understanding. With the unprecedented success of deep learning in computer vision and natural language processing, the past few years have seen a surge of research in this domain. In this survey, state-of-the-art methods are classified based on how they conceptualize the captioning problem: traditional approaches that cast visual description as either retrieval or template-based description, and deep learning approaches. A detailed review of existing methods is presented, highlighting their pros and cons, societal impact (measured by citation counts), architectures used, datasets experimented on, and GitHub links. Moreover, the survey provides an overview of the benchmark image and video datasets and the evaluation measures that have been developed to assess the quality of machine-generated captions. It is observed that dense or paragraph caption generation and Change Image Captioning (CIC) are stimulating the research community more, owing to their near-human abstraction ability. Finally, the paper explores future directions in the area of automatic visual caption generation.
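For reference, the BLEU@1 metric reported by captioning work like this reduces, for a single candidate/reference pair, to clipped unigram precision times a brevity penalty. This sketch covers only that sentence-level, single-reference case and omits the corpus-level aggregation and smoothing used in practice:

```python
import math
from collections import Counter

def bleu1(candidate, reference):
    """BLEU@1 for one candidate and one reference (lists of tokens):
    clipped unigram precision times the brevity penalty."""
    cand_counts, ref_counts = Counter(candidate), Counter(reference)
    # each candidate unigram is credited at most as often as it appears
    # in the reference ("clipping"), so repetition is not rewarded
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = clipped / max(len(candidate), 1)
    # brevity penalty punishes candidates shorter than the reference
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * precision
```

Higher-order BLEU@n scores repeat the same computation over n-gram counts and combine the precisions geometrically.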
ISBN (print): 9781510650817; 9781510650800
Some applications require a high level of image-based classification certainty while keeping the total illumination energy as low as possible. Examples include minimally invasive visual inspection in Industry 4.0 and medical imaging systems such as computed tomography, in which the radiation dose should be kept "as low as is reasonably achievable". We introduce a sequential object recognition scheme aimed at minimizing phototoxicity or bleaching while achieving a predefined level of decision accuracy. The novel online procedure relies on approximate weighted Bhattacharyya coefficients to determine future inputs. Simulation results on the MNIST handwritten digit database show how the total illumination energy is decreased relative to a detection scheme using constant illumination.
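The plain (unweighted) Bhattacharyya coefficient underlying the procedure above measures the overlap between two discrete distributions: BC(p, q) = sum_i sqrt(p_i * q_i), which is 1 for identical distributions and 0 for disjoint support. A minimal sketch follows; the paper's approximate weighted variant is not reproduced here:

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Overlap of two discrete distributions; inputs are normalized first,
    so unnormalized histograms are also accepted."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(np.sqrt(p * q)))
```

In sequential classification, a low coefficient between per-class likelihoods indicates the classes are already well separated, so further (costly) measurements add little.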
ISBN: 9781538683477 (digital and print)
A General Purpose Vision System (GPVS) is a task-agnostic vision-language system that takes an image and a question as input, recognizes the tasks to be performed, and outputs bounding boxes, confidence scores, and text to answer the question. While GPVS has recently received much attention in the computer vision field, its medical applications are still in their infancy. This paper presents MED-GPVS, a customized deep learning-based GPVS for biomedical images that performs various vision tasks, such as object detection and visual question answering, on medical images to facilitate precision medicine and e-health services. Our envisioned MED-GPVS takes an image and a natural language text as inputs, then outputs bounding boxes and confidence scores and generates a caption (i.e., the answer to the posed query). For example, if a medical image of a patient's abdomen is presented to MED-GPVS followed by the question "does the picture contain the stomach?", MED-GPVS should ideally provide the answer "yes" along with a prediction box and prediction score on the image. We utilize the multilingual SLAKE dataset, annotated with full semantic labels by expert physicians, to validate the performance of MED-GPVS under various scenarios involving different biomedical image-based diagnoses. For the visual question answering (VQA) task, MED-GPVS demonstrates encouraging performance with a high accuracy of 82.41%.
Stereo vision is a key technology for 3D scene reconstruction from image pairs. Most approaches process perspective images from commodity cameras. These images, however, have a very limited field of view and picture only a small portion of the scene. In contrast, omnidirectional images, also known as fisheye images, exhibit a much larger field of view and allow a full 3D scene reconstruction with a small number of cameras if placed carefully. However, omnidirectional images are strongly distorted, which makes 3D reconstruction much more sophisticated. Nowadays, a lot of research is conducted on CNNs for omnidirectional stereo vision. Nevertheless, a significant gap between estimation accuracy and throughput can be observed in the literature. This work aims to bridge this gap by introducing a novel set of transformations, namely OmniGlasses. These are incorporated into the architecture of a fast network, AnyNet, originally designed for scene reconstruction on perspective images. Our network, Omni-AnyNet, produces accurate omnidirectional distance maps with a mean absolute error of around 13 cm at 48.4 fps and is therefore real-time capable.
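As background, classical perspective stereo recovers depth from disparity via the pinhole relation Z = f * B / d (focal length in pixels, baseline in metres, disparity in pixels). The sketch below illustrates only that baseline relation, not the omnidirectional OmniGlasses transformations this work introduces for distorted fisheye geometry:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map to metric depth for rectified perspective
    stereo: Z = f * B / d. Non-positive disparities map to infinity
    (no correspondence / point at infinity)."""
    d = np.asarray(disparity, dtype=float)
    return np.where(d > 0, focal_px * baseline_m / np.maximum(d, 1e-9), np.inf)
```

The inverse relationship means depth accuracy degrades quadratically with distance, which is one reason wide-field omnidirectional setups with careful camera placement are attractive for full-scene reconstruction.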
The significance of high-speed machine vision in scientific and technological fields is growing, especially in the era of Industry 4.0 technologies. Several pattern-matching algorithms have intriguing applications in ultralow-latency machine vision processing. However, the low frame rate of image sensors, which usually operate at tens of hertz, fundamentally limits the processing rate. This paper conceptualizes and develops a computerized pattern recognition technique that can be applied to investigate light beam profiles and extract the desired information according to the purpose required in this case study. In the current work, automatic detection and inspection of laser spots was designed to perform analysis and alignment of the laser beam in comparison with the electron beam spot using the LabVIEW graphical programming environment, especially when the laser and electron beams overlap. This is one of the important steps toward the fundamental aim of the test-FEL: producing short wavelengths with the second, third, and fifth harmonics at 131.5, 88, and 53 nm, respectively. The tentative version of the program achieved this elementary purpose, fulfilling accurate transversal alignment of the ultrashort laser pulses with the electron beam in the FEL test facility at MAX-lab, in addition to studying the beam's stability and jitter range.
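A common first step in laser-spot detection of the kind described is an intensity-weighted centroid over thresholded pixels, which gives the spot position for alignment and jitter analysis. This NumPy sketch is a generic illustration under that assumption, not the paper's LabVIEW program:

```python
import numpy as np

def spot_centroid(img, thresh_frac=0.5):
    """Intensity-weighted centroid (row, col) of pixels at or above a
    fraction of the peak intensity. Thresholding suppresses background
    so the centroid tracks the bright spot, not sensor noise."""
    img = np.asarray(img, dtype=float)
    mask = img >= thresh_frac * img.max()
    weights = img * mask
    total = weights.sum()
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return (float((ys * weights).sum() / total),
            float((xs * weights).sum() / total))
```

Tracking this centroid frame by frame for both the laser and electron beam spots yields the relative offset used for transversal alignment and a time series for jitter statistics.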
The paper presents a novel computational and image processing algorithm for automatic measurement of the optic nerve diameter (OND) from B-scan ultrasound images acquired in a traumatic cohort. The OND is an important dia...