Search Results - Inner Mongolia University Library

6th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2025

Authors: Alsubaie, Norah A.; Almalki, Ghayda A.; Almutairi, Ghada N.; Alrumaih, Sarah A. (Princess Nourah Bint Abdulrahman University, Department of Computer Sciences, Riyadh, Saudi Arabia)

ISBN: (print) 9798331506520

This research introduces "Jaddah," an innovative AI-based system for the automated detection of road infrastructure defects using advanced computer vision and machine learning techniques. The system addresses the limitations of traditional road inspection methods, which are often slow and prone to human error. Jaddah provides a mobile application that efficiently detects, classifies, and segments road defects at the pixel level. By utilizing a comprehensive dataset of high-resolution images, the model training process is significantly enhanced. The YOLOv8-seg model is implemented to achieve precise defect localization and segmentation, ensuring high accuracy in identifying and categorizing road defects. Performance metrics show an mAP50 of 87%, demonstrating reliable defect detection. These results contribute to improved infrastructure maintenance, enhanced road safety, and greater operational efficiency. © 2025 IEEE.

Keywords: image enhancement
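Note on the reported metric: mAP50 counts a prediction as correct when its intersection over union (IoU) with a ground-truth region is at least 0.5. The following is a minimal sketch of that matching criterion for axis-aligned boxes. The function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, not taken from the paper, and the record does not say whether the 87% figure refers to boxes or masks.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold (0.5 for mAP50)."""
    return box_iou(pred, truth) >= threshold
```

Averaging precision over recall levels and over classes, which turns these matches into an mAP value, is left out of the sketch.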

Integrating artificial intelligence and natural language processing for computer-assisted reporting and report understanding in nuclear cardiology

JOURNAL OF NUCLEAR CARDIOLOGY, 2023, Vol. 30, No. 3, pp. 1180-1190

Authors: Garcia, Ernest V. (Emory Univ, Sch Med, Dept Radiol & Imaging Sci, 101 Woodruff Circle, Room 1203, Atlanta, GA 30322, USA)

Natural language processing (NLP) offers many opportunities in Nuclear Cardiology. These opportunities include applications in converting nuclear cardiology imaging reports to digital searchable information that may be used as Big Data for machine learning and registries. Another major NLP application, with the support of AI, is automatically translating MPI image features directly into nuclear cardiology reports. This review describes the symbiotic relationship between AI and NLP, in that NLP is being used to facilitate AI applications and AI techniques are being used to facilitate NLP. This article reviews the fundamentals of NLP and describes various conventional and AI techniques that have been applied in imaging. Key nuclear cardiology applications are reviewed, such as conversion of MPI free-text reports to digital documents as well as direct conversion of MPI images into structured medical reports.

Keywords: Natural language processing; Natural language understanding; Natural language generation; Structured reports

A Novel Measuring Strategy For Diamond Exposure Of diamond beaded wire On Machine Vision

18th CIRP Conference on Computer Aided Tolerancing, CAT 2024

Authors: Zhang, Zhen; Lin, Jinghua; Cui, Changcai (Institute of Manufacturing Engineering, Huaqiao University, Xiamen, China; National & Local Joint Engineering Research Center for Intelligent Manufacturing Technology of Brittle Materials Products, Huaqiao University, Xiamen, China; College of Metrology Measurement and Instrument, China Jiliang University, Hangzhou, China)

This paper proposes a contour-based method to address the difficulty of measuring the wear state of diamond beaded wire during processing. The edge contour image of the diamond beaded wire was collected through a purpose-built machine vision measurement system. Building on this, a custom convolution kernel was used to extract the beaded corner feature information. Utilizing this corner feature information, the diamond beaded wire abrasive particle exposure height profile was constructed, revealing the wear status of the diamond beaded wire. © 2024 The Authors. Published by Elsevier B.V.

Keywords: machine vision

Redefining Dental Radiology Using Deep Learning

2nd IEEE World Conference on Communication and Computing (WCONF)

Authors: Saha, Kshitij G.; Dheeraj, R. P.; Kulkarni, Pratham; Agarwal, Animesh; Prasad, V. R. Badri; Saha, Mainak Kanti; Saha, Suparna Ganguly (PES Univ, Dept Comp Sci Engn, Bengaluru, India; Modern Dent Coll, Dept Prosthodont, Indore, India; Index Inst Dent Sci, Dept Conservat Dent, Indore, India)

ISBN: (print) 9798350395334; 9798350395327

Recent times have witnessed a rise in the use of image processing, computer vision, and machine learning in the field of medical imaging, offering more accurate diagnoses at a lower labor cost while minimizing the scope for human error. Dental X-ray images are often challenging and time-consuming to study, which makes diagnosis more arduous. Furthermore, only an experienced clinician can endeavor to provide an accurate diagnosis from a two-dimensional X-ray image. Manual investigation of dental diseases and abnormalities is still the most prevalent method in the field of dentistry. This article aims to introduce a novel method to automate the process of obtaining an initial diagnosis from orthopantomogram (OPG) X-rays by using state-of-the-art object detection models, which are currently proving to be effective in medical image diagnosis. By providing an effective comparison between popular object detection frameworks, we aim to determine the computer vision model that provides the most promising results by accurately diagnosing dental abnormalities and identifying treatments from a dental X-ray image in an error-free and efficient manner.

Keywords: X-ray; Dental Radiology; OPG; Computer vision; Object Detection; image processing; medical imaging; teeth; deep learning; YOLO

Autonomous Object Detection and Counting using Edge Detection and Image Processing Algorithms

7th International Conference on Trends in Electronics and Informatics, ICOEI 2023

Authors: Patil, Swati B.; Shimpi, Jay Chandrakant; Tanawade, Archana Girish; Chavan, Pranali Gajanan; Tandulkar, Vrushali Shrimant (Information Technology, Vishawakarma Institute of Information Technology, Pune, India; T and T Infra Ltd, Pune, India)

ISBN: (print) 9798350397284

Machine vision applications are commonly utilised in manufacturing lines as low cost, high precision measuring devices. Output facilities can accomplish high production numbers without mistakes thanks to these solutions that offer contactless control and measurement. A camera may be used to carry out machine vision tasks including product counting, error checking, and dimension measuring. This study makes a recommendation for a vision system application that can do inanimate object item enumeration. The recommended solution uses Otsu thresholding, Hough transformations, edge detection methods, and other image processing algorithms to accomplish automatic counting without taking into account the kind or colour of the product. The system primarily uses one camera. The general idea is to get an image with balanced contrast, brightness and appropriate HSV values in it. A picture of the items is captured using the camera of an Android device, and different image processing techniques are then applied to the picture. Further, a real-time machine vision programme was deployed and run on photos taken from an actual experimental setup. The practical experiments conducted have shown that the suggested technique yields quick, precise, and trustworthy results based on the comparative study of various detection techniques. © 2023 IEEE.

Keywords: Cameras
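Note on the technique: the abstract names Otsu thresholding as one step of the counting pipeline. The following is a minimal sketch of how such a step could feed an object count, assuming an 8-bit grayscale image stored as a list of rows with bright objects on a dark background. It swaps in plain 4-connected component labeling for the counting stage and omits the Hough transform and edge detection steps the authors mention; the function names are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t are treated as background.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # no background pixels yet
        if w1 == 0:
            break  # no foreground pixels left
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def count_objects(image):
    """Count 4-connected bright regions after Otsu thresholding."""
    t = otsu_threshold([p for row in image for p in row])
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if image[y][x] <= t or seen[y][x]:
                continue
            count += 1  # new region found; flood-fill it
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and image[ny][nx] > t and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count
```

Because the threshold is derived from the image histogram rather than from a fixed colour, the count does not depend on the kind or colour of the product, which matches the property the abstract claims.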

Image Monitoring System Based on Deep Neural Network

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Weng, Junhong; Zeng, Lingfeng; Song, Yingjie (Shenzhen Power Supply Bureau CO. LTD, China Southern Power Grid, Guangdong, Shenzhen, China)

ISBN: (print) 9798350331417

For safety and security reasons, the indoor/outdoor working environments of various industries require the use of many cameras for automated surveillance. In such a context, a major challenge for automated monitoring systems is achieving high-precision real-time performance for image classification and object detection. In this paper, we present a novel image surveillance system based on a combined approach derived from YOLO v5. The system first detects moving targets using background subtraction. Then, we propose a modified YOLO v5 algorithm for accurately detecting and categorizing different objects in images captured in a video stream. The system runs in real time and can analyze multiple video streams simultaneously. The results of the experiments show that this system has good performance and could be widely applied in several areas, such as security, surveillance, and traffic management. © 2023 IEEE.

Keywords: deep neural network; monitoring system; YOLOv5
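Note on the technique: the abstract describes a first stage that finds moving targets by background subtraction before a detector runs. The following is a minimal sketch of that first stage only, assuming grayscale frames stored as lists of rows. The running-average background model, the threshold value, and all function names are assumptions made for illustration; the record does not describe the authors' actual model, and the modified YOLO v5 stage is not shown.

```python
def update_background(background, frame, alpha=0.1):
    """Blend the new frame into a running-average background model."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(brow, frow)]
        for brow, frow in zip(background, frame)
    ]


def moving_region(mask):
    """Bounding box (x1, y1, x2, y2) of all foreground pixels, or None."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    xs = [x for row in mask for x, is_fg in enumerate(row) if is_fg]
    return (min(xs), min(ys), max(xs), max(ys))
```

In a pipeline of this shape, only the region returned by `moving_region` would need to be passed to the object detector, which is one plausible way to keep per-frame cost low across several video streams.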

Enhancing out-of-distribution learning in computer vision through dominant feature masking

PATTERN ANALYSIS AND APPLICATIONS, 2025, Vol. 28, No. 2, pp. 1-30

Authors: Pilzak, Artem; Thivierge, Jean-Philippe (Univ Ottawa, Sch Psychol, Ottawa, ON K1N 6N5, Canada; Univ Ottawa, Brain & Mind Res Inst, Ottawa, ON K1N 6N5, Canada)

Out-of-distribution (OOD) learning presents a major challenge in machine learning as models must effectively generalize to previously unseen data. This challenge is prevalent in deep learning models, which tend to focus on the most dominant features in images. This narrow focus impedes OOD learning, where critical features are concealed or absent during testing, leading to reduced prediction accuracy. To address this issue, we introduce a novel data augmentation approach termed Dominant Feature Masking (DFM), inspired by human visual holistic processing. DFM strategically conceals and reveals the most prominent features within images, allowing neural networks to simultaneously capture both dominant and non-dominant attributes, thereby enhancing adaptability to OOD data. We evaluated DFM using a novel set of learning challenges termed Versatile Evaluation Benchmark (VEB), which assesses model performance on three distinct tasks: (i) augmented MNIST images to test resilience against diverse transformations; (ii) a novel dataset of unseen image classes to examine performance on new instances within familiar categories; and (iii) a dataset created by DALL-E to challenge class differentiation with artificially mixed features. Our results demonstrate that DFM significantly improves OOD generalization compared to traditional augmentation techniques, achieving marked enhancements across various conditions without compromising in-distribution testing accuracy. These findings underscore the potential of DFM to improve the performance of computer vision systems in various real-world scenarios, making them more robust and adaptable to unexpected data variations. By leveraging VEB, researchers will gain a deeper understanding of their models' generalization performance, ensuring that CNNs are well-equipped to handle the complexities of real-world applications. The source code and VEB datasets are available at https://***/Deepvisionary/DFM.

Keywords: Out-of-distribution learning; Convolutional neural networks; Data augmentation; Feature acquisition; Domain generalization

A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, Vol. 35, No. 4, pp. 3313-3332

Authors: Gui, Jie; Sun, Zhenan; Wen, Yonggang; Tao, Dacheng; Ye, Jieping (Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211100, Jiangsu, Peoples R China; Purple Mt Labs, Nanjing 210000, Peoples R China; Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI, USA; Chinese Acad Sci, Ctr Res Intelligent Percept & Comp, Beijing 100190, Peoples R China; Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 639798, Singapore; JD Explore Acad, Beijing, Peoples R China; Univ Sydney, Sch Comp Sci, Camperdown, Australia; Beike, Beijing 100085, Peoples R China; Univ Michigan, Ann Arbor, MI 48109, USA)

Generative adversarial networks (GANs) have recently become a hot research topic; however, they have been studied since 2014, and a large number of algorithms have been proposed. Nevertheless, few comprehensive studies explain the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of the various GAN methods from the perspectives of algorithms, theory, and applications. First, the motivations, mathematical representations, and structures of most GAN algorithms are introduced in detail, and we compare their commonalities and differences. Second, theoretical issues related to GANs are investigated. Finally, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are discussed.

Keywords: Generators; Generative adversarial networks; Data models; Linear programming; Natural language processing; Machine learning algorithms; Inference algorithms; Deep learning; generative adversarial networks; algorithm; theory; applications

Bio-inspired smart vision sensor: toward a reconfigurable hardware modeling of the hierarchical processing in the brain

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2021, Vol. 18, No. 1, pp. 157-174

Authors: Bhowmik, Pankaj; Pantho, Md Jubaer Hossain; Bobda, Christophe (Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611, USA)

Biological vision systems inspire processing methods in computer vision applications. This paper employs the insights of vision systems in hardware and presents a pixel-parallel, reconfigurable, and layer-based hierarchical architecture for smart image sensors. The architecture aims to bring computation close to the sensor to achieve high acceleration for different machine vision applications while consuming low power. We logically divide the image into multiple regions and perform pixel-level and region-level processing after removing spatiotemporal redundancy. Those processors use bio-inspired algorithms to activate the regions with region of interest of a scene. The hierarchical processing breaks the traditional sequential image processing and introduces parallelism for machine vision applications. Also, we make the hardware design reconfigurable even after fabrication to make the hardware reusable for different applications. Simulation results show that the area overhead and power penalty for adding reconfigurable features stay in an acceptable range. We emphasize to maximize the operating speed and obtain 800 MHz. Besides, the design saves 84.01% and 96.91% dynamic power at the first and second stages of the hierarchy by removing redundant information. Furthermore, the sequential deployment of high-level reasoning only on the selected regions of the image becomes computationally inexpensive to execute a complex task in real time.

Keywords: Biological vision; Pixel-level processing; Reconfigurability; Predictive coding; Attention module; Smart image sensor; FPGA; ASIC

Deep Learning-Based Hand Gesture Recognition System and Design of a Human-Machine Interface

NEURAL PROCESSING LETTERS, 2023, Vol. 55, No. 9, pp. 12569-12596

Authors: Sen, Abir; Mishra, Tapas Kumar; Dash, Ratnakar (Natl Inst Technol, Dept Comp Sci & Engn, Rourkela 769008, India)

Hand gesture recognition plays an important role in developing effective human-machine interfaces (HMIs) that enable direct communication between humans and machines. But in real-time scenarios, it is difficult to identify the correct hand gesture to control an application while moving the hands. To address this issue, this work presents a low-cost human-computer interface (HCI) based on a hand gesture recognition system for real-time scenarios. The system consists of six stages: (1) hand detection, (2) gesture segmentation, (3) feature extraction and gesture classification using five pre-trained convolutional neural network (CNN) models and a vision transformer (ViT), (4) building an interactive human-machine interface (HMI), (5) development of a gesture-controlled virtual mouse, (6) smoothing of the virtual mouse pointer using a Kalman filter. In our work, five pre-trained CNN models (VGG16, VGG19, ResNet50, ResNet101, and Inception-v1) and ViT have been employed to classify hand gesture images. Two multi-class datasets (one public and one custom) have been used to validate the models. Considering the models' performance, Inception-v1 shows significantly better classification performance than the other four CNN models and ViT in terms of accuracy, precision, recall, and F-score values. We have also expanded this system to control some multimedia applications (such as VLC player, audio player, playing 2D Super-Mario-Bros game, etc.) with different customized gesture commands in real-time scenarios. The average speed of this system has reached 25 fps (frames per second), which meets the requirements for the real-time scenario. The proposed gesture control system achieves an average response time on the order of milliseconds for each control, which makes it suitable for real-time use. This model (prototype) will benefit physically disabled people interacting with desktops.

Keywords: Deep learning; Hand gesture recognition; Segmentation; Vision transformer; Kalman filter; Human machine interface; Transfer learning; Virtual mouse
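Note on the technique: stage (6) of the abstract smooths the virtual mouse pointer with a Kalman filter. The following is a minimal sketch of that idea, assuming a constant-position model with one independent scalar filter per screen axis. The class name, the noise variances, and the choice of model are assumptions made for illustration; the record does not give the authors' state model or parameters.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a constant-position model."""

    def __init__(self, process_var=1e-2, measurement_var=1.0):
        self.q = process_var      # how much the true position may drift per step
        self.r = measurement_var  # how noisy each detected position is
        self.x = None             # current estimate
        self.p = 1.0              # variance of the current estimate

    def update(self, z):
        if self.x is None:
            self.x = float(z)  # initialise from the first measurement
            return self.x
        self.p += self.q                  # predict: uncertainty grows
        k = self.p / (self.p + self.r)    # Kalman gain, between 0 and 1
        self.x += k * (z - self.x)        # correct toward the measurement
        self.p *= 1 - k                   # uncertainty shrinks after correction
        return self.x


def smooth_pointer(points, **kwargs):
    """Filter a sequence of (x, y) pointer positions, one filter per axis."""
    fx, fy = ScalarKalman(**kwargs), ScalarKalman(**kwargs)
    return [(fx.update(x), fy.update(y)) for x, y in points]
```

A larger `measurement_var` relative to `process_var` gives a steadier but slower pointer, so in practice the two values would be tuned against the jitter of the hand detector.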
