Search results - Inner Mongolia University Library

Multi-modal active learning with deep reinforcement learning for target feature extraction in multi-media image processing applications

MULTIMEDIA TOOLS AND APPLICATIONS, 2023, Vol. 82, No. 4, pp. 5343-5367

Authors: Dhiman, Gaurav Kumar, A. Vignesh Nirmalan, R. Sujitha, S. Srihari, K. Yuvaraj, N. Arulprakash, P. Raja, R. Arshath Govt Bikram Coll Commerce Dept Comp Sci Patiala Punjab India Chandigarh Univ Univ Ctr Res & Dev Dept Comp Sci & Engn Mohali Punjab India Graph Era Deemed Univ Dept Comp Sci & Engn Dehra Dun Uttarakhand India Jai Shriram Engn Coll Dept Comp Sci & Engn Tiruppur Tamil Nadu India Kalasalingam Acad Res & Educ Dept Comp Sci & Engn Krishnankoil Tamil Nadu India Sri Vidya Coll Engn & Technol Dept Comp Sci & Engn Virudunagar Tamil Nadu India SNS Coll Technol Dept Comp Sci & Engn Coimbatore Tamil Nadu India ICT Acad Training & Res Chennai Tamil Nadu India Dept Comp Sci & Engn Rathinam Tech Campus Coimbatore 641021 Tamil Nadu India ICT Acad Res & Publicat IIT Madras Res Pk Chennai Tamil Nadu India

The advancement in on demand Multimedia Streaming Applications (MAS) enables faster video transmission as per the user request in various fields. This system suffers from poor speed, flexibility and efficiency in accessing and presenting the multimedia contents from the archive. It mostly undergoes delay, packet loss and congestion during data delivery. Hence, the requirement of manual annotation is required for access and retrieval but it suffers from poor retrieval accuracy over large databases. The need of automatic annotation in MAS takes the lead for increased retrieval accuracy on most similar image retrieval systems based on various low-level features. Thus, it eliminates the gap between the high-level semantics and low-level feature representation. The approach on automated annotation of images is considered dependent on the accuracy of a model while detecting edges, color, texture, shape and spatial information. In this paper, we develop an automated annotation model that retrieves visually similar images from online multimedia streams with optimal feature extraction. The automated annotation model is designed with a Multi-modal Active learning (MAL) that uses Convolutional Recurrent Neural Network (CRNN) for automatic annotation of labels based on visually similar contents or features like edges, color, texture, shape and spatial information. Further, a deep Reinforcement learning (DRL) algorithm is used that increases the performance of the retrieval engine based on validating the visually extracted features. The simulation of MAL-CNN is conducted over large online streaming databases and it is then validated by DRL on an online real-time streaming. The performance is validated in terms of its retrieval accuracy, sensitivity, specificity, f-measure, geometric mean and mean absolute percentage error (MAPE). The results confirm the accuracy of the proposed MAL-DRL model against conventional machine learning, reinforcement learning and deep learning automati

Keywords: Multimodal active learning; Convolutional neural network; Deep reinforcement learning; Feature extraction; Multimedia Streaming Systems
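
The evaluation metrics named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes a binary relevant/not-relevant retrieval setting, takes the geometric mean to be the square root of sensitivity times specificity (the abstract does not define it), and uses the hypothetical helper names `binary_metrics` and `mape`.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute retrieval metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Geometric mean of the two class-wise rates (assumed definition).
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

For example, `binary_metrics(tp=80, fp=10, tn=90, fn=20)` gives a sensitivity of 0.8 and a specificity of 0.9; the counts are made up for illustration and are not results from the paper.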

Characterization of the in-focus droplets in shadowgraphy systems via deep learning-based image processing method

PHYSICS OF FLUIDS, 2022, Vol. 34, No. 11, pp. 1-15

Authors: Wang, Zhibo He, Feng Zhang, Haixiang Hao, Pengfei Zhang, Xiwen Li, Xiangru Tsinghua Univ Dept Engn Mech Appl Mech Lab Beijing 100084 Peoples R China Tsinghua Univ Sch Mat Sci & Engn AV Aerodynam Res Inst Joint Res Ctr Adv Mat & Antiicing Beijing 100084 Peoples R China

It is important to accurately identify and measure in-focus droplets from shadowgraph droplet images that typically contain a large number of defocused droplets for the research of multiphase flow. However, conventional in-focus droplet identification methods are time-consuming and laborious due to the noise and background illumination in experimental data. In this paper, a deep learning-based method called focus-droplet generative adversarial network (FocGAN) is developed to automatically detect and characterize the focused droplets in shadow images. A generative adversarial network framework is adopted by our model to output binarized images containing only in-focus droplets, and inception blocks are used in the generator to enhance the extraction of multi-scale features. To emulate the real shadow images, an algorithm based on the Gauss blur method is developed to generate paired datasets to train the networks. The detailed architecture and performance of the model were investigated and evaluated by both the synthetic data and spray experimental data. The results show that the present learning-based method is far superior to the traditional adaptive threshold method in terms of effective extraction rate and accuracy. The comprehensive performance of FocGAN, including detection accuracy and robustness to noise, is higher than that of the model based on a convolutional neural network. Moreover, the identification results of spray images with different droplet number densities clearly exhibit the feasibility of FocGAN in real experiments. This work indicates that the proposed learning-based approach is promising to be widely applied as an efficient and universal tool for processing particle shadowgraph images.

Keywords: Generative adversarial networks

Empirical Assessment of Identifying Human Blood Group Based on Image Processing Assisted Deep Learning Principles

2024 International Conference on Innovative Computing, Intelligent Communication and Smart Electrical Systems, ICSES 2024

Authors: Ramkumar, G. Rao, N. Janardhana Nanammal, V. Shaphiya, S.K. Giri, J. Samhan, Ahmad Abdelhafiz Ali Dept of Ece Chennai India Qis College of Engineering and Technology Department of Mba Andhra Pradesh Ongole India Jeppiaar Engineering College Department of Electronics and Communication Engineering Chennai India Qis College of Engineering and Technology Department of S&h Andhra Pradesh Ongole India Yeshwantrao Chavan College of Engineering Department of Mechanical Engineering Nagpur India Lovely Professional University Division of Research and Development Phagwara India Zarqa University Faculty of Information Technology Department of Software Engineering Zarqa Jordan University of Business and Technology Jeddah 21448 Saudi Arabia

ISBN (print): 9798331543617

This study presents an empirical assessment of identifying human blood groups using image processing assisted by deep learning principles, specifically employing a cascaded Convolutional Neural Network (CNN) and Light Gradient Boosting Machine (LightGBM). Traditional methods of blood group identification are time-consuming and prone to errors, prompting the need for automated, more efficient systems. In this work, a CNN model was initially used to extract deep features from blood sample images, followed by a LightGBM classifier for final blood group classification. A comprehensive dataset, representing all major blood groups (A, B, AB, O, and Rh types), was collected and preprocessed for training and testing. The cascaded CNN-LightGBM approach achieved an accuracy of 92.4%, significantly outperforming baseline models, including standalone CNN (88.2%) and Random Forest (83.5%). The model also demonstrated high precision (92.1%), recall (92.0%), and F1-score (92.1%). Real-world testing was conducted to validate its robustness in clinical settings. The results indicate that the proposed model is well-suited for accurate and efficient blood group identification, with low inference time, making it ideal for real-time applications. This approach has the potential to transform blood diagnostics by providing an automated, scalable, and accurate solution. © 2024 IEEE.

Keywords: Convolutional neural networks

Lane detection using the concept of Deep learning and Digital image processing

2024 International Conference on Electrical, Electronics and Computing Technologies, ICEECT 2024

Authors: Ojha, Shashi Kant Kumar, Abhishek Kumar, Sandeep Sharda University CSE Department Greater Noida India

ISBN (print): 9798350378092

Lane detection technology plays a pivotal role in enabling autonomous navigation in vehicles. However, existing systems primarily cater to well-structured roads with clear lane markings, rendering them ineffective in scenarios where markings are unclear or absent. This study critically evaluates an existing approach for detecting lanes on unmarked roads, followed by the proposal of an enhanced methodology. Both approaches leverage digital image processing techniques and rely solely on vision or camera data. The primary objective is to derive real-time curvature values to facilitate driver-assistance systems in making necessary turns and preventing vehicles from veering off-road. © 2024 IEEE.

Keywords: Off road vehicles
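
The abstract above mentions deriving real-time curvature values but does not state the lane model. As a hedged sketch, one common choice is to fit a quadratic x = a*y^2 + b*y + c to detected lane points in image coordinates and evaluate the radius of curvature R = (1 + (2*a*y + b)^2)^(3/2) / |2*a|. The quadratic model and the function names `fit_quadratic_3pts` and `curvature_radius` are assumptions made for illustration, not the paper's method.

```python
import math


def fit_quadratic_3pts(p1, p2, p3):
    """Fit x = a*y**2 + b*y + c exactly through three (y, x) points.

    Uses Newton divided differences; the three y values must be distinct.
    """
    (y1, x1), (y2, x2), (y3, x3) = p1, p2, p3
    d12 = (x2 - x1) / (y2 - y1)
    d23 = (x3 - x2) / (y3 - y2)
    a = (d23 - d12) / (y3 - y1)
    b = d12 - a * (y1 + y2)
    c = x1 - d12 * y1 + a * y1 * y2
    return a, b, c


def curvature_radius(a, b, y):
    """Radius of curvature of x = a*y**2 + b*y + c at height y.

    Returns infinity for a straight line (a == 0).
    """
    if a == 0:
        return math.inf
    return (1 + (2 * a * y + b) ** 2) ** 1.5 / abs(2 * a)
```

In practice the fit would use many lane pixels and a least-squares solver, and the result would be converted from pixels to metres with a camera calibration; both steps are omitted here.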

Image and Video Captioning Using Deep Learning and Natural Language Processing

8th IEEE International Conference on Computing, Communication, Control and Automation, ICCUBEA 2024

Authors: Naidu, Manoj Kulkarni, Athrva Kadam, Sahil Joshi, Siddhesh Sable, Nilesh P. Yenkikar, Anuradha Vishwakarma Institute of Information Technology Department of Cse - Artificial Intelligence Pune India

ISBN (print): 9798350391770

Deep learning models have been a huge success in image recognition which hence can be used for the purpose of text generation. In the field of imaging science, captioning images and videos is regarded as an intellectually difficult job. Visual Geometry Group (VGG) is a standard deep Convolutional Neural Network (CNN) architecture with multiple layers, specifically focusing on the integration of CNN for image feature extraction. Exploring this underlying method, the use of another model is essential for caption generation. Here the Recurrent Neural Network (RNN) comes in use for caption generation from the extracted features. Models named Long Short-Term Memory (LSTM) based on RNN and Bidirectional encoder representation transformer (BERT) based on Transformers have been prominent in ensuring accurate results. The Flickr8k dataset is used which provides a variety of information useful for model training. By testing validation data along with evaluation metrics, we analyze the effectiveness of different models to create consistent and descriptive headlines. Extending our inquiry to encompass title generation using transformer models, while also exploring learning techniques for real-time title generation and delivery using the OpenCV library available in Python to get the output from the camera and display it on screen. The result shows that the LSTM is the best model for captioning, with an accuracy of 65.07% at the epochs of 300 and the BERT model has an accuracy of 31% at the epochs of 2. The findings of this study not only contribute to advancing subtitle enhancement methodologies but also broaden the potential applications of deep learning techniques in this domain. © 2024 IEEE.

Keywords: Convolutional neural networks

Temporal Deep Learning Image Processing Model for Natural Gas Leak Detection Using OGI Camera

2024 Offshore Technology Conference Asia, OTCA 2024

Authors: Korjani, Mehdi Conley, David Smith, Mark Clean Connect Inc Denver, CO United States

ISBN (print): 9781959025030

Natural gas extraction systems often encounter manufacturing defects or develop defects over time, leading to gas leaks. These leaks pose challenges, causing revenue losses and environmental pollution. Detecting gas leaks in the vast array of extraction, transfer, and storage equipment within these systems can be arduous, allowing leaks to persist unnoticed. Additionally, natural gas leaks are not visible to naked eyes, further complicating their detection. We developed a novel deep learning image processing model that utilizes videos captured by a specialized Optical Gas Imaging (OGI) camera to detect natural gas leaks. The temporal deep learning algorithm is designed to identify patterns associated with gas leaks and improve its performance through supervised learning. Our model incorporates algorithms to detect background environments, motion, equipment, and classify gas leaks. Our model employs leak identification algorithms to determine the presence of gas leaks. These algorithms calculate the probability of detected motion indicating a gas leak based on long-term and short-term background subtraction, detected motion, motion duration, equipment location, and telemetry data. To minimize false positives, we have developed image segmentation and object detection models to identify known objects, such as equipment, people, and cars, within the video footage. To train our model we collect more than 10,000 short videos from real fields and include simulated data with known rate controlled gas release in different situations. Data consist of wide range of weather situations including different temperature, wind speed, humidity in sunny, rainy, and snowy fields. We validated our model by conducting experiments involving actual footage from the field. The model achieved a 98% true positive rate, and a 100% true negative rate, correctly refraining from sending an alarm for all non-releases. Additionally, we developed a postprocessing algorithm capable of estimating the

A method for nighttime tomato fruit detection and occlusion judgment based on deep learning and image processing

2024 International Conference on Optical and Photonic Engineering, icOPEN 2024

Authors: Lin, Zhonglong Zhang, Caihong Liang, Zhi Zou, Xiangjun Li, Xiaojuan School of Mechanical Engineering Xinjiang University Urumqi 830000 China Institute of Agricultural Mechanization Xinjiang Academy of Agricultural Sciences Xinjiang 830091 China

ISBN (digital): 9781510688117

ISBN (print): 9781510688100

Nighttime detection and harvesting are key issues for achieving all-day operation of tomato-picking robots. Currently, most general detection algorithms are limited to natural daylight conditions, with significantly reduced performance in nighttime environments. To address the issues of low accuracy and poor robustness of nighttime tomato detection algorithms, a high-precision nighttime tomato detection method based on the integration of deep learning and image processing is proposed. This study designed multiple sets of nighttime RGB lighting experiments to calculate the HSV color distance between ripe tomatoes and the background under each lighting condition, determining the optimal lighting color to enhance the contrast between tomatoes and the background. Under the optimal lighting conditions, an RGB image dataset of nighttime tomatoes was constructed, and a YOLOv8-based nighttime detection model was trained to achieve precise detection and localization of nighttime tomato targets. Within the detected target frames of ripe tomatoes, image processing methods such as OTSU, Hough detection, and connected component analysis were used to judge and analyze the occlusion situation of tomatoes, distinguishing between occlusion types (leaves or branches), and providing guidance for optimizing the robot's picking strategy. Finally, this study verifies the effectiveness of the algorithm through multiple sets of experiments. The algorithm has an overall accuracy rate of 84% and can be deployed on edge devices to achieve efficient real-time detection tasks while ensuring performance. © 2025 SPIE.

Keywords: Robots
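
The lighting-selection step described in the abstract above, comparing the HSV color distance between ripe tomatoes and the background under each lighting condition, can be sketched as follows. The abstract does not give the distance formula, so this example assumes a Euclidean distance in HSV cone coordinates, which handles the circular hue axis. The function names and the sample colors are hypothetical.

```python
import colorsys
import math


def hsv_distance(rgb_a, rgb_b):
    """Euclidean distance between two 8-bit RGB colors in HSV cone coordinates."""
    def cone(rgb):
        h, s, v = colorsys.rgb_to_hsv(*(channel / 255.0 for channel in rgb))
        angle = 2 * math.pi * h
        # Map hue to an angle so that hues near 0 and near 1 are close together.
        return (s * v * math.cos(angle), s * v * math.sin(angle), v)
    return math.dist(cone(rgb_a), cone(rgb_b))


def best_lighting(samples):
    """Pick the lighting whose tomato/background colors are farthest apart.

    samples maps a lighting name to a (tomato_rgb, background_rgb) pair.
    """
    return max(samples, key=lambda name: hsv_distance(*samples[name]))


# Hypothetical mean colors measured under two lighting conditions.
samples = {
    "white": ((200, 60, 50), (60, 120, 60)),
    "dim": ((20, 6, 5), (6, 12, 6)),
}
print(best_lighting(samples))
```

With these made-up values the brighter condition wins because both colors keep the same hue and saturation while the dim pair is compressed toward black.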

Real-Time Image Processing and Deep Learning 2020

ISBN (print): 9781510635791

The proceedings contain 12 papers. The topics discussed include: real-time detection of maize crop disease via a deep learning-based smartphone app; parallel artificial neural networks using wavelet-based features for classification of remote-sensing hyperspectral images; no-reference image quality assessment based on residual neural networks (ResNets); coverless image steganography framework using distance local binary pattern and convolutional neural network; the combined denoising of images on the optical and thermal range onboard the UAV; portable flow device using Fourier ptychography microscopy and deep learning for detection of biosignatures; parallel color image watermarking scheme for multiple picture object based on multithreading coding; and performance analysis of semantic segmentation algorithms trained with JPEG compressed datasets.

Design of an AMR Using Image Processing and Deep Learning for Monitoring Safety Aspects in Warehouse

IST-Africa Conference

Authors: Pooloo, Nabeelah Aumeer, Wafiik Khoodeeram, Rajeev Univ Mascareignes Ave Concorde Roches Brunes 71203 Rose Hill Mauritius

ISBN (digital): 9781905824694

ISBN (print): 9781905824694

The latest spinoffs in the field of Autonomous Vehicles have paved the way for a revolution in mobility and transportation, particularly in the warehousing and distribution sector. AMRs, Autonomous Mobile Robots, are being deployed to assist in warehousing activities as they present multiple advantages. In this paper, an AMR coupled with image processing and deep learning is introduced as a novel approach to solve a two-fold problem: surveillance and disinfection. Deep learning will make use of real-time data collected by the AMR's camera as a smart surveillance method for abnormal event detection. YOLOv4 is used to train a custom dataset for object detection on five different classes. The latter obtained a 74.40% accuracy. The vehicle will also be used to diffuse disinfecting agents as a means to sanitize the stores and stocks against Covid-19. Moreover, autonomous navigation of the AMR will be based on image processing techniques for path track detection.

Keywords: Autonomous Mobile Robots; image processing; deep learning; YOLOv4

Research on License Plate Character Recognition Technology Based on Image Processing and Deep Learning

IEEE International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA)

Authors: Chen, Chun Zhong, Xiaolei Honghe Vocat & Tech Coll Honghe 661100 Peoples R China Yunnan Univ Dianchi Coll Kunming 650228 Yunnan Peoples R China

ISBN (print): 9781665416061

Character recognition methods are applied in many fields, greatly improving work efficiency in daily life[1], such as license plate retrieval, invoice printing recognition, lottery betting codes, tax reports, etc. Digital recognition has been widely used in the field of computer vision and image recognition, and deep learning algorithms are currently popular image recognition algorithms. Deep learning has been widely studied and applied in target recognition and speech content recognition. With the rapid increase in production requirements and computer data processing speed, the application of character recognition in actual production and life is becoming more and more common[2]. It is also extremely important for automatic retrieval and real-time, fast and accurate character input. However, traditional pattern recognition and feature extraction algorithms cannot well meet the requirements of real-time and correctness in production. At the same time, due to the vigorous development of deep learning, character recognition technology based on deep learning has advantages that traditional recognition algorithms cannot match. This paper proposes a barcode recognition algorithm based on a deep neural network combined with a global optimization method. It uses a convolutional recurrent network to extract the characteristics of each character in the barcode and classify it. Compared with the traditional method, it has stronger adaptability and generalization.

Keywords: deep learning; character recognition; recognition algorithms
