Search results - Inner Mongolia University Library

Automated deep learning system for power line inspection image analysis and processing: architecture and design issues


Global Energy Interconnection, 2023, Vol. 6, No. 5, pp. 614-633

Authors: Daoxing Li, Xiaohui Wang, Jie Zhang, Zhixiang Ji; China Electric Power Research Institute Co. Ltd, Haidian District, Beijing 100192, P.R. China; Sichuan Electric Power Research Institute, SGCC, Chengdu 610041, P.R. China

The continuous growth in the scale of unmanned aerial vehicle (UAV) applications in transmission line inspection has resulted in a corresponding increase in the demand for UAV inspection image processing. Owing to its excellent performance in computer vision, deep learning has been applied to UAV inspection image processing tasks such as power line identification and insulator defect detection. Despite their excellent performance, electric power UAV inspection image processing models based on deep learning face several problems such as a small application scope, the need for constant retraining and optimization, and high R&D monetary and time costs due to the black-box and scene data-driven characteristics of deep learning. In this study, an automated deep learning system for electric power UAV inspection image analysis and processing is proposed as a solution to the aforementioned problems. This system design is based on the three critical design principles of generalizability, extensibility, and automation. Pre-trained models, fine-tuning (downstream task adaptation), and automated machine learning, which are closely related to these design principles, are reviewed. In addition, an automated deep learning system architecture for electric power UAV inspection image analysis and processing is presented. A prototype system was constructed and experiments were conducted on the two electric power UAV inspection image analysis and processing tasks of insulator self-detonation and bird nest recognition. The models constructed using the prototype system achieved 91.36% and 86.13% mAP for insulator self-detonation and bird nest recognition, respectively. This demonstrates that the system design concept is reasonable and the system architecture is feasible.

Keywords: Transmission line inspection; Deep learning; Automated machine learning; Image analysis and processing


Challenges of Real-time Processing with Embedded Vision for IoT Applications


2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering, ICECCME 2022

Author: Lee, Suk Jin; TSYS School of Computer Science, Columbus State University, Columbus, United States

ISBN: (digital) 9781665470957

ISBN: (print) 9781665470957

Recent advances in both Artificial Intelligence (AI) and the Internet of Things (IoT) make it possible to implement surveillance systems that can detect and recognize objects automatically. It is still challenging to set up embedded vision on resource-limited devices. This study investigates how the system configuration and the memory size affect the performance of embedded vision applications. For this project, we set up a vision sensor accessible wirelessly from a single board computer (SBC). The designed system uses a Raspberry Pi (SBC) as a computing server and the Python programming language to receive the captured images from the vision sensor using the open-source computer vision and machine learning software library OpenCV, which can detect objects and allow for easy accommodation with high performance. We also use a set of Python classes (imageZMQ) that can transport live video streams from one controller to another for a distributed image processing network. We evaluate the performance with frame rate, frame transfer delay, and frame processing time. The proposed system is cost-efficient and suitable for home security, industrial wireless control, and other IoT applications. © 2022 IEEE.

Keywords: Internet of things
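
The three evaluation metrics named in the abstract above (frame rate, frame transfer delay, and frame processing time) can be illustrated with a minimal sketch. It is not the author's measurement code: the function name `summarize` and the record layout `(sent_at, received_at, processed_at)` are assumptions made for illustration, and the timestamps are assumed to come from synchronized clocks on the sender and the receiver.

```python
from statistics import mean


def summarize(records):
    """Summarize per-frame timestamps (hypothetical layout, in seconds).

    Each record is (sent_at, received_at, processed_at):
      sent_at      - when the vision sensor sent the frame
      received_at  - when the computing server received it
      processed_at - when the server finished processing it
    """
    if len(records) < 2:
        raise ValueError("need at least two frames to estimate a frame rate")

    # Transfer delay: time each frame spent in transit.
    delays = [received - sent for sent, received, _ in records]
    # Processing time: time the server spent on each frame after receipt.
    processing = [done - received for _, received, done in records]
    # Frame rate: number of inter-frame intervals over the reception span.
    span = records[-1][1] - records[0][1]
    frame_rate = (len(records) - 1) / span if span > 0 else float("inf")

    return {
        "frame_rate_fps": frame_rate,
        "mean_transfer_delay_s": mean(delays),
        "mean_processing_time_s": mean(processing),
    }
```

Calling `summarize` on a list of such records returns the three figures in one dictionary; on real hardware the timestamps would be taken with a monotonic clock on each device.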


Optimizing Geospatial Data for ML/CV Applications: A Python-Based Approach to Streamlining Map Processing by Removing Irrelevant Areas


APPLIED SCIENCES-BASEL, 2024, Vol. 14, No. 24, p. 11978

Authors: Kasperek, David; Podpora, Michal; Opole Univ Technol, Dept Comp Sci, Proszkowska 76, PL-45758 Opole, Poland

Massive image datasets are often required for the proper functioning of Machine Learning (ML) and Computer Vision (CV) applications. This paper offers a solution to computational challenges in the image processing of satellite imagery, by proposing an optimization procedure. The presented approach is verified by an exemplary Python implementation, constituting a standalone tool for automating the dataset creation and labeling, including the extraction of road network data from the national satellite cartography provider. The collected data include detailed road maps along with the parcel information obtained via WebMapService endpoints. The method presented in this paper involves three basic steps: road segmentation (using the Shapely module) to facilitate handling high-resolution orthoimagery, and then a modified Region-of-Interest approach, i.e., removing irrelevant areas, with only roads remaining. This results in obtaining file sizes that are significantly smaller. The presented algorithm also involves asynchronous tile downloading, which, combined with the masking of irrelevant areas, improves not only the efficiency but surprisingly also the accuracy of subsequent ML/CV procedures. The research results of the paper reveal substantial file size reduction, and improved processing efficiency, thus making the optimized geospatial graphical data more practical for ML/CV applications, while still maintaining the original data quality and relevance of the analyzed parcels or infrastructure.

Keywords: satellite imagery; road segmentation; geospatial data optimization; dataset creation and labeling; machine learning; image processing
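
The idea of removing irrelevant areas so that only roads remain, described in the abstract above, can be sketched roughly as follows. This is not the authors' implementation: the paper uses the Shapely module, whereas this sketch substitutes a plain point-to-segment distance test from the standard library so that it runs without dependencies. The names `distance_to_segment` and `mask_outside_road`, the `half_width` parameter, and the representation of a tile as a list of pixel rows are all assumptions made for illustration.

```python
import math


def distance_to_segment(point, start, end):
    """Euclidean distance from a point to the line segment start-end."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def mask_outside_road(tile, road, half_width, fill=0):
    """Keep pixels within half_width of the road polyline; fill the rest.

    tile is a list of rows of pixel values, road is a list of (x, y)
    vertices in pixel coordinates (hypothetical layout).
    """
    segments = list(zip(road, road[1:]))
    masked = []
    for y, row in enumerate(tile):
        new_row = []
        for x, value in enumerate(row):
            near_road = any(
                distance_to_segment((x, y), a, b) <= half_width
                for a, b in segments
            )
            new_row.append(value if near_road else fill)
        masked.append(new_row)
    return masked
```

Masked regions become uniform, which is one plausible reason the resulting files compress to smaller sizes; the sketch itself only performs the masking step.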


OPTIMIZING CNN PERFORMANCE ON CIFAR-10: A STUDY ON MULTI-CORE PROCESSING, BATCHING TECHNIQUES, AND FUTURE NETWORK ARCHITECTURES


2nd International Conference on Mechatronic Automation and Electrical Engineering, ICMAEE 2024

Author: Jin, Datong; College of Art and Science, The Ohio State University, Columbus 43210, United States

ISBN: (print) 9781837242672

The performance of Convolutional Neural Networks (CNNs) is critically dependent on the underlying computational configurations, particularly in terms of processing capabilities and data handling techniques. As deep learning models become increasingly complex and data-intensive, the choice between single-core and multi-core processing, along with strategies such as batching, becomes pivotal. These configurations directly impact the efficiency and speed of model training and inference, which are essential for applications requiring real-time processing and high accuracy, such as image recognition and automated systems. Study Contributions: This research focuses on comparing the effects of single-core versus multi-core processing configurations on CNN performance, integrating batching techniques to observe their combined influence on reducing latency and increasing throughput. The study reveals that leveraging multi-core processing with batching substantially enhances CNN operations. Additionally, it addresses the deployment challenges of the YOLOv5 architecture, emphasizing the necessity for further architectural improvements. The paper suggests that future enhancements could include adopting deeper network structures to elaborate on complex features and employing methods like ResNet to counteract gradient vanishing problems. Moreover, it advocates fine-tuning models pre-trained on datasets such as ImageNet to boost robustness and accuracy, particularly demonstrated on the CIFAR-10 dataset. These recommendations aim to refine CNN capabilities, thereby advancing the broader field of machine learning and computer vision. © The Institution of Engineering & Technology 2024.

Keywords: Convolutional neural networks


Gesture Recognition of Filipino Sign Language Using Convolutional and Long-Short Term Memory Neural Network


Intelligent Systems Conference (IntelliSys)

Authors: Cayme, Karl Jensen F.; Retutal, Vince Andrei B.; Salubre, Miguel Edwin P.; Canete, Luis Gerardo S.; Astillo, Philip Virgil B.; Univ San Carlos, Cebu, Philippines

ISBN: (digital) 9783031477249

ISBN: (print) 9783031477232; 9783031477249

Sign language is a form of communication prominently used by the deaf-mute community to convey their ideas and thoughts. In the Philippines, local signers use Filipino Sign Language (FSL), derived from the well-known American Sign Language (ASL). Despite the recent formalization of FSL as the country's official sign language, there is still minimal familiarity among the public. Sign Language Recognition (SLR) systems integrated with machine learning applications have therefore been developed to understand FSL better. However, the prevalent limitation of most of these systems is that they involve only static signs and asynchronous recognition. This study aimed to take this solution further and overcome existing limitations by developing a model capable of recognizing FSL gestures in real time, usable for applications such as government service centers. To this end, the study proposes a Convolutional and Long Short-Term Memory (CNN-LSTM) neural network in a system that captures signs from a signer in real time. The proponents considered 15 signs related to common greetings and business transactions. A total of 450 video recordings were collected for the signs, with each sign having an equal number of samples. The collected data underwent cleaning, preprocessing, and augmentation before training. The proposed model's performance was analyzed with the following classification metrics: Accuracy, Precision, Recall, and F1-Score. It achieved 95% accuracy and macro-averages of 0.95 Precision, 0.95 Recall, and 0.95 F1-Score. Furthermore, the model had comparable accuracy and loss between validation and test data: 95.18% accuracy and 0.13629 loss on validation versus 95.93% accuracy and 0.1478 loss on the test set. The proposed model was therefore well fit for classifying the 15 signs that involve upper body movements.

Keywords: Computer vision; CNN-LSTM; Deep learning; Filipino sign language; Image processing; Sign language recognition system; Machine learning
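
The classification metrics reported in the abstract above (accuracy plus macro-averaged precision, recall, and F1-score) can be computed as in the following minimal sketch. It is not taken from the paper: the function name `macro_metrics` is an assumption, and any labels passed to it are illustrative. Macro-averaging here means computing each metric per class and taking the unweighted mean over classes.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []

    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }
```

Because every class contributes equally to a macro-average, the figure is not dominated by frequent classes; with an equal number of samples per sign, as in the study, macro and sample-weighted averages of recall coincide.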


Artificial Intelligences-Based Approaches for Generating Image Caption


3rd International Conference on Data, Engineering, and Applications, IDEA 2021

Authors: Ingale, S.P.; Bamnote, G.R.; Prof Ram Meghe Institute of Technology and Research, Badnera, India

ISBN: (print) 9789811946868

Most of the visual information presented to humans has no description associated with it, yet humans can generally understand it without any accompanying description. For a machine, however, interpreting the same images and generating a description in natural language is a hard task. The process involves two basic human abilities: seeing things and understanding them. In computing, these problems fall under the domains of Natural Language Processing and Computer Vision, so creating a description of an image requires a combination of both. The image description process consists of two parts: the first is to detect the important objects, features, attributes, and the relationships between them in an image, which falls under Computer Vision; the second is to generate semantically and syntactically accurate phrases and sentences, which is the area of Natural Language Processing. Describing images in human-readable language is a cognitive task, and recent progress in machine learning and deep learning, subsets of Artificial Intelligence in which computer algorithms try to copy and adapt the workings of the human brain in processing data, has accelerated the study of this challenging problem. In this paper, the basic techniques and approaches for image captioning are discussed. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: Computer vision


Modern Computer Vision Applications for Plant Phenotyping in Agriculture


Author: Thesma, Vaishnavi; University of Georgia

Degree level: M.S., Master of Science/Master of Surgery

The rapidly growing world population challenges farmers to meet the rising food demand. Monitoring crop phenotypes, or the physical plant traits, is useful in tracking plant development, maintaining plant health, and increasing yield. However, phenotyping efforts are traditionally manual and become tedious for large scale farms. Thus, it is imperative to develop autonomous solutions to monitor plants accurately, remotely, and timely. To meet this objective, computer vision techniques have been used by researchers to perform automatic plant phenotyping on video and image data collected from either indoor, controlled environments or from the field. Furthermore, these methods have focused on using traditional pixel-based processing, machine learning, and deep learning for plant phenotyping. In this study, various modern computer vision techniques are implemented to automatically phenotype plants for agriculture applications, thereby reducing manual labor while accurately detecting important traits to help increase yield.

Keywords: Crop monitoring; Deep learning; Image processing; Machine learning; Plant phenotyping


Fashion Images Classification using Machine Learning, Deep Learning and Transfer Learning Models


7th International Conference on Image and Signal Processing and their Applications, ISPA 2022

Authors: Samia, Bougareche; Soraya, Zehani; Malika, Mimi; Biskra University, Dept. of Electrical Engineering, Biskra, Algeria; Mostaganem University, Dept. of Electrical Engineering, Mostaganem, Algeria

ISBN: (print) 9781665480420

Fashion, the way we present ourselves, is mainly visual and has attracted great interest from computer vision researchers. It is generally used to search for fashion products in online shopping malls and to obtain the descriptive information of the product. The main objective of our paper is to use deep learning (DL) and machine learning (ML) methods to correctly identify and categorize clothing images. In this work, we used ML algorithms (support vector machines (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF)), DL algorithms (Convolutional Neural Network (CNN), AlexNet, GoogleNet, LeNet, LeNet5), and transfer learning using pretrained models (VGG16, MobileNet, and ResNet50). We trained and tested our models online using Google Colaboratory with the TensorFlow/Keras and Scikit-Learn libraries, which support deep learning and machine learning in Python. The main metrics used in our study to evaluate the performance of the ML and DL algorithms are accuracy and the confusion matrix. The best result for the ML models is obtained with ANN (88.71%), and the best result for the DL models is obtained with the GoogleNet architecture (93.75%). The results showed that the number of epochs and the depth of the network affect the results obtained. © 2022 IEEE.

Keywords: Image classification


Speech Emotion Recognition Using CNN-LSTM and Vision Transformer


13th International Conference on Innovations in Bio-Inspired Computing and Applications, IBICA 2022, and 12th World Congress on Information and Communication Technologies, WICT 2022

Authors: Kumar, C S Ayush; Maharana, Advaith Das; Krishnan, Srinath Murali; Hanuma, Sannidhi Sri Sai; Lal, G. Jyothish; Ravi, Vinayakumar; Amrita School of Engineering Coimbatore, Amrita Vishwa Vidyapeetham, Coimbatore, India; Center for Artificial Intelligence, Prince Mohammad Bin Fahd University, Khobar, Saudi Arabia

ISBN: (print) 9783031274985

The importance of speech emotion recognition has increased as a result of the acceptance of intelligent conversational assistant services. Communication between humans and machines may be improved through emotion recognition and analysis. We propose the application of attention-based deep learning techniques to process and recognize speech emotions. In this paper, we look at two major approaches, CNN-LSTM and Mel spectrogram-Vision Transformer based models, and compare them against the existing benchmarks. The experimental results support the feature extraction strategy of deep learning based approaches, which eliminates the need to handpick features for the traditional machine learning (ML) classifiers present in the current literature. A comparative study and evaluation of CNN-LSTM and Vision Transformers (ViT) is established from the experimental results. Both models performed similarly, with CNN-LSTM giving an accuracy of 88.50% compared to the accuracy of 85.36% by ViT, surpassing the existing benchmarks and providing scope for the study of attention and image processing based learning for speech emotion recognition. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Image processing


Recovering Image Information from Speckle Noise by Image Processing


6th International Conference on Machine Vision and Applications, ICMVA 2023

Authors: Nie, Jianlin; Hanson, Steen G.; Takeda, Mitsuo; Wang, Wei; Xi'An Technological University, Shaanxi, Xi'an, China; DTU Fotonik, Department of Photonics Engineering, Technical University of Denmark, Roskilde DK-4000, Denmark; Utsunomiya University, Utsunomiya, Tochigi, Japan; School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, United Kingdom

ISBN: (print) 9781450399531

As a kind of noise, speckle seriously affects the imaging quality of optical imaging systems. However, a speckle image carries a large amount of information related to the physical characteristics of the object surface, which can be used as the basis to identify and judge hidden objects. In this paper, speckle noise removal in optical imaging is studied. The squared moduli of the spectra of short-exposure speckle images are averaged to recover the amplitude information. At the same time, the cross-spectrum function is used to recover the phase information. We use this method to process the images. A simulation experiment is then carried out by varying two aspects: the stacking number and the object. The results show that this method can recover the feature information from the speckle image, thus verifying the feasibility of the method. © 2023 ACM.

Keywords: Image processing
