Search Results - Inner Mongolia University Library

Enhancing autonomous pavement crack detection: Optimizing YOLOv5s algorithm with advanced deep learning techniques

MEASUREMENT, 2025, Vol. 240

Authors: Zhou, Shuangxi; Yang, Dan; Zhang, Ziyu; Zhang, Jinwen; Qu, Fulin; Punetha, Piyush; Li, Wengui; Li, Ning

Affiliations: Guangzhou Maritime Univ, Sch Civil & Engn Management, Guangzhou 510725, Peoples R China; East China Jiaotong Univ, Sch Civil Engn & Architecture, Nanchang 330013, Peoples R China; Univ New South Wales, Ctr Infrastruct Engn & Safety, Sch Civil & Environm Engn, Sydney, NSW 2052, Australia; Univ Technol Sydney, Sch Civil & Environm Engn, Ultimo, NSW 2007, Australia; Univ Manchester, Dept Solids & Struct, Manchester M13 9PL, England

To enhance the safety and comfort of vehicle travel, detecting pavement cracks is a critical task in road management. This article introduces an advanced single-stage target detection method utilizing the YOLOv5s algorithm to enhance real-time performance and accuracy. Initially, Squeeze-and-Excitation Networks are integrated into the model to facilitate self-learning for improved crack characterization. Subsequently, anchors computed through the K-means clustering algorithm are closely aligned with the fracture dataset, achieving an adaptation rate of 99.9% and enhancing the recall rate of the model. Furthermore, the inclusion of the SimSPPF module from YOLOv6 diminishes memory usage and expedites detection speed. By replacing the original nearest-neighbor up-sampling method with transposed convolution, optimization of up-sampling for crack datasets is achieved. Performance assessments reveal that the refined YOLOv5s algorithm attains an F1 score of 91%, a mean Average Precision (mAP) of 93.6%, and a 1.54% increase in frames per second (fps) for pavement crack detection. This enhancement in detection technology signifies a substantial advancement in the maintenance and longevity of road infrastructure.
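The anchor-fitting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common YOLO-style practice of clustering ground-truth box widths and heights with 1 - IoU as the distance; the abstract does not state the distance metric, the number of anchors, or the initialisation, so `wh_iou`, `kmeans_anchors`, and the sample boxes below are hypothetical.

```python
def wh_iou(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50):
    """Cluster (w, h) pairs into k anchors, assigning each box to the anchor with highest IoU."""
    # Assumption: deterministic initialisation from boxes spread over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    step = len(ordered) / k
    anchors = [ordered[int(i * step + step / 2)] for i in range(k)]

    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: wh_iou(b, anchors[i]))
            groups[best].append(b)

        # Update step: each anchor becomes the mean width and height of its group.
        updated = []
        for i, group in enumerate(groups):
            if group:
                mean_w = sum(b[0] for b in group) / len(group)
                mean_h = sum(b[1] for b in group) / len(group)
                updated.append((mean_w, mean_h))
            else:
                updated.append(anchors[i])  # keep an anchor that attracted no boxes

        if updated == anchors:
            break
        anchors = updated

    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical crack boxes: small, elongated, and large.
sample_boxes = [(10, 12), (11, 13), (12, 11), (50, 20), (52, 22), (48, 18),
                (100, 100), (98, 104), (102, 96)]
anchors = kmeans_anchors(sample_boxes, k=3)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is one reason this variant is often chosen for anchor fitting.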

Keywords: Road maintenance; Crack detection; YOLO algorithm; AI integration; Advanced image processing

MoReSo: A DNN Framework Expediting Content-based Video Image Retrieval (CBVIR)

32nd European Signal Processing Conference (EUSIPCO)

Authors: Li, Sinian; Profeta, Doruk Barokas; Dauwels, Justin

Affiliations: Delft Univ Technol, Signal Proc Syst, Dept Microelect, Delft, Netherlands

ISBN: (print) 9789464593617; 9798331519773

With the exponential growth of video data, individuals, particularly scholars in the fields of history and sociology, are increasingly reliant on video materials. However, the task of locating specific frames within videos remains a laborious and time-consuming endeavor. Advanced machine learning-assisted video processing techniques have emerged, including text-based video searches, video summarization, real-time object detection, and person re-identification. However, distinct from these, the main challenge of retrieving video frames based on given visual content is how to efficiently and accurately pinpoint the instance occurrences. To expedite the process while maintaining retrieval performance, we propose a two-stage approach, combining KeyFrame Extraction (KFE) and Content-based Image Retrieval (CBIR), underpinned by a DNN-empowered framework called MoReSo. Our innovations include 1) the integration of improved statistical features with dynamic clustering in the KFE stage and 2) the development of the MoReSo framework, which consists of MobileNet and ResNet backbones with an SOA layer to jointly represent video frames, achieving a 2.67x increase in efficiency compared to existing solutions. Our framework is evaluated on two datasets: the annotated EHM Historical Database provided by digital history researchers and the widely used image retrieval benchmark datasets, the Oxford and Paris datasets. The experimental results showcase that the proposed framework and scheme excel among other models in the CBVIR task. We make our code available for further exploration through our GitHub repository. This repository contains the implementation of our model and CBVIR system with a GUI prototype.
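The two-stage idea (reduce the video to keyframes, then rank only those keyframes against the query) can be sketched as follows. This is a simplified stand-in, not the MoReSo pipeline: it assumes each frame is already represented by a feature vector, replaces the paper's statistical features and dynamic clustering with a plain similarity threshold, and uses cosine similarity for ranking. The names `extract_keyframes`, `retrieve`, and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_keyframes(features, threshold=0.9):
    """Stage 1: keep a frame when it differs enough from the last kept keyframe."""
    if not features:
        return []
    keep = [0]
    for i in range(1, len(features)):
        if cosine(features[i], features[keep[-1]]) < threshold:
            keep.append(i)
    return keep


def retrieve(query, features, keyframes, top_k=3):
    """Stage 2: rank keyframes by similarity to the query and return their frame indices."""
    scored = [(cosine(query, features[i]), i) for i in keyframes]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scored[:top_k]]


# Toy 2-D frame features standing in for DNN embeddings.
frame_features = [[1, 0], [0.99, 0.05], [0, 1], [0.02, 1], [1, 1]]
keyframes = extract_keyframes(frame_features)
hits = retrieve([0, 1], frame_features, keyframes, top_k=2)
```

The efficiency gain in such a scheme comes from stage 2 scoring only the keyframes rather than every frame.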

Keywords: Content-Based Video Image Retrieval; Content-Based Image Retrieval; Key Frame Extraction; Image Retrieval from Video

Fuzzy-based video compression using bilinear fuzzy relation equations

Journal of Ambient Intelligence and Humanized Computing, 2024, Vol. 15, No. 4, pp. 2215-2225

Authors: Cardone, Barbara; Di Martino, Ferdinando

Affiliations: Dipartimento Di Architettura, Università Degli Studi Di Napoli Federico II, Via Toledo 402, Naples 80134, Italy; Università Degli Studi Di Napoli Federico II, Centro Interdipartimentale Di Ricerca "A. Calza Bini", Via Toledo 402, Naples, Italy

We present a novel color video compression method using the greatest solution of a system of bilinear fuzzy relation equations (BFRE) to assess the similarity between frames. The frames in each band are treated separately and each frame is classified as an Intra frame or a Predictive frame. A frame is labelled as a Predictive frame, and compressed more than an Intra frame, if its similarity to the previous Intra frame is higher than a selected threshold. A pre-processing step selects the optimal threshold value for the similarity between frames. The proposed method delivers high-quality reconstructed frames and does not require high CPU time or memory storage for its execution. It was tested on color videos of the Fast-Moving Objects dataset; the results show that it performs better than the Lukasiewicz similarity-based video compression method and comparably to MPEG-4 and the deep learning video compression method DVC_Pro. The results show that the quality of the reconstructed frames obtained with BFRE is comparable with that of DVC_Pro, but with lower computational complexity, providing better performance in terms of video encoding speed. © The Author(s) 2024.
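The Intra/Predictive labelling rule described in the abstract can be sketched as below. The rule itself (compare each frame with the previous Intra frame against a threshold) follows the abstract, but the similarity measure here is an assumption: a simple 1 minus mean absolute difference on pixel values in [0, 1] is swapped in for the paper's BFRE-based similarity, and the names `similarity` and `classify_frames` are hypothetical.

```python
def similarity(frame_a, frame_b):
    """Placeholder similarity in [0, 1]: 1 - mean absolute pixel difference.

    Assumption: pixels are normalised to [0, 1]. The paper derives its
    similarity from bilinear fuzzy relation equations instead.
    """
    diff = sum(abs(a - b) for a, b in zip(frame_a, frame_b))
    return 1.0 - diff / len(frame_a)


def classify_frames(frames, threshold):
    """Label each frame 'I' (Intra) or 'P' (Predictive) for one colour band."""
    labels = []
    last_intra = None
    for frame in frames:
        if last_intra is not None and similarity(frame, last_intra) > threshold:
            labels.append("P")      # close to the reference: compress more strongly
        else:
            labels.append("I")      # scene changed: this frame becomes the new reference
            last_intra = frame
    return labels


# Four tiny 4-pixel frames: two near-black, then two near-white.
band = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.1, 0.0, 0.0],
        [1.0, 1.0, 1.0, 1.0], [1.0, 0.9, 1.0, 1.0]]
labels = classify_frames(band, threshold=0.9)
```

A higher threshold yields more Intra frames (better quality, less compression), which is why the abstract describes a pre-processing step that tunes it.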

Keywords: image compression

Research on style transfer and automation in visual communication design based on deep learning

2nd International Conference on Big Data, Computational Intelligence, and Applications, BDCIA 2024

Authors: Cao, Wenjie

Affiliations: Department of Design, University of California, Davis, CA 95616, United States

ISBN: (print) 9781510689053

The integration of deep learning into visual communication design offers transformative possibilities for style transfer and automation. This paper proposes a framework that combines neural style transfer (NST) techniques with automation strategies to enable efficient and flexible design processes. The framework leverages convolutional neural networks (CNNs) to capture content and style features from visual input and applies an adaptive loss function to synthesize designs that balance artistic style and content fidelity. Furthermore, the system incorporates automation pipelines for batch processing, real-time rendering, and parameter optimization, streamlining design workflows. Experimental results demonstrate the framework's ability to generate high-quality design outputs across diverse styles while significantly reducing manual effort. This study provides a novel approach to bridging artistic creativity and computational efficiency, offering practical applications in advertising, branding, and multimedia design. © 2025 SPIE.
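A loss that trades off style against content fidelity can be sketched in the classic neural-style-transfer form: a content term on feature maps plus a style term on their Gram matrices. This is an illustration under assumptions, not the paper's loss: the abstract calls its loss adaptive without giving details, so the fixed weights `alpha` and `beta`, the function names, and the toy feature maps are all hypothetical.

```python
def gram_matrix(features):
    """Gram matrix of C feature maps, each flattened to N values, normalised by N."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def mean_squared_error(a, b):
    """Mean squared difference between two equally shaped 2-D lists."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def style_transfer_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted sum of content loss and Gram-matrix style loss (fixed weights assumed)."""
    content_loss = mean_squared_error(generated, content)
    style_loss = mean_squared_error(gram_matrix(generated), gram_matrix(style))
    return alpha * content_loss + beta * style_loss


# One channel with two spatial positions, standing in for CNN activations.
loss = style_transfer_loss([[1.0, 0.0]], [[0.0, 0.0]], [[0.0, 0.0]],
                           alpha=1.0, beta=2.0)
```

Raising `beta` relative to `alpha` pushes the result toward the style image; an adaptive scheme would adjust that ratio during optimisation.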

Keywords: Convolutional neural networks

HARNESSING VIDEO INTELLIGENCE: INTELLIGENT SYSTEM FOR ADHD DETECTION

49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Li, Yichun; Nair, Rajesh; Naqvi, Syed Mohsen

Affiliations: Newcastle Univ, Intelligent Sensing & Commun Res Grp, Newcastle Upon Tyne, Tyne & Wear, England; NHS Fdn Trust, Cumbria Northumberland Tyne & Wear CNTW, Newcastle Upon Tyne, Tyne & Wear, England

ISBN: (print) 9798350374520; 9798350374513

Attention Deficit Hyperactivity Disorder (ADHD) causes significant impairment in various domains. According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-V), the symptoms of ADHD may manifest in actions and daily behaviors. Deep learning methods based on fMRI and EEG have improved the efficiency of the ADHD detection process. However, the cost of the specialized equipment and trained staff required by the existing methods is generally high. Therefore, we introduce the action recognition network based on raw RGB videos to ADHD detection for the first time. We also extract corresponding action characteristics with two proposed novel measurements: Attention Deficit Ratio (RAD) and Stationary Ratio (RS) based on the action features of ADHD. The two-stage final ADHD detection is decided with RAD and RS fusion and achieves a high accuracy of 95.5% on our real multimodal dataset. The dataset recorded in our Intelligent Sensing Laboratory has been processed and reported to CNTW-NHS Foundation Trust; it will be reviewed by medical consultants/professionals and made public in due course.

Keywords: ADHD detection; intelligent healthcare application; RGB video processing; machine learning

Single Image Dehazing Using A Tiramisu Auto-Encoder

2024 International Conference on Intelligent Systems for Cybersecurity, ISCS 2024

Authors: Ray, Ahan; Sharanya, S.

Affiliations: SRM Institute of Science and Technology, Department of Data Science and Business Systems, Chennai, India

ISBN: (print) 9798350375237

This paper presents a comprehensive approach to address the challenge of dehazing on-road images by synthesising datasets and training advanced deep learning models. Leveraging Pix2Pix GAN and introducing a novel Tiramisu Autoencoder architecture, the research endeavours to overcome the scarcity of real-world data through data augmentation and synthesis. The Pix2Pix GAN is modified to generate realistic haze, while the Tiramisu Autoencoder is used to de-haze the images. Challenges in data collection, including the absence of on-road data and time-series data, are elucidated. The novel architecture demonstrates promising results on benchmark datasets. The research boasts advances in both data synthesis and image de-hazing. © 2024 IEEE.

Keywords: Roads and streets

Integrated Aquaculture Monitoring System Using Combined Wireless Sensor Networks and Deep Reinforcement Learning

Sensors and Materials, 2024, Vol. 36, No. 3, pp. 1019-1033

Authors: Sung, Wen-Tsai; Isa, Indra Griha Tofik; Hsiao, Sung-Jung

Affiliations: Department of Electrical Engineering, National Chin-Yi University of Technology, Zhongshan Rd Section 2 No. 57, Taichung City 411030, Taiwan; Department of Information Technology, Takming University of Science and Technology, Taipei City 11451, Taiwan

Freshwater fish is one of the commodities experiencing an increasing growth rate from 1990 to 2018. Many efforts have been made to meet market needs, through both fisheries technology and applied technology, one of which is an integrated monitoring system. In this study, an aquaculture monitoring system was developed that integrates wireless sensor networks (WSNs) based on temperature, pH, and turbidity with deep reinforcement learning. The purpose of this study is to produce a convenient, precise, and low-cost aquaculture monitoring system. The stages of the study are (1) the integration of all the WSN components, (2) the validation of the WSNs, (3) the implementation of the analysis model in the system, (4) the implementation of the recommended model into the DRL system, and (5) practical experimentation using the aquaculture monitoring system. The WSN validation results indicate that the average percentage error is 3.23%, whereas at the system modeling stage, the optimal accuracy is 98.80%. In the experiment to monitor real aquaculture environmental conditions, an accuracy of 97% is obtained. © 2024 M Y U Scientific Publishing Division. All rights reserved.
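The average percentage error reported for sensor validation can be computed along the following lines. This is a sketch that assumes the usual mean absolute percentage error between sensor readings and a reference instrument; the abstract does not spell out the exact definition, and the function name and sample readings are hypothetical.

```python
def mean_percentage_error(measured, reference):
    """Mean absolute percentage error (in percent) of sensor readings against a reference."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(m - r) / abs(r) for m, r in zip(measured, reference))
    return 100.0 * total / len(reference)


# Hypothetical readings: one is 10% high, the other 5% low.
error_pct = mean_percentage_error([11.0, 19.0], [10.0, 20.0])
```

With these sample values the two relative errors are 10% and 5%, so the average is about 7.5%.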

Keywords: Wireless sensor networks

Deep Transfer Learning from Constrained Source for Abdominal CT and MR Image Segmentation

Conference on Medical Imaging - Image Processing

Authors: Krishnan, Chetana; Schmidt, Emma; Onuoha, Ezinwanne; Mrug, Michal; Cardenas, Carlos E.; Kim, Harrison

Affiliations: Univ Alabama Birmingham, Dept Biomed Engn, Birmingham, AL 35294, USA; Univ Alabama Birmingham, Dept Nephrol, Birmingham, AL, USA; Univ Alabama Birmingham, Dept Radiat Oncol, Birmingham, AL, USA; Univ Alabama Birmingham, Dept Radiol, Birmingham, AL, USA; Dept Vet Affairs Med Ctr, Birmingham, AL 35233, USA

ISBN: (print) 9781510671577; 9781510671560

Medical image segmentation benefits from machine learning advancements, offering potential automation. Yet, accuracy depends on substantial annotated data and significant computing resources. Transfer learning addresses these challenges by leveraging a model's knowledge from one task for another with minor adjustments. The idea is to adapt learned features to new tasks, even with differing datasets but shared characteristics. Studies explore the impact of using large source datasets for limited target datasets. This investigation focuses on transferring knowledge from a limited source to enhance model versatility across various tasks. Our goal involved transferring knowledge from an advanced model trained on T2-weighted MR images related to Autosomal Dominant Polycystic Kidney Disease (ADPKD) for kidney and cyst segmentation (referred to as "Lsource"). This transfer was directed towards five distinct target datasets: CT liver, CT kidneys, CT spleen, MRI kidneys, and CT multimodal data (target datasets 1 through 5). The primary objective was to achieve accurate segmentation on these target datasets while saving time and computational resources. This approach is especially valuable when obtaining a substantial, labeled mouse PKD MRI target dataset is challenging, and the source dataset itself is resource-intensive. Using transfer learning from source 1 onto target sets 1 to 5 resulted in mean Dice Similarity Coefficients (DSCs) of 0.94 +/- 0.04, 0.97 +/- 0.02, 0.95 +/- 0.03, 0.96 +/- 0.01, 0.93 +/- 0.02, respectively. Similarly, employing source 2 yielded mean DSCs of 0.95 +/- 0.04, 0.96 +/- 0.02, 0.95 +/- 0.02, 0.96 +/- 0.02, and 0.93 +/- 0.02 for the same target sets. Despite variations in pathological conditions, image characteristics, and imaging modalities, the transfer learning approach produced DSC values comparable to the initial published outcomes. This accomplishment was achieved with reduced training requirements, faster convergence times, and decreased co
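For readers unfamiliar with the metric quoted above, the Dice Similarity Coefficient of a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following minimal sketch computes it for binary masks given as flat 0/1 sequences; the function name, the flat-list representation, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Prediction marks two pixels, reference marks one; they overlap on one pixel.
dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

Here the overlap is one pixel and the masks contain three foreground pixels in total, giving 2 * 1 / 3.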

Keywords: Transfer learning; Medical image segmentation; Autosomal polycystic kidney disease; Machine learning; Abdominal organ segmentation; UNet; Fine-tuning

5th International Conference on Deep Learning, Artificial Intelligence and Robotics, ICDLAIR 2023

5th International Conference on Deep Learning, Artificial Intelligence and Robotics, ICDLAIR 2023

ISBN: (print) 9783031609343

The proceedings contain 70 papers. The special focus in this conference is on Deep Learning, Artificial Intelligence and Robotics. The topics include: A Short Survey on Comparative Study of Modern Cryptography Approach; Advances in Computer-Aided Detection and Diagnosis of Retinal Diseases: A Comprehensive Survey of Fundal Image Analysis; Driver Safety and Drowsiness Detection in Internet of Vehicles with Federated Learning; Privacy Preserving Fingerprint Classification Using Federated Learning; Comparative Study of Ensemble Learning Models for Smart Meter Load; Social-Media Video Summarization Using Convolutional Neural Network and Kohonen's Self Organizing Map; Machine Learning and Deep Learning in Predicting Coronary Heart Disease; Augmented Super Resolution GAN (ASRGAN) for Image Enhancement Through Reinforced Discriminator; Convolutional Block Attention Assisted Dense Stacked Bi-LSTM for the Generation of RDF Statements; Real-Time Permanent Change Proposals for Abandoned Object Detection; An Excursion to Ontology-Based Non-functional Requirements Specification; A Review of Traditional and Neural Network Methods for Protecting Privacy in Big Data Analytics; A Long Short-Term Memory Learning Based Malicious Node Detection for Clustering in Wireless Sensor Networks; Experimental Analysis for Sensor Reduction to Depict Real-Time Applications Through Regression Techniques; Multi-resolution Neural Network for Road Scene Segmentation; A CNN-Based Road Accident Detection and Comparison of Classification Techniques; Football Match Result Prediction Using Twitter Statistical/Historical Data; Safeguarding Ecosystems and Efficiency in Peer-to-Peer File Sharing Systems: An IoT-Inspired Approach to Pollution Mitigation; A Heuristic for Minimizing Resource Requirement for Quantum Graph Neural Networks; Light-Gated Recurrent Unit Based Acoustic Modeling for Improved Hindi ASR; Detecting Phishing URLs Using Machine Learning: A Review; Comparative Analysis of Pneumonia Detection from Chest X-ray Using

A Smartphone-based Deep Learning Framework for Early Detection of Oral Cancer Signs

4th International Conference on Emerging Systems and Intelligent Computing, ESIC 2024

Authors: Baliarsingh, Santos Kumar; Dev, Prabhu Prasad; Bandyopadhyay, Anjan; Dash, Amiya Kumar; Pradhan, Roshni

Affiliations: Kiit Deemed to Be University, School of Computer Engineering, Bhubaneswar 751024, India

ISBN: (print) 9798350349856

Oral cancer can become non-fatal if promptly detected and treated with medication. However, failure to diagnose cancer at an early stage poses a significant risk to lives. Therefore, early detection of oral cancer plays a vital role in preserving lives. Recently, there has been an increase in the use of deep learning (DL) algorithms for early disease diagnosis. We introduce a highly effective method for diagnosing pre-cancerous lesions in the oral cavity using a smartphone-based DL framework. This method holds the promise of decreasing illness, death rates, and the overall expenses associated with healthcare. Initially, the oral lesion images were captured using a hand-held smartphone. Then, the lesion samples were annotated by skilled oral pathologists who delineated bounding boxes around the affected areas. This annotation process utilized the Visual Geometry Group (VGG) image annotator tool and led to the creation of an oral lesion dataset encompassing four distinct classes. For lesion detection, You Only Look Once (YOLOv8) is employed, while image classification is carried out using the ConvNextBase architecture. The proposed method achieves an accuracy of 87.89%. Preliminary results showcase the feasibility of our real-time automated approach for detecting and classifying oral lesions. With its low-cost and non-invasive nature, the proposed framework has considerable potential as a useful tool to aid the screening process and improve oral cancer detection. © 2024 IEEE.

Keywords: Smartphones
