Search Results - Inner Mongolia University Library

2024 Conference on Visual Communications and Image Processing

Authors: Kato, Haruhisa; Kidani, Yoshitaka; Kawamura, Kei. Affiliation: KDDI Res Inc, Saitama, Japan

ISBN: (print) 9798331529543; 9798331529550

This paper introduces an advanced intra prediction method designed for the Enhanced Compression Model (ECM), which is the reference software for the beyond-versatile video coding (VVC) standard. It employs a learning-based method to adaptively assign weights for a weighted average across neighboring samples, resulting in more precise prediction samples. The proposed method derives optimized weights for each intra prediction mode, for each block size, and for each sample position. To achieve a reasonable balance between encoding time and prediction accuracy, the conventional intra prediction mode is shared with the proposed method. Experimental evaluations have demonstrated that the proposed method provides a bitrate reduction of up to 0.4%.
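As a rough illustration of the weighted-average idea in this abstract, the sketch below predicts each sample of a block as a normalized weighted sum of neighboring reference samples. The function name `predict_block`, the `(H, W, N)` weight layout, and the random stand-in weights are all hypothetical; the abstract does not describe how the weights are trained or how the method is integrated into ECM.

```python
import numpy as np

def predict_block(neighbors, weights):
    """Predict a block as a per-position weighted average of reference samples.

    neighbors: (N,) reconstructed reference samples around the block.
    weights:   (H, W, N) non-negative weights, one vector per sample position
               (hypothetical layout; one such tensor per mode and block size).
    """
    neighbors = np.asarray(neighbors, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    norm = weights.sum(axis=-1, keepdims=True)   # normalize so weights sum to 1
    return (weights / norm) @ neighbors          # (H, W, N) @ (N,) -> (H, W)

# Stand-in weights for one (mode, block size) pair; a real table would be learned.
rng = np.random.default_rng(0)
H, W, N = 4, 4, 9
weights = rng.random((H, W, N)) + 1e-3
neighbors = np.full(N, 128.0)
pred = predict_block(neighbors, weights)
```

Because the weights are normalized per position, a flat neighborhood yields a flat prediction, which is a quick sanity check for any weight table.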

Keywords: video coding; versatile video coding (VVC); intra prediction; deep learning

Enhancing small traffic sign recognition based on an improved YOLOv8 algorithm

SIGNAL IMAGE AND VIDEO PROCESSING 2025, Vol. 19, No. 6, pp. 1-12

Authors: Zhang, Xiaoqing; Wang, Weixi; Wang, Qi; Shan, Liang; Yamane, Satoshi; Zheng, Bochao; Liu, Sifan; Shen, Tianxing; Ge, Dongming. Affiliations: Nanjing Univ Informat Sci & Technol, Sch Automat, Nanjing 210044, Peoples R China; Jiangsu Univ Sci & Technol, Sch Mechatron & Automobile Engn, Zhenjiang 212003, Peoples R China; Nanjing Univ Sci & Technol, Sch Automat, Nanjing 210094, Peoples R China; Saitama Univ, Grad Sch Sci & Engn, Saitama 3388570, Japan; Nanjing Univ Informat Sci Technol, Sch Elect & Informat Engn, Nanjing 210044, Peoples R China

With advancements in computer vision and artificial intelligence, traffic sign recognition systems have become essential in advanced driver assistance and autonomous driving systems. These systems enable the precise detection of key road information. However, recognizing small traffic signs in real-world scenarios remains a significant challenge due to their limited size and features. In this study, we propose YOLO-TSR, an efficient approach for detecting small traffic signs, inspired by the YOLOv8 framework. This method offers three major contributions: (1) we introduce an efficient attention mechanism in the backbone to enhance focus on small targets; (2) we propose a downsampling process using slicing and reassembling operations in the backbone, which preserve information and improve feature extraction for small objects; (3) we refine the upsampling process in the head by applying the content-aware CARAFE operation, which enhances the model's detection performance. Experiments on the challenging TT100K and CCTSDB2021 datasets show that YOLO-TSR achieves a mAP50 of 72.73% and a mAP50-95 of 56.57% on TT100K, and a mAP50 of 87.86% and a mAP50-95 of 57.78% on CCTSDB2021, surpassing the performance of the original YOLOv8n on both datasets. Additionally, this method runs in real time and demonstrates great potential for applications in advanced driver assistance systems and autonomous driving systems.
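The "slicing and reassembling" downsampling in contribution (2) can be pictured with a generic space-to-depth rearrangement, sketched below. This is an assumption about the general idea, not the YOLO-TSR layer itself: the function name `slice_and_reassemble` is hypothetical, and the abstract does not specify the slice pattern or any convolution that follows.

```python
import numpy as np

def slice_and_reassemble(x):
    """Downsample (C, H, W) -> (4C, H/2, W/2) without discarding any value."""
    c, h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0, "H and W must be even"
    # Four interleaved sub-grids, each at half the spatial resolution.
    parts = [x[:, 0::2, 0::2], x[:, 1::2, 0::2],
             x[:, 0::2, 1::2], x[:, 1::2, 1::2]]
    # Stack the sub-grids along the channel axis.
    return np.concatenate(parts, axis=0)

x = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
y = slice_and_reassemble(x)
```

Unlike strided convolution or pooling, this rearrangement keeps every input value, which is one plausible reading of "preserve information" for small objects.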

Keywords: Traffic sign recognition; YOLOv8; Deep learning; Image processing; Intelligent vehicle; TT100k

TapToTab: Video-Based Guitar Tabs Generation using AI and Audio Analysis

4th International Mobile, Intelligent, and Ubiquitous Computing Conference

Authors: Ghaleb, Ali; Elsadawy, Eslam; Essam, Ihab; Zaky, Seif-Eldin; Abdelhakim, Mohamed; Fahim, Natalie; Bayoumi, Razan; Hindy, Hanan. Affiliation: Ain Shams Univ, Fac Comp & Informat Sci, Cairo, Egypt

ISBN: (print) 9798350367782; 9798350367775

The automation of guitar tablature generation from video inputs holds significant promise for enhancing music education, transcription accuracy, and performance analysis. Existing methods face challenges with consistency and completeness, particularly in detecting fretboards and accurately identifying notes. To address these issues, this paper introduces an advanced approach leveraging deep learning, specifically YOLO models for real-time fretboard detection, and Fourier Transform-based audio analysis for precise note identification. Experimental results demonstrate substantial improvements in detection accuracy and robustness compared to traditional techniques. This paper outlines the development, implementation, and evaluation of these methodologies, aiming to revolutionize guitar instruction by automating the creation of guitar tabs from video recordings.
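To make the Fourier Transform-based note identification step concrete, here is a minimal sketch that finds the strongest FFT bin in a mono signal and maps it to the nearest equal-tempered note. The helper names, the Hann window, and the synthetic test tone are assumptions for illustration; the abstract does not describe the paper's actual audio pipeline, and on a real guitar the strongest partial is not always the fundamental.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest FFT bin in a mono signal."""
    windowed = signal * np.hanning(len(signal))   # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[0] = 0.0                             # ignore the DC component
    return freqs[np.argmax(spectrum)]

def frequency_to_note(freq_hz):
    """Map a frequency to the nearest equal-tempered note name (A4 = 440 Hz)."""
    midi = int(round(69 + 12 * np.log2(freq_hz / 440.0)))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

sample_rate = 44100
t = np.arange(sample_rate) / sample_rate          # one second of audio
tone = np.sin(2 * np.pi * 110.0 * t)              # synthetic A2 (open A string)
note = frequency_to_note(dominant_frequency(tone, sample_rate))
```

Turning a note into a tab entry additionally requires knowing which string and fret produced it, which is presumably where the video-based fretboard detection comes in.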

Keywords: Automated Guitar Transcription; Computer Vision; YOLO; Deep Learning; Frequency Analysis; Fourier Transform; Image Processing; Canny Edge Detection; Fretboard Detection; Audio-visual Integration; Guitar Tablature Generation; Real-time Performance Analysis

SYNTHMANTICLIDAR: A SYNTHETIC DATASET FOR SEMANTIC SEGMENTATION ON LIDAR IMAGING

2024 International Conference on Image Processing

Authors: Montalvo, Javier; Carballeira, Pablo; Garcia-Martin, Alvaro. Affiliation: Univ Autonoma Madrid, Video Proc & Understanding Lab, Madrid, Spain

ISBN: (print) 9798350349405; 9798350349399

Semantic segmentation on LiDAR imaging is increasingly gaining attention, as it can provide useful knowledge for perception systems and has potential for autonomous driving. However, collecting and labeling real LiDAR data is an expensive and time-consuming task. While datasets such as SemanticKITTI [1] have been manually collected and labeled, the introduction of simulation tools such as CARLA [2] has enabled the creation of synthetic datasets on demand. In this work, we present a modified CARLA simulator designed with LiDAR semantic segmentation in mind, with new classes, object labeling that is more consistent with counterparts from real datasets such as SemanticKITTI, and the possibility to adjust the object class distribution. Using this tool, we have generated SynthmanticLiDAR, a synthetic dataset for semantic segmentation on LiDAR imaging, designed to be similar to SemanticKITTI, and we evaluate its contribution to the training process of different semantic segmentation algorithms by using a naive transfer learning approach. Our results show that incorporating SynthmanticLiDAR into the training process improves the overall performance of the tested algorithms, proving the usefulness of our dataset and, therefore, of our adapted CARLA simulator. The dataset and simulator are available at https://***/vpulab/SynthmanticLiDAR.

Keywords: Dataset; LiDAR; Segmentation; Simulator

A Survey on Real-Time Object Detection on FPGAs

IEEE ACCESS 2025, Vol. 13, pp. 38195-38238

Authors: Hozhabr, Seyed Hani; Giorgi, Roberto. Affiliations: Univ Siena, Dept Informat Engn & Math, I-53100 Siena, Italy; Consorzio Interuniv Nazl Informat, I-00185 Rome, Italy

This paper focuses on real-time object detection systems, analyzing existing Field-Programmable Gate Array (FPGA) implementations that aim to achieve the best efficiency, performance, and accuracy at the same time. These three metrics are typically crucial for domains such as autonomous driving and robotics. Fortunately, recent advancements in object detection models, particularly those based on Convolutional Neural Networks (CNNs), have significantly improved object detection accuracy and speed. When these models are combined with FPGAs, it is possible to achieve even greater power efficiency and to satisfy real-time constraints more easily. FPGAs can deliver low latency and high throughput by leveraging true parallelism, making them suitable platforms for developing real-time object detection systems. This paper reviews existing literature on FPGA-based real-time object detection, discussing commonly used algorithms, acceleration techniques, and optimization strategies. Evaluation metrics and typical datasets for assessing real-time systems are also examined. We have compared the performance of these implementations by using pixel throughput as a fair metric across different systems while processing video streams or images. Insights into state-of-the-art works, comparative analysis, challenges, and future research directions are provided to guide researchers interested in leveraging FPGA devices for real-time object detection applications.
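The sketch below shows why a pixel-throughput metric can be fairer than frame rate alone when systems process different resolutions. It assumes pixel throughput means frame width times frame height times frames per second; the survey's exact definition is not given in the abstract, and the two systems are hypothetical.

```python
def pixel_throughput(width, height, fps):
    """Pixels processed per second for a stream of width x height frames."""
    return width * height * fps

# Hypothetical systems, not taken from the survey.
system_a = pixel_throughput(416, 416, 60)     # small frames, high frame rate
system_b = pixel_throughput(1920, 1080, 30)   # full-HD frames, lower frame rate
```

Here `system_a` has twice the frame rate but `system_b` processes roughly six times as many pixels per second, so ranking by frames per second alone would be misleading.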

Keywords: Object detection; Real-time systems; Field programmable gate arrays; Detectors; Surveys; Accuracy; Reviews; Image edge detection; Feature extraction; Optimization; Real-time object detection; FPGA; convolutional neural networks (CNNs); hardware accelerator

Non-invasive respiratory infection monitoring using AI-driven thermal imaging and signal classification

SIGNAL IMAGE AND VIDEO PROCESSING 2025, Vol. 19, No. 7, pp. 1-14

Authors: Abisha, D. Affiliation: Natl Engn Coll, Dept Comp Sci & Engn, Kovilpatti, Tamil Nadu, India

The COVID-19 pandemic has highlighted the need for efficient and non-contact health screening methods. Signal-based infrared imaging is an emerging field in biomedical engineering that enables remote monitoring of vital signs. While fever is a common symptom, respiratory abnormalities often appear earlier, necessitating advanced screening systems that monitor both body temperature and respiratory patterns. This research presents an artificial intelligence-based health screening device that identifies human respiratory patterns using a deep learning model. The device is built with a Convolutional Neural Network (CNN) to extract features and a Long Short-Term Memory (LSTM) network to classify time-series patterns. A Softmax classifier assigns the final respiratory pattern class. The model is trained on a specialized dataset of six breathing signal patterns, making it an effective model for real-time public health surveillance. The experimental results demonstrate that the proposed CNN-LSTM model achieves 91% accuracy, 90% precision, 93% recall, and an F1-score of 91%. It can be scaled up further for real-time medical applications, paving the way for future advancements in automated health surveillance.
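The reported metrics can be checked for internal consistency, since the F1-score is the harmonic mean of precision and recall. The short sketch below applies that standard formula to the figures quoted in the abstract; the function name is illustrative, and the abstract does not say how the metrics were averaged across the six classes.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract (90% and 93%).
f1 = f1_score(0.90, 0.93)
```

Rounded to two decimals this gives 0.91, which agrees with the reported F1-score of 91%.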

Keywords: Respiratory signals; Health screening; Feature extraction; Signal processing; Deep learning techniques; Classification of patterns

An Empirical Study on Teaching Management and Quality Feedback Based on Human Behavior Detection

JOURNAL OF ORGANIZATIONAL AND END USER COMPUTING 2024, Vol. 36, No. 1, p. 1

Authors: Zhang, Qian; Osman, Siti Zuraidah Md; Li, Hongming; Lin, Xiao. Affiliations: Univ Sains Malaysia, Sch Educ Studies, Minden, Malaysia; Univ Florida, Coll Educ, Gainesville, FL, USA; Shandong Univ Arts, Sch Int Art Exchange, Jinan, Peoples R China

In the current education field, the assessment of teaching management quality mostly relies on subjective judgment and static data, and lacks a real-time and dynamic feedback mechanism. In this study, we propose a deep learning-based human behavior analysis method, which aims to assess teaching management quality in real time by analyzing the behaviors of teachers and students in the classroom. First, in order to detect individual students in the video stream, an augmented detection framework based on YOLO v5s is introduced to process and analyze human actions and interaction patterns in the video data. Next, we design a channel residual decoupled convolutional neural network to recognize the different states of students. Teaching management quality is assessed by detecting students' classroom attention scores. Experiments were conducted in different disciplines and teaching management environments to collect data and train the model, and the results show that the method can effectively improve the objectivity and accuracy of teaching management quality assessment.

Keywords: Deep Learning; Object Detection; Behavior Recognition; Teaching Management Quality; Image Processing

Dynamic Tactical Image Recognition and Analysis in Football Matches Using Convolutional Neural Networks

TRAITEMENT DU SIGNAL 2025, Vol. 42, No. 1, pp. 583-592

Authors: Xie, Qi. Affiliation: Baoji Univ Arts & Sci, Sch Phys Educ, Baoji 721000, Peoples R China

With the increasing complexity of modern football tactics, how to intelligently and accurately analyze tactical changes in real time during matches has become an important research direction. Traditional manual tactical analysis methods are inefficient and susceptible to subjective bias. Therefore, using computer vision and deep learning technologies for tactical image recognition and analysis in football matches has gradually become a research hotspot. Convolutional Neural Networks (CNNs), as a powerful image processing tool, have been widely applied in video analysis and player detection. However, multi-target motion prediction and tracking management in dynamic football match scenes still face significant challenges. Existing research mainly focuses on static image analysis or simple player tracking, but the high-frequency image updates, player interactions, and occlusion issues in football matches complicate multi-target tracking. While some deep learning-based methods for multi-target detection and tracking have made progress, challenges remain, such as handling high-density player targets and improving motion trajectory prediction accuracy. To address these shortcomings, this study proposes two core techniques based on CNNs: first, multi-target motion prediction, which forecasts players' future positions based on historical motion data; second, multi-target tracking management, which uses deep learning to track and manage each player's movement trajectory in real time. Through these two techniques, this research aims to improve the real-time performance and accuracy of tactical analysis in football matches, providing coaches and analysts with more scientific and efficient tactical decision-making support.

Keywords: CNN; football matches; dynamic tactical image; multi-target motion prediction; multi-target tracking management; computer vision

A real-time fine echo generation method of extended false target with radially high-speed moving

IET RADAR SONAR AND NAVIGATION 2023, Vol. 17, No. 2, pp. 312-325

Authors: Lei, Wei; Zhang, Yue; Chen, Zengping. Affiliation: Sun Yat Sen Univ, Sch Elect & Commun Engn, Shenzhen Campus, 66 Gongchang Rd, Shenzhen 518107, Peoples R China

False-target echo generation for Inverse Synthetic Aperture Radar (ISAR) is significant for jamming enemy ISAR and for promoting ISAR development. Generally, the false-target echo must be coherent with the radar, generated in real time, and fine. However, conventional methods, such as the digital image synthesizer (DIS), cannot meet those requirements. Moreover, existing methods do not consider the target's radial motion. To meet those demands, we propose an improved method in this study. We equivalently model echo formation as the synthesis of two independent parts: (1) the echo of a remote target with radial motion and (2) the echo of a nearby extended target. In part one, accuracy is improved by utilising the Inner Pulse Motion (IPM) model, and complexity is reduced by deducing it as a frequency offset modulation. In part two, the fine extended target echo is constructed by convolution filtering, whose resource consumption can be greatly reduced by separating it into an offline stage and a real-time stage. Our method is verified by algorithm simulations and actual experiments. The results indicate that it can build the fine false-target echo in real time and can adapt to the target's radial velocity and to different resolutions and sizes. Compared with the conventional DIS method, our method reduces the computational complexity significantly and has more comprehensive functions.
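The two-part decomposition in this abstract can be sketched in a few lines: a frequency offset standing in for radial motion, followed by convolution with an extended-target range profile. This is only a schematic under stated assumptions. The function name, the order of the two steps, the given Doppler value, and the scatterer amplitudes are hypothetical, and the sketch implements neither the paper's IPM model nor its offline/real-time split.

```python
import numpy as np

def false_target_echo(pulse, sample_rate, doppler_hz, scatterers):
    """Frequency-shift an intercepted pulse, then convolve it with a range profile.

    doppler_hz: frequency offset standing in for the radial motion (assumed given).
    scatterers: complex amplitudes of the extended target along range (hypothetical).
    """
    t = np.arange(len(pulse)) / sample_rate
    shifted = pulse * np.exp(2j * np.pi * doppler_hz * t)  # part 1: frequency offset
    return np.convolve(shifted, scatterers)                # part 2: extended target

sample_rate = 1_000_000
n = 1000
pulse = np.exp(2j * np.pi * 50_000 * np.arange(n) / sample_rate)  # 50 kHz tone
single_point = np.array([1.0 + 0j])                               # one scatterer
echo = false_target_echo(pulse, sample_rate, 20_000, single_point)
peak_hz = np.fft.fftfreq(n, d=1 / sample_rate)[np.argmax(np.abs(np.fft.fft(echo)))]
```

With a single unit scatterer the convolution leaves the pulse unchanged, so the spectral peak simply moves from 50 kHz to 70 kHz, which isolates the effect of the frequency offset step.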

Keywords: Filtering methods in signal processing; extended false target; Radar equipment, systems and applications; radial moving; jamming; echo formation; fine extended target echo; synthetic aperture radar; high-speed moving; real-time fine echo generation method; radar imaging; Optical, image and video signal processing; conventional DIS method; nearby extended target; remote target; real-time stage; Electromagnetic compatibility and interference; echo; enemy ISAR; digital image synthesizer; false-target echo generation; Inverse Synthetic Aperture Radar; image filtering

Research on Video Object Detection Based on Real-Time Linux

6th IEEE International Conference on Power, Intelligent Computing and Systems, ICPICS 2024

Authors: Yang, Jiying; Long, Qi; Zhu, Xiaoyun; Yang, Yuan. Affiliation: Yunnan Open University, Yunnan Province, Kunming, China

ISBN: (print) 9798350374315

Object detection is an important research topic in computer vision and one of the basic technologies for understanding image content. A real-time operating system is an operating system that can complete the processing of requested tasks within a specified time; timely response and high reliability are its main characteristics. Since video-based target detection algorithms place high demands on computing power and real-time performance, this paper proposes deploying the target detection algorithm on a real-time operating system. Experiments verify that the characteristics of the real-time operating system can improve the real-time performance of the target detection algorithm. For the hardware of the industrial computer used in this paper, the principle and construction process of the Xenomai real-time operating system are analyzed, and a scheme for building a Linux+Xenomai real-time operating system on the industrial computer is proposed. For the application scenario with a stable background and a single target, object detection based on image processing is studied. Based on the background difference method and the three-frame difference method, an improved algorithm based on an adaptive detection window of the target region is proposed. Experimental results show that the improved algorithm has better real-time performance than the basic algorithm. For the application scenario with a complex background and multiple targets, object detection based on deep learning is studied, and the fully convolutional neural network in the Dlib machine learning library is selected for research and implementation. According to the hardware and system environment of this paper, a method for estimating the computational scale of the fully convolutional neural network is proposed, and a method of deploying the network model trained in the GPU environment in the real-time operating system ...
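The three-frame difference method named in this abstract can be sketched as follows: a pixel counts as moving only if it changes in both consecutive frame pairs. The threshold value and the `bounding_window` helper are assumptions for illustration. The helper only hints at how a detection window might restrict later processing to the target region and is not the paper's adaptive-window algorithm.

```python
import numpy as np

def three_frame_difference(prev, curr, nxt, threshold=25):
    """Boolean motion mask: pixels that differ in both consecutive frame pairs."""
    # Widen the dtype so uint8 subtraction cannot wrap around.
    prev, curr, nxt = (f.astype(np.int16) for f in (prev, curr, nxt))
    d1 = np.abs(curr - prev) > threshold
    d2 = np.abs(nxt - curr) > threshold
    return d1 & d2

def bounding_window(mask):
    """Smallest (row0, row1, col0, col1) window containing the mask, or None."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    return rows.min(), rows.max() + 1, cols.min(), cols.max() + 1

# Synthetic 8x8 grayscale frames with a bright 2x2 object moving 2 pixels per frame.
frames = [np.zeros((8, 8), dtype=np.uint8) for _ in range(3)]
for i, f in enumerate(frames):
    f[3:5, 1 + 2 * i:3 + 2 * i] = 200
mask = three_frame_difference(*frames)
window = bounding_window(mask)
```

In this synthetic case the mask isolates the object's position in the middle frame, and the window around it could be used to limit the search region in the following frames.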

Keywords: Convolutional neural networks
