In recent years, the global demand for high-resolution videos and the emergence of new multimedia applications have created the need for a new video coding standard. Therefore, in July 2020, the versatile video coding (VVC) standard was released, providing up to 50% bit-rate savings for the same video quality compared to its predecessor, high-efficiency video coding (HEVC). However, these bit-rate savings come at the cost of high computational complexity, particularly for live applications and on resource-constrained embedded devices. This paper evaluates two optimized VVC software decoders, OpenVVC and the Versatile video deCoder (VVdeC), designed for low-resource platforms. These decoders exploit optimization techniques such as data-level parallelism using single instruction, multiple data (SIMD) instructions and functional-level parallelism using frame-, tile-, and slice-based parallelism. Furthermore, a comparison of decoding runtime, energy, and memory consumption between the two decoders is presented on two different resource-constrained embedded devices. The results show that both decoders achieve real-time decoding of full high-definition (FHD) resolution on the first platform using 8 cores and real-time high-definition (HD) decoding on the second platform using only 4 cores, with comparable average energy consumption: around 26 J and 15 J for the 8-core and 4-core platforms, respectively. Furthermore, OpenVVC showed better memory usage, with a lower average peak memory consumption during runtime than VVdeC.
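The functional-level parallelism described above can be illustrated with a minimal sketch: tiles are self-contained coding regions, so each one can be decoded independently and the frame assembled afterwards. The `decode_tile` body below is a stand-in (it just sums coefficients), not the actual OpenVVC or VVdeC decoding logic.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_tile(tile):
    # Placeholder for entropy decoding + reconstruction of one tile;
    # here we simply "reconstruct" by summing the tile's coefficients.
    return sum(tile)

def decode_frame_parallel(tiles, workers=4):
    # Tiles have no cross-tile dependencies, so each can run on its own
    # core; the frame is assembled once every tile has finished.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(decode_tile, tiles))

frame_tiles = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(decode_frame_parallel(frame_tiles))  # [3, 7, 11, 15]
```

The same pattern extends to frame- and slice-level parallelism, with the work unit and dependency tracking changing accordingly.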
Most of the conventional power system remote video monitoring systems are designed based on the SIP principle. In the actual monitoring operation process, there are problems such as poor real-time monitoring and high ...
Dance style recognition through video analysis during university training can significantly benefit both instructors and novice dancers. Employing video analysis in training offers substantial advantages, including the potential to train future dancers using innovative technologies. Over time, intricate dance gestures can be honed, reducing the burden on instructors who would otherwise need to provide repetitive demonstrations. Recognizing dancers' movements, evaluating and adjusting their gestures, and extracting cognitive functions for efficient evaluation and classification are pivotal aspects of our model. Deep learning currently stands as one of the most effective approaches for achieving these objectives, particularly with short video clips. However, limited research has focused on the automated analysis of dance videos for training purposes and assisting instructors. In addition, assessing the quality and accuracy of performance video recordings presents a complex challenge, especially when judges cannot fully focus on the on-stage performance. This paper proposes an alternative to manual evaluation through a video-based approach for dance assessment. By utilizing short video clips, we conduct dance analysis employing techniques such as fine-grained dance style classification in video frames, convolutional neural networks (CNNs) with channel attention mechanisms (CAMs), and autoencoders (AEs). These methods enable accurate evaluation and data gathering, leading to precise conclusions. Furthermore, utilizing cloud space for real-time processing of video frames is essential for timely analysis of dance styles, enhancing the efficiency of information processing. Experimental results demonstrate the effectiveness of our evaluation method in terms of accuracy and F1-score, with accuracy exceeding 97.24% and the F1-score reaching 97.30%. These findings corroborate the efficacy and precision of our approach in dance evaluation analysis.
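A channel attention mechanism of the kind the abstract mentions can be sketched in squeeze-and-excitation style: pool each channel to a scalar, pass the channel vector through a small bottleneck, and rescale channels by the resulting sigmoid gates. The random weights below stand in for learned parameters; this is an illustrative sketch, not the paper's architecture.

```python
import numpy as np

def channel_attention(feature_map, reduction=2):
    """Squeeze-and-excitation style channel attention on a (C, H, W) map."""
    c = feature_map.shape[0]
    # Squeeze: global average pooling per channel.
    z = feature_map.mean(axis=(1, 2))                        # (C,)
    # Excitation: tiny bottleneck MLP (random weights here; learned in practice).
    rng = np.random.default_rng(0)
    w1 = rng.standard_normal((c // reduction, c))
    w2 = rng.standard_normal((c, c // reduction))
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0))))  # sigmoid gates in (0, 1)
    # Rescale each channel by its gate.
    return feature_map * s[:, None, None]

x = np.ones((4, 8, 8))
out = channel_attention(x)
print(out.shape)  # (4, 8, 8)
```

The gating lets the network emphasize channels that respond to discriminative motion cues while suppressing the rest.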
Video action recognition, as one of the fundamental tasks in video understanding, relies crucially on accurate temporal modeling. However, accurately modeling the temporal information of videos remains a challenging task. To address this problem, we design two new modules: the Spatial Motion Extraction (SME) module and the Spatio-temporal Motion Excitation (STME) module. The SME module features two branches for extracting motion and spatial features. The motion branch refines pixel differences between neighboring frames through a channel attention module, enhancing detailed motion features. These features are fused with spatial information to yield fine-grained local spatio-temporal features. The STME module, comprising the multi-motion excitation (MME), temporal excitation (TE), and spatio-temporal excitation (STE) sub-modules, efficiently captures long-range motion, temporal, and global spatio-temporal features. The MME introduces a bi-directional, multi-scale structure for effective long-range motion extraction, while the TE module employs a hierarchical pyramid with residual connectivity for fine-grained long-range temporal extraction. The STE module utilizes 3D convolutional layers for global spatio-temporal feature extraction. The seamless integration of these sub-modules within a standard ResNet network forms the Spatio-temporal Motion Excitation Network. Extensive evaluations on Something-Something V1 and V2 and HMDB51 datasets against state-of-the-art methods demonstrate the effectiveness of our approach in achieving accurate recognition of both simple and complex video actions.
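The raw signal the SME motion branch starts from is the pixel-wise difference between neighboring frames. A minimal sketch (omitting the channel-attention refinement the paper applies afterwards):

```python
import numpy as np

def motion_features(frames):
    """Pixel-wise differences between neighboring frames.

    frames: (T, H, W) array; returns a (T-1, H, W) array of inter-frame
    differences, the raw motion cue before any learned refinement.
    """
    return frames[1:] - frames[:-1]

# A clip whose brightness ramps by 1 per frame: every difference is 1.
clip = np.stack([np.full((2, 2), t, dtype=float) for t in range(4)])
diffs = motion_features(clip)
print(diffs.shape, diffs[0].tolist())  # (3, 2, 2) [[1.0, 1.0], [1.0, 1.0]]
```

Static regions cancel to zero in these differences, so subsequent layers can focus capacity on moving content.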
As violent criminals, such as child sex offenders, tend to have high recidivism rates in modern society, there is a need to prevent such offenders from approaching socially disadvantaged and crime-prone areas, such as schools or childcare centers. Accordingly, national governments and related institutions have installed surveillance cameras and provided additional personnel to manage and monitor them via video surveillance equipment. However, naked-eye monitoring by guards and manual image processing cannot properly evaluate the video captured by surveillance cameras. To address the various problems of conventional systems that simply store and retrieve image data, a system is needed that can actively classify captured images in real time, in addition to assisting surveillance personnel. Therefore, this paper proposes a video surveillance system based on a composable deep face recognition method. The proposed system detects the faces of criminals in real time from videos captured by a surveillance camera and notifies relevant institutions of the appearance of criminals. For real-time face detection, a down-sampled image forked from the original is used to localize unspecified faces. To improve accuracy and confidence in the recognition task, a scoring method based on face tracking is proposed. The final score combines the recognition confidence and the standard score to determine the embedding distance from the criminal face embedding data. The blind spots of surveillance personnel can be effectively addressed through early detection of criminals approaching crime-prone areas. The contributions of the paper are as follows. The proposed system can process images from surveillance cameras in real time by using down-sampling. It can effectively identify the identity of criminals by using a face tracking ID unit and minimizes prediction reversal by solving the congested embedding problem in the feature space that may occur when performing identification matching on a la
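The scoring idea, combining recognition confidence with a standard score of the embedding distance, can be sketched as follows. The function name, box of gallery embeddings, and the exact way the two terms are combined are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def identity_score(query, gallery, confidences):
    """Combine recognition confidence with a standard score of embedding distance.

    query: embedding of the tracked face; gallery: known criminal embeddings.
    The standard (z-) score normalizes each gallery distance against the
    distribution of all distances, so the final score is scale-free.
    """
    dists = np.linalg.norm(gallery - query, axis=1)
    z = (dists - dists.mean()) / dists.std()
    # Lower distance -> more negative z -> higher final score.
    return confidences - z

gallery = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
query = np.array([0.1, 0.0])
scores = identity_score(query, gallery, confidences=np.array([0.9, 0.8, 0.7]))
print(int(scores.argmax()))  # 0
```

Accumulating such scores per face-tracking ID over several frames is what suppresses the single-frame prediction reversals the abstract mentions.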
ISBN:
(Print) 9798350367164; 9798350367157
The detection of shot boundaries (hardcuts and short dissolves), sampling structure (progressive / interlaced / pulldown), and dynamic keyframes in a video are fundamental video analysis tasks that must be performed before any further high-level analysis. We present a novel algorithm that performs all of these analysis tasks in a unified way, utilizing a combination of inter-frame and intra-frame measures derived from the motion field and normalized cross correlation. The algorithm runs four times faster than real time due to sparse and selective calculation of these measures. An initial evaluation furthermore shows that the proposed algorithm is extremely robust even for challenging content showing large camera or object motion, flashlights, flicker, or low contrast / noise.
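One of the inter-frame measures named above, normalized cross correlation, already suffices for a toy hardcut detector: frames within a shot correlate strongly, while frames across a cut do not. The threshold value is an illustrative assumption.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation between two frames (flattened)."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hardcuts(frames, threshold=0.5):
    """Flag a shot boundary wherever neighboring frames correlate poorly."""
    return [t + 1 for t in range(len(frames) - 1)
            if ncc(frames[t], frames[t + 1]) < threshold]

rng = np.random.default_rng(1)
base = rng.random((8, 8))
# Two nearly identical frames, then an inverted (anti-correlated) frame.
clip = [base, base + 0.01 * rng.random((8, 8)), 1.0 - base]
print(hardcuts(clip))  # [2]
```

The paper's sparse, selective evaluation of such measures (rather than dense per-pixel computation on every frame) is what yields the four-times-faster-than-real-time speed.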
Authors:
Wang, Xingwang; Shen, Muzi; Yang, Kun
Jilin Univ, Coll Comp Sci & Technol, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
Jilin Univ, Software Coll, Changchun 130012, Peoples R China
Nanjing Univ, Sch Intelligent Software & Engn, Suzhou 210093, Jiangsu, Peoples R China
Univ Essex, Sch Comp Sci & Elect Engn, Colchester CO4 3SQ, England
Performing video analytics tasks based on deep neural networks (DNNs) on resource-constrained mobile devices is extremely challenging because of the huge volume of video data and the computationally intensive nature of DNNs. One promising solution is to offload tasks to edge servers for execution. However, due to explosive growth in the number of end devices, more and more mobile devices are connected to the edge servers. This makes it difficult for the edge server to meet the specific service level objective (SLO) of on-edge video analytics when facing concurrent computing requests, especially in real-time scenarios. To address this issue, this article presents EHCI, an on-edge high-throughput collaborative inference framework for real-time video analytics. On the mobile device, EHCI crops the key regions from the current video frame based on the local detection cache and offloads these regions to the edge server, which can significantly reduce bandwidth consumption and computation costs. Besides, considering concurrent DNN inference requests from multiple mobile devices, EHCI uses a key region patching method to achieve high-throughput DNN inference on the edge server, along with a scheduling algorithm to meet the SLO for each mobile device. Testing validates that EHCI outperforms the state-of-the-art technology by 159% in achieved throughput and reduces the average end-to-end delay by 36%, while the sacrifice in application accuracy remains within a reasonable range.
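The key-region cropping step can be sketched as follows: boxes from the cached detection result are padded (so a slightly moved object is still covered) and only those patches are offloaded. The function name, box format, and padding value are illustrative assumptions, not EHCI's actual interface.

```python
def crop_key_regions(frame, cached_detections, pad=4):
    """Crop patches around cached detections so only they are offloaded.

    frame: 2-D list of pixels; cached_detections: (x0, y0, x1, y1) boxes
    from the previous inference result.
    """
    h, w = len(frame), len(frame[0])
    crops = []
    for x0, y0, x1, y1 in cached_detections:
        # Pad the cached box to tolerate small object motion between frames.
        y0, y1 = max(0, y0 - pad), min(h, y1 + pad)
        x0, x1 = max(0, x0 - pad), min(w, x1 + pad)
        crops.append([row[x0:x1] for row in frame[y0:y1]])
    return crops

frame = [[0] * 100 for _ in range(100)]
crops = crop_key_regions(frame, [(10, 20, 30, 40)])
print(len(crops[0]), len(crops[0][0]))  # 28 28
```

Here a 28x28 patch replaces a 100x100 frame, illustrating where the bandwidth and edge-compute savings come from.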
ISBN:
(Print) 9798350349405; 9798350349399
Learned wavelet image and video coding approaches provide an explainable framework with a latent space corresponding to a wavelet decomposition. The wavelet image coder iWave++ achieves state-of-the-art performance and has been employed for various compression tasks, including lossy as well as lossless image, video, and medical data compression. However, the approaches suffer from slow decoding speed due to the autoregressive context model used in iWave++. In this paper, we show how a parallelized context model can be integrated into the iWave++ framework. Our experimental results demonstrate a speedup factor of over 350 and 240 for image and video compression, respectively. At the same time, the rate-distortion performance in terms of Bjontegaard delta bitrate is slightly worse by 1.5% for image coding and 1% for video coding. In addition, we analyze the learned wavelet decomposition by visualizing its subband impulse responses.
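The explainable latent space referred to above corresponds to a wavelet decomposition. A minimal one-level 2-D Haar decomposition (the classical, non-learned analogue of what iWave++ learns) shows the subband structure:

```python
import numpy as np

def haar_2d(img):
    """One level of a 2-D Haar decomposition into LL, LH, HL, HH subbands."""
    a = (img[:, 0::2] + img[:, 1::2]) / 2   # horizontal average
    d = (img[:, 0::2] - img[:, 1::2]) / 2   # horizontal detail
    ll = (a[0::2] + a[1::2]) / 2            # low-low: coarse approximation
    lh = (a[0::2] - a[1::2]) / 2            # vertical detail
    hl = (d[0::2] + d[1::2]) / 2            # horizontal detail
    hh = (d[0::2] - d[1::2]) / 2            # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_2d(img)
print(ll.shape)  # (2, 2)
```

The autoregressive context model that slows decoding operates over such subband coefficients; parallelizing it, as the paper does, removes the coefficient-by-coefficient serial dependency.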
This effort aims to create a hardware resource-efficient real-time video processing system employing PolarFire FPGA technology. This paper presents the interface between two IMX 334 camera modules and a PolarFire FP...
ISBN:
(Print) 9798350355376; 9798350355369
Real-time teleoperation of robotic systems over the Internet is a desirable technology in many ways. Latency of the video feedback has been hampering its development. This paper takes the application of remote driving to introduce an unconventional codec that provides very low latency for Internet-based video streaming. The proposed method preserves just enough information in the video for the essential perception and decision-making of a remote driver. Thanks to a unique integration of several image processing and data streaming techniques, the proposed codec can achieve a glass-to-glass latency of around 90 ms. A series of tests conducted over the real consumer Internet analyzes the latency and verifies the effectiveness of remote driving with the proposed codec.
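Glass-to-glass latency is the sum of every stage from camera sensor to remote display. The per-stage numbers below are illustrative assumptions chosen to land near the ~90 ms figure, not measurements from the paper:

```python
def glass_to_glass_ms(stages):
    """Sum per-stage latencies (ms) along the capture-to-display pipeline."""
    return sum(stages.values())

# Hypothetical budget for a 60 fps capture/display pipeline.
budget = {
    "capture": 17,   # ~one frame interval at 60 fps
    "encode": 10,
    "network": 35,
    "decode": 8,
    "display": 17,   # ~one refresh interval at 60 Hz
}
print(glass_to_glass_ms(budget))  # 87
```

Such a budget makes clear why a codec with near-zero encode/decode delay matters: the network and the fixed frame intervals already consume most of the allowance.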