ISBN (print): 9798400701597
Face reenactment methods attempt to restore and re-animate portrait videos as realistically as possible. Existing methods face a dilemma in quality versus controllability: 2D GAN-based methods achieve higher image quality but suffer in fine-grained control of facial attributes compared with 3D counterparts. In this work, we propose StyleAvatar, a real-time photo-realistic portrait avatar reconstruction method using StyleGAN-based networks, which can generate high-fidelity portrait avatars with faithful expression control. We expand the capabilities of StyleGAN by introducing a compositional representation and a sliding window augmentation method, which enable faster convergence and improve translation generalization. Specifically, we divide the portrait scenes into three parts for adaptive adjustments: facial region, non-facial foreground region, and the background. Besides, our network leverages the best of UNet, StyleGAN and time coding for video learning, which enables high-quality video generation. Furthermore, a sliding window augmentation method together with a pre-training strategy are proposed to improve translation generalization and training performance, respectively. The proposed network can converge within two hours while ensuring high image quality and a forward rendering time of only 20 milliseconds. Furthermore, we propose a real-time live system, which further pushes research into applications. Results and experiments demonstrate the superiority of our method in terms of image quality, full portrait video generation, and real-time re-animation compared to existing facial reenactment methods. Training and inference code for this paper are at https://***/LizhenWangT/StyleAvatar.
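The three-region decomposition described in the abstract can be sketched as a mask-based compositing step. The function and mask names below are illustrative assumptions, not the paper's implementation, which performs the adaptive adjustments inside a StyleGAN-based network:

```python
import numpy as np

def composite_portrait(face, foreground, background, face_mask, fg_mask):
    """Blend three independently generated regions into one frame.

    face, foreground, background: HxWx3 float arrays in [0, 1].
    face_mask, fg_mask: HxW soft masks in [0, 1] (hypothetical names).
    The facial region takes priority, then the non-facial foreground,
    then the background.
    """
    face_mask = face_mask[..., None]
    fg_mask = fg_mask[..., None]
    out = face_mask * face
    out += (1 - face_mask) * fg_mask * foreground
    out += (1 - face_mask) * (1 - fg_mask) * background
    return out
```

Decomposing the scene this way lets each region be modeled and updated at its own rate, which is one plausible reason the paper reports faster convergence.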
ISBN (print): 9798350304572
In recent years, many surveillance cameras have been installed in cities, and human tracking technology has received much attention. In most current human-tracking technologies, servers collect images of people and then analyze their features from the data. In this method, the network loads on the servers increase as the number of people tracked increases, causing problems such as packet loss and loss of real-time performance. In this paper, we propose two real-time human tracking methods. The methods conduct the human tracking process without servers by sharing extracted human features among devices. Experimental evaluations of the amount of communication traffic and processing time using multiple cameras have shown that the two proposed methods can distribute the network load with slight deterioration in processing speed and tracking accuracy.
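The core idea of sharing extracted features among devices rather than raw images can be sketched as a feature-matching step on each camera. The cosine-similarity metric and the 0.8 threshold are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_person(new_feature, shared_features, threshold=0.8):
    """Match a locally extracted feature vector against features
    shared by peer cameras. Returns the best-matching track id, or
    None if nothing clears the (illustrative) similarity threshold."""
    best_id, best_sim = None, threshold
    for track_id, feat in shared_features.items():
        sim = cosine_similarity(new_feature, feat)
        if sim > best_sim:
            best_id, best_sim = track_id, sim
    return best_id
```

Only compact feature vectors cross the network, which is why such a scheme can distribute load that would otherwise concentrate on a server.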
ISBN (print): 9781728198354
Versatile Video Coding (VVC) allows for large compression efficiency gains over its predecessor, High Efficiency Video Coding (HEVC). The added efficiency comes at the cost of increased runtime complexity, especially for encoding. It is thus highly relevant to explore all available runtime reduction options. This paper proposes a novel first pass for two-pass rate control in all-intra configuration, using low-complexity video analysis and a Random Forest (RF)-based machine learning model to derive the data required for driving the second pass. The proposed method is validated using VVenC, an open and optimized VVC encoder. Compared to the default two-pass rate control algorithm in VVenC, the proposed method achieves around 32% reduction in encoding time for the faster preset, while on average only causing 2% BD-rate increase and achieving similar rate control accuracy.
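The idea of replacing a full first encoding pass with cheap analysis plus a Random Forest can be sketched as below. The feature set, the synthetic training target, and the scikit-learn usage are illustrative assumptions, not the paper's actual design:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def frame_features(frame):
    """Low-complexity spatial statistics standing in for the paper's
    video analysis (an illustrative feature choice)."""
    gx = np.abs(np.diff(frame, axis=1)).mean()  # horizontal gradient energy
    gy = np.abs(np.diff(frame, axis=0)).mean()  # vertical gradient energy
    return [frame.mean(), frame.std(), gx, gy]

# Train an RF to predict the per-frame bit cost that a real first-pass
# encode would have measured. The training data here is synthetic: a
# toy proxy maps spatial complexity to bits.
rng = np.random.default_rng(0)
frames = [rng.random((64, 64)) * s for s in rng.uniform(0.2, 1.0, 200)]
X = np.array([frame_features(f) for f in frames])
y = X[:, 1] * 1000 + X[:, 2] * 500  # toy target: complexity -> bits

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
predicted_bits = model.predict(X[:5])  # drives the second pass
```

The predicted per-frame bit budgets would then be fed to the second pass in place of measured first-pass statistics, trading a small rate-control error for a large runtime saving.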
The detection of potentially illicit behaviors from recorded video footage is an emerging field of study in the domain of image processing and computer vision. Detecting suspicious activities is essential for maintain...
ISBN (print): 9798350348439; 9798350384611
Many GPUs have incorporated hardware-accelerated video encoders, which allow video encoding tasks to be offloaded from the main CPU and provide higher power efficiency. Over the years, many new video codecs such as H.265/HEVC, VP9, and AV1 were added to the latest GPU boards. Recently, the rise of live video content such as VTuber streams, game live-streaming, and live event broadcasts drives the demand for high-efficiency hardware encoders in GPUs to tackle these real-time video encoding tasks, especially at higher resolutions such as 4K/8K UHD. In this paper, the RD performance, encoding speed, and power consumption of hardware encoders in several generations of NVIDIA and Intel GPUs, as well as Qualcomm Snapdragon mobile SoCs, were evaluated and compared to their software counterparts, including the latest H.266/VVC codec, using several metrics including PSNR, SSIM, and the machine-learning-based VMAF. The results show that modern GPU hardware encoders can match the RD performance of software encoders in real-time encoding scenarios, and while encoding speed increased in newer hardware, there is mostly negligible RD performance improvement between hardware generations. Finally, the bitrate required for each hardware encoder to match YouTube transcoding quality was also calculated.
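PSNR, one of the metrics used in such comparisons, is computed from the mean squared error between the reference frame and its encoded-then-decoded version; this is the standard definition, not anything specific to this paper:

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB. Higher is better; identical
    frames give infinity."""
    ref = reference.astype(np.float64)
    dst = distorted.astype(np.float64)
    mse = np.mean((ref - dst) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

RD curves are then built by sweeping bitrate and plotting PSNR (or SSIM/VMAF) against the achieved rate for each encoder.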
Denoising videos in real-time is critical in many applications, including robotics and medicine, where varying-light conditions, miniaturized sensors, and optics can substantially compromise image quality. This work proposes the first video denoising method based on a deep neural network that achieves state-of-the-art performance on dynamic scenes while running in real-time on VGA video resolution with no frame latency. The backbone of our method is a novel, remarkably simple, temporal network of cascaded blocks with forward block output propagation. We train our architecture with short, long, and global residual connections by minimizing the restoration loss of pairs of frames, leading to a more effective training across noise levels. It is robust to heavy noise following Poisson-Gaussian noise statistics. The algorithm is evaluated on RAW and RGB data. We propose a denoising algorithm that requires no future frames to denoise a current frame, reducing its latency considerably. The visual and quantitative results show that our algorithm achieves state-of-the-art performance among efficient algorithms, achieving from two-fold to two-orders-of-magnitude speed-ups on standard benchmarks for video denoising.
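The Poisson-Gaussian statistics the method is said to be robust against model signal-dependent shot noise plus sensor read noise, and can be simulated as follows; the parameter names and default values are illustrative:

```python
import numpy as np

def add_poisson_gaussian_noise(image, gain=0.01, read_sigma=0.02, rng=None):
    """Simulate sensor noise on a float image in [0, 1]:
    Poisson shot noise with camera gain `gain`, plus additive
    Gaussian read noise of standard deviation `read_sigma`.
    The variance at intensity x is approximately gain*x + read_sigma**2."""
    rng = np.random.default_rng() if rng is None else rng
    shot = gain * rng.poisson(image / gain)
    read = rng.normal(0.0, read_sigma, image.shape)
    return shot + read
```

Training on such synthetic pairs across noise levels is a common way to obtain robustness to heavy, signal-dependent noise in RAW data.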
Versatile Video Coding (VVC) offers compression efficiency improvements of 50% and 75% compared to High Efficiency Video Coding (HEVC) and Advanced Video Coding (AVC), respectively. However, the VVC encoder software (...
ISBN (print): 9798400704123
Volumetric capture is an important topic in eXtended reality (XR) as it enables the integration of realistic three-dimensional content into virtual scenarios and immersive applications. Certain systems are even capable of delivering these volumetric captures live and in real-time, opening the door to interactive use cases such as immersive videoconferencing. One example of such systems is FVV Live, a Free Viewpoint Video (FVV) application capable of working in real-time with low delay. Current breakthroughs in Artificial Intelligence (AI) in general, and deep learning in particular, report great success when applied to the computer vision tasks involved in volumetric capture, helping to overcome the quality and bandwidth restrictions that these systems often face. Despite their promising results, state-of-the-art approaches still come with the disadvantage of requiring large processing power and time. This project aims to advance the volumetric capture state of the art by applying the previously mentioned deep learning techniques, optimizing the models to work in real-time while still delivering high quality. The technology developed will be validated by integrating it into immersive video communication systems such as FVV Live in order to overcome their main restrictions and to improve the quality delivered to the end user.
ISBN (print): 9798350310856
The trend of recent years is the continuous development of the Internet of Things (IoT). Among such things, a significant share is occupied by visual sensors and video cameras that generate large amounts of data. In turn, real-time video analytics inevitably demands significant storage resources, transmission throughput, and processing power. Thus, the combination of smart cameras with the Cloud/Edge computing paradigm and IoT architectures forms the next generation of video surveillance systems, called the "Internet of Video Things" (IoVT). In this paper, a new IoVT platform is developed that, in addition to harmoniously combining Edge/Cloud computing, uses SDN to overcome challenges such as flexible management, control, and maintenance of IoVT devices. In particular, within the proposed IoVT platform, an algorithm for the dynamic selection of Edge or Cloud computing is implemented using an SDN controller to provide effective video analytics in real-time. This algorithm considers such parameters as the priority of computational tasks, the number of video streams, and the image quality, with the ability to adapt to a specific application by software configuration of the IoVT platform. We also demonstrate the effectiveness of the proposed solutions on real equipment and discuss several promising areas of application of the developed platform.
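A highly simplified sketch of what a dynamic Edge/Cloud selection rule over the three stated parameters (task priority, number of streams, image quality) might look like; the thresholds and decision order are placeholder assumptions, not the paper's algorithm, which runs in the SDN controller:

```python
def select_compute(priority, num_streams, resolution_px,
                   edge_capacity=4, high_priority=2,
                   edge_max_px=1280 * 720):
    """Decide whether a video-analytics task runs at the Edge or in
    the Cloud. All thresholds are illustrative placeholders that a
    real deployment would set via software configuration."""
    if priority >= high_priority and resolution_px <= edge_max_px:
        return "edge"   # latency-critical and light enough for the edge
    if num_streams > edge_capacity or resolution_px > edge_max_px:
        return "cloud"  # edge node overloaded or frames too large
    return "edge"
```

In the paper's setting the equivalent decision would be recomputed as stream counts and priorities change, with the SDN controller steering flows accordingly.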
Visually impaired or blind people need guidance in order to avoid collision risks with outdoor obstacles. Recently, technology has been proving its presence in all aspects of human life, and new devices provide assistance to humans on a daily basis. However, due to real-time dynamics or a lack of specialized knowledge, object detection confronts a reliability difficulty. To overcome the challenge, YOLO Glass, a video-based smart object detection model, has been proposed for visually impaired persons to navigate effectively in indoor and outdoor environments. Initially, the captured video is converted into key frames and pre-processed using a Correlation Fusion-based disparity approach. The pre-processed images were augmented to prevent overfitting of the trained model. The proposed method uses an obstacle detection system based on a Squeeze and Attendant Block YOLO Network model (SAB-YOLO). The proposed system assists visually impaired users in detecting multiple objects and their locations relative to their line of sight, and alerts them by providing audio messages via headphones. The system assists blind and visually impaired people in managing their daily tasks and navigating their surroundings. The experimental results show that the proposed system achieves an accuracy of 98.99%, proving that it can accurately identify objects. The detection accuracy of the proposed method is 5.15%, 7.15% and 9.7% better than existing YOLO v6, YOLO v5 and YOLO v3, respectively.
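The key-frame extraction step mentioned above can be sketched with simple frame differencing; this is an illustrative stand-in, since the abstract does not specify the exact criterion used:

```python
import numpy as np

def extract_key_frames(frames, threshold=10.0):
    """Keep a frame when its mean absolute pixel difference from the
    last kept frame exceeds `threshold` (an illustrative value).
    Reduces redundant frames before detection runs on each key frame."""
    keys = [frames[0]]
    for frame in frames[1:]:
        diff = np.mean(np.abs(frame.astype(np.float64) -
                              keys[-1].astype(np.float64)))
        if diff > threshold:
            keys.append(frame)
    return keys
```

Running the detector only on key frames keeps the pipeline responsive on wearable hardware, where per-frame inference would be too slow.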