One of the most important new tools of the Versatile Video Coding (VVC) standard is Affine Motion Estimation (AME). The AME contribution to coding efficiency comes with a high computational cost, especially fo...
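As background for the AME entry above: VVC's four-parameter affine model derives a motion vector at each sample position (x, y) of a block of width W from two control-point motion vectors, mv_0 (top-left) and mv_1 (top-right). The form below is the standard one from the VVC literature, added here for reference; it is not taken from the truncated abstract.

```latex
% Four-parameter affine motion model used by VVC's AME
% (W: block width; mv_0, mv_1: control-point motion vectors).
\begin{aligned}
mv_x(x,y) &= \frac{mv_{1x}-mv_{0x}}{W}\,x \;-\; \frac{mv_{1y}-mv_{0y}}{W}\,y \;+\; mv_{0x},\\
mv_y(x,y) &= \frac{mv_{1y}-mv_{0y}}{W}\,x \;+\; \frac{mv_{1x}-mv_{0x}}{W}\,y \;+\; mv_{0y}.
\end{aligned}
```

Evaluating this field per 4×4 subblock for every control-point candidate during the rate-distortion search is the main source of AME's computational cost.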
ISBN: 9798331522124 (digital); 9798331522131 (print)
This paper presents an evaluation of the Coarse-to-Fine Spatio-Temporal Information Fusion (CF-STIF) network for enhancing the quality of compressed videos across multiple codecs, including HEVC, VVC, VP9, and AV1. The CF-STIF network leverages spatio-temporal fusion and deep learning techniques to reduce compression artifacts and improve video quality. The evaluation extends existing methods by employing multiple quality metrics, namely PSNR, SSIM, and LPIPS. The CF-STIF network was integrated with the Spatio-Temporal Deformable Fusion (STDF) training scheme to train and run the model. Results demonstrate that CF-STIF achieves the highest quality improvements for HEVC-encoded videos, with an average PSNR increase of 0.813 dB and superior visual quality as measured by SSIM. However, performance drops significantly for the other codecs, particularly AV1, highlighting the need for future adaptations to optimize CF-STIF for diverse compression standards.
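A minimal sketch of the three-metric evaluation described above, assuming frames arrive as uint8 RGB NumPy arrays of identical shape. The helper name `evaluate_frame` and the use of the `lpips` package are illustrative choices, not the paper's actual tooling.

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_model = lpips.LPIPS(net="alex")  # perceptual-metric network


def to_lpips_tensor(frame: np.ndarray) -> torch.Tensor:
    # lpips expects NCHW float tensors scaled to [-1, 1]
    t = torch.from_numpy(frame).permute(2, 0, 1).float() / 127.5 - 1.0
    return t.unsqueeze(0)


def evaluate_frame(reference: np.ndarray, enhanced: np.ndarray) -> dict:
    return {
        "psnr": peak_signal_noise_ratio(reference, enhanced, data_range=255),
        "ssim": structural_similarity(reference, enhanced,
                                      channel_axis=2, data_range=255),
        "lpips": float(lpips_model(to_lpips_tensor(reference),
                                   to_lpips_tensor(enhanced))),
    }
```

Higher PSNR/SSIM and lower LPIPS indicate better enhancement; averaging these per-frame scores over a sequence gives the numbers reported in the abstract.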
ISBN: 9798331522124 (digital); 9798331522131 (print)
The growing demand for high-definition online videos emphasizes the need for efficient video codecs like H.266/VVC, which offer significant compression potential. However, its implementation presents challenges, particularly in terms of computational cost, as is the case with the Multiple Transform Selection (MTS) tool. This study analyzes the performance of the MTS modes, showing that explicit MTS improves coding efficiency but increases encoding time, while implicit MTS offers modest efficiency gains at a lower computational cost. A machine learning-based approach using decision trees is proposed to accelerate the encoder decisions for both intra- and inter-predicted blocks in explicit MTS, reducing encoding time by an average of 7.98% with only a 0.89% increase in BD-rate. These results highlight the potential for optimizing explicit MTS in both intra and inter transforms.
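A minimal sketch of a decision-tree gate for the explicit-MTS decision, in the spirit of the approach above. The feature set, the synthetic training data, and the `should_try_explicit_mts` helper are illustrative assumptions; the abstract does not disclose the paper's actual features or datasets.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
# Hypothetical per-block features: width, height, QP, residual energy,
# and the RD cost of the default DCT-II transform.
X_train = rng.random((5000, 5))
# Placeholder labels: 1 = an explicit MTS candidate won, 0 = DCT-II won.
y_train = (X_train[:, 3] > 0.6).astype(int)

clf = DecisionTreeClassifier(max_depth=5)  # shallow tree: cheap at encode time
clf.fit(X_train, y_train)


def should_try_explicit_mts(block_features: np.ndarray) -> bool:
    # Encoder-side gate: test the extra MTS transforms only when the tree
    # predicts they are likely to beat the default DCT-II.
    return bool(clf.predict(block_features.reshape(1, -1))[0])


print(should_try_explicit_mts(rng.random(5)))
```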
ISBN: 9798331522124 (digital); 9798331522131 (print)
AV1 is a codec developed by major technology companies for current and future commercial video applications. It introduces and improves several tools from its predecessor VP9, designed for various video scenarios. One of the improved tools is Fractional Motion Estimation (FME), which generates sub-pixel predictors. AV1 employs four sets of interpolation filters, requiring significant computational effort during the Interpolation Filter Search (IFS) to identify the best filter to use. This work proposes FIFS, a machine learning method developed to reduce the processing time of IFS while keeping the impact on coding efficiency minimal. The method achieves a reduction of over 52% in IFS time, with only a slight BD-BR increase of 0.14%. To the best of the authors' knowledge, this is the first work in the literature to propose a machine learning-based approach for the AV1 IFS.
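One plausible formulation of an IFS shortcut such as FIFS: treat the choice among AV1's interpolation filter sets as a multi-class prediction problem and evaluate only the predicted set instead of searching all of them. The features, synthetic data, and `pick_filter_set` helper are assumptions; the abstract does not specify FIFS's model or inputs.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
# Hypothetical pre-IFS features: block area, MV fractional phase (x, y),
# SAD of the integer-pel match, and the neighbors' filter choices.
X_train = rng.random((5000, 5))
y_train = rng.integers(0, 4, 5000)  # placeholder: winning filter set (0..3)

model = DecisionTreeClassifier(max_depth=8).fit(X_train, y_train)


def pick_filter_set(features: np.ndarray) -> int:
    # Skip the exhaustive search: interpolate with one predicted set only.
    return int(model.predict(features.reshape(1, -1))[0])
```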
ISBN: 9798331522124 (digital); 9798331522131 (print)
As the demand for video transmission surges with remote work, education, and streaming services, the need for continuous advancements in video encoding technologies becomes increasingly evident. Adapting to the evolving requirements of efficient video delivery and consumption necessitates ongoing development and enhancement of video coding standards, with Versatile Video Coding (VVC) emerging as a notable example. This paper provides an overview of key algorithms within the inter-frame prediction of VVC, focusing mainly on the Test Zone Search (TZS) and Affine Motion Estimation (AME), two of the most computationally intensive tools in VVC. Furthermore, this paper introduces a fast TZS and AME approach using machine learning, specifically decision trees. The proposed approach achieved an average reduction of over 20% in total VVC encoding time while maintaining less than a 1% impact on BD-BR coding efficiency.
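Decision trees like the ones described above are attractive for encoder integration because they can be flattened into plain if/else rules and transcribed into the encoder's C++ sources (e.g., the VTM reference software), costing only a few comparisons per block. `export_text` is scikit-learn's real API; the feature names and synthetic labels here are placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X_train = rng.random((1000, 5))              # placeholder feature vectors
y_train = (X_train[:, 3] < 0.5).astype(int)  # placeholder "skip AME" labels

features = ["block_width", "block_height", "qp",
            "tzs_best_cost", "mv_magnitude"]
clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

# Human-readable rule dump, ready to transcribe into the encoder.
print(export_text(clf, feature_names=features))
```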
The popularization of mobile phones and other portable multimedia devices paved the way for the increase in video consumption worldwide. However, transmitting uncompressed video is impractical due to the high bandwidth required. To achieve significant compression rates, video codecs usually employ methods that degrade the visual quality perceived by the end user to a non-negligible degree. Different deep learning-based architectures have recently been proposed for Video Quality Enhancement (VQE). Still, most of them are trained and validated on videos generated by a single codec under fixed configurations. With the increasing number of video coding formats and standards on the market, VQE methods that apply to different contexts are desired. This paper proposes a new VQE model based on the Spatio-Temporal Deformable Fusion (STDF) architecture, providing quality gains for videos compressed according to different formats and standards, such as HEVC, VVC, VP9, and AV1. The results demonstrate that by considering different video coding standards and formats when building the STDF model, a significant increase in VQE is achieved, with an average PSNR increment of up to 0.382 dB.
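A minimal sketch of STDF-style alignment and fusion, assuming PyTorch and torchvision. The channel counts, the 3-frame window, and the module names are illustrative simplifications of the published STDF design, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class TinySTDF(nn.Module):
    def __init__(self, frames: int = 3, feat: int = 32, k: int = 3):
        super().__init__()
        # Offset prediction: looks at the whole temporal window at once.
        self.offset_net = nn.Sequential(
            nn.Conv2d(frames, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 2 * k * k * frames, 3, padding=1),
        )
        # Deformable convolution aligns and fuses the window (one offset
        # group per frame, as in STDF).
        self.fusion = DeformConv2d(frames, feat, k, padding=k // 2)
        # Quality-enhancement tail predicts a residual for the center frame.
        self.qe = nn.Sequential(
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 1, 3, padding=1),
        )

    def forward(self, window: torch.Tensor) -> torch.Tensor:
        # window: (N, frames, H, W) stack of luma frames around the target.
        offsets = self.offset_net(window)
        fused = self.fusion(window, offsets)
        mid = window.shape[1] // 2
        center = window[:, mid : mid + 1]
        return center + self.qe(fused)  # enhanced center frame


# Usage: enhanced = TinySTDF()(torch.rand(1, 3, 64, 64))
```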
Processing and storing the 4D structure of light fields can be challenging and expensive due to the high dimensionality of the data and its unique characteristics. Plenty of works employ convolutional neural networks (CNNs) for light field prediction and encoding. Nonetheless, to the best of our knowledge, the literature lacks an efficiency evaluation of different CNN architectures, as well as of 4D neural networks, for these purposes. Therefore, this paper presents an experimental study that assesses the performance of pipeline and U-Net convolutional neural networks for light field block prediction in both the spatial and angular dimensions. Additionally, we compare these architectures with a novel 4D network that aims to exploit the light field data structure. The results of the study show that the U-Net and 4D networks outperform classical CNN architectures in terms of prediction accuracy and residue generation. Furthermore, prediction along the spatial dimension provides more valuable information for the networks to learn, improving their prediction by 5 dB.
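A minimal U-Net sketch for light field block prediction, assuming PyTorch. The depth, channel widths, and single-block I/O shape are illustrative; the abstract does not give the exact architectures compared in the study.

```python
import torch
import torch.nn as nn


def conv_block(cin: int, cout: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    def __init__(self, cin: int = 1, feat: int = 16):
        super().__init__()
        self.enc1 = conv_block(cin, feat)
        self.enc2 = conv_block(feat, feat * 2)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(feat * 2, feat, 2, stride=2)
        self.dec1 = conv_block(feat * 2, feat)  # skip connection doubles channels
        self.out = nn.Conv2d(feat, cin, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                 # fine detail from the input context
        e2 = self.enc2(self.pool(e1))     # coarser structure at half resolution
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.out(d1)               # predicted block


# Usage: pred = TinyUNet()(torch.rand(1, 1, 32, 32))
```

The skip connection is what distinguishes the U-Net from the "pipeline" CNNs mentioned above: it lets the decoder reuse fine detail that pooling would otherwise discard, which is consistent with the accuracy gap the study reports.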
ISBN: 9798331522124 (digital); 9798331522131 (print)
Lossy video compression introduces visual artifacts that degrade video quality, and deep neural networks (DNNs) are effective at enhancement. However, conventional DNN-based methods often focus on a single video compression standard, limiting their deployment across use cases. To overcome this issue, this study introduces a multi-domain video quality enhancement architecture based on the Spatio-Temporal Deformable Fusion (STDF) technique. This method enables the model to enhance videos compressed with multiple codecs, maintaining reliable performance across standards. After training, the proposed architecture was tested with videos compressed by the High Efficiency Video Coding (HEVC) encoder, the Versatile Video Coding (VVC) encoder, the VP9 codec, and the AOMedia Video 1 (AV1) codec. Results show an average Peak Signal-to-Noise Ratio (PSNR) improvement between 0.228 dB and 0.787 dB.
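A sketch of the multi-domain training idea: pool (compressed, pristine) clip pairs from every codec so the model learns the artifacts of all four standards. The directory layout, file naming, and class name are assumptions, not the paper's actual data pipeline.

```python
from pathlib import Path

from torch.utils.data import Dataset


class MultiCodecPairs(Dataset):
    """(compressed, pristine) clip paths pooled across codec domains."""

    def __init__(self, root: str, codecs=("hevc", "vvc", "vp9", "av1")):
        # Assumed layout: root/<codec>/ holds compressed clips and
        # root/raw/ the pristine originals under matching file names.
        self.pairs = [
            (clip, Path(root, "raw", clip.name))
            for codec in codecs
            for clip in sorted(Path(root, codec).glob("*.yuv"))
        ]

    def __len__(self) -> int:
        return len(self.pairs)

    def __getitem__(self, idx: int):
        compressed, pristine = self.pairs[idx]
        # A real loader would decode the YUV frames here; returning the
        # paths is enough for the sketch.
        return str(compressed), str(pristine)
```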