检索结果-内蒙古大学图书馆

An adaptive CU size decision algorithm based on gradient boosting machines for 3D-HEVC inter-coding

MULTIMEDIA TOOLS AND APPLICATIONS 2023年第21期82卷 32539-32557页

作者： Bakkouri, Siham Elyousfi, Abderrahmane Sultan Moulay Slimane Univ Higher Sch Technol Beni Mellal 23000 Morocco Ibn Zohr Univ Dept Comp Sci Natl Engn Sch Appl Sci Agadir 80000 Morocco Ibn Zohr Univ Comp Syst & Vis Lab Fac Sci Agadir 80000 Morocco

3D high-efficiency video coding (3D-HEVC) is an extension of the HEVC standard for coding of texture videos and depth maps. 3D-HEVC inherits the same quadtree coding structure as HEVC for both texture and depth components, in which the coding units (CUs) are recursively conducted on different sizes, namely, depth levels. However, the recursive splitting process of the CU causes extensive computational complexity. To reduce this computational burden, this paper presents an adaptive CU size decision algorithm for texture videos and depth maps. The proposed algorithm is divided into three steps. In the first step, the average local variance (ALV) is extracted from each CU size to define their homogeneity. Then, a classification-based gradient boosting machines (GBM) is employed to analyze and build a binary classification model from the extracted ALV features. The GBM model is employed to extract and efficiently get suitable thresholds for texture and depth map CUs. In the last step, a fast CU size decision algorithm is performed based on adaptive thresholds for texture videos and depth maps. The experimental results show that the proposed algorithm reduces a significant amount of encoding time, while the loss in coding efficiency is negligible.

关键词： 3D-HEVC Machine learning Gradient boosting machines CU size inter-coding

来源：评论

学校读者我要写书评

暂无评论

Fast SHVC inter-coding based on Bayesian decision with coding depth estimation

引用

JOURNAL OF REAL-TIME IMAGE PROCESSING 2021年第6期18卷 2269-2285页

作者： Lu, Yu Huang, Xudong Liu, Huaping Yin, Haibing Shen, Liquan Hangzhou Dianzi Univ Sch Commun Engn Hangzhou 310018 Peoples R China Oregon State Univ Sch Elect Engn & Comp Sci Corvallis OR 97331 USA Shanghai Univ Sch Commun & Informat Engn Shanghai 200444 Peoples R China

The scalable extension of the high efficiency video coding standard named SHVC supports flexible access for various terminals in heterogeneous networks. However, it is difficult to use in real-time scenarios because of the high complexity of the hierarchical coding structure. In this paper, a novel method for SHVC inter-coding is proposed to reduce the coding complexity in a manner that is compatible with quality scalability and spatial scalability. First, the depth range of the coding tree units is estimated from a reference table generated from a statistical probability distribution based on the correlation between the current coding unit (CU) and its adjacent CUs. Within this depth range, a fast CU partitioning method based on Bayesian minimum risk and a fast prediction unit (PU) selection method based on Bayesian maximum probability are adopted to improve time efficiency. Three different methods, namely, histogram estimation, Gaussian modelling and neighbouring prediction, are used to calculate the conditional probabilities for discrete or continuous features in the Bayesian decision method. The significant advantage of the proposed method is that the time savings in the enhancement layer for each sequence exceeds 60% with negligible quality loss.

关键词： SHVC inter-coding Fast coding Depth range Bayesian decision

来源：评论

学校读者我要写书评

暂无评论

Machine learning-based fast CU size decision algorithm for 3D-HEVC inter-coding

引用

JOURNAL OF REAL-TIME IMAGE PROCESSING 2021年第3期18卷 983-995页

作者： Bakkouri, Siham Elyousfi, Abderrahmane Ibn Zohr Univ Agadir Morocco Fac Sci Comp Syst & Vis Lab Agadir Morocco Ibn Zohr Univ Agadir Morocco Natl Engn Sch Appl Sci Dept Comp Sci Agadir Morocco

3D-high efficiency video coding (3D-HEVC) is an extension of the high efficiency video coding (HEVC) standard for the compression of the texture videos and depth maps. In 3D-HEVC inter-coding, the coding unit (CU) is recursively performed on variable sizes, namely, depth levels. The CU size decision process is conducted using all the possible depth levels to obtain the one with the least rate-distortion (RD) cost using the Lagrange multiplier. These tools achieve the highest coding efficiency but incur a very high computational complexity. In this paper, a fast CU size decision algorithm is proposed to reduce the complexity caused by the CU size splitting process. The proposed algorithm is based on the CU homogeneity classification using machine learning technology. First, the tensor feature is extracted to characterize the homogeneity of CU, which has a strong relationship with CU sizes. Then, a boosted decision stump algorithm is employed to analyze and construct a binary classification model from the extracted features and find suitable thresholds for the proposed method. Finally, an efficient early termination of CU splitting is released based on adaptive thresholds for texture videos and depth maps. The experimental results show that the proposed algorithm reduces a significant encoding time, while the loss in coding efficiency is negligible.

关键词： 3D-HEVC Machine learning inter-coding CU size Binary classification AdaBoost

来源：评论

学校读者我要写书评

暂无评论

Bayesian-theory-based Fast CU Size and Mode Decision Algorithm for 3D-HEVC Depth Video inter-coding

引用

KSII TRANSACTIONS ON interNET AND INFORMATION SYSTEMS 2018年第4期12卷 1730-1747页

作者： Chen, Fen Liu, Sheng Peng, Zongju Hu, Qingqing Jiang, Gangyi Yu, Mei Ningbo Univ Fac Informat Sci & Engn Ningbo 315211 Zhejiang Peoples R China

Multi-view video plus depth (MVD) is a mainstream format of 3D scene representation in free viewpoint video systems. The advanced 3D extension of the high efficiency video coding (3D-HEVC) standard introduces new prediction tools to improve the coding performance of depth video. However, the depth video in 3D-HEVC is time consuming. To reduce the complexity of the depth video inter coding, we propose a fast coding unit (CU) size and mode decision algorithm. First, an off-line trained Bayesian model is built which the feature vector contains the depth levels of the corresponding spatial, temporal, and inter-component (texture-depth) neighboring largest CUs (LCUs). Then, the model is used to predict the depth level of the current LCU, and terminate the CU recursive splitting process. Finally, the CU mode search process is early terminated by making use of the mode correlation of spatial, inter-component (texture-depth), and inter-view neighboring CUs. Compared to the 3D-HEVC reference software HTM-10.0, the proposed algorithm reduces the encoding time of depth video and the total encoding time by 65.03% and 41.04% on average, respectively, with negligible quality degradation of the synthesized virtual view.

关键词： Depth video coding 3D-HEVC inter-coding CU size decision Mode decision

来源：评论

学校读者我要写书评

暂无评论

Shuffled Discrete Sine Transform in inter-Prediction coding

引用

ETRI JOURNAL 2017年第5期39卷 672-682页

作者： Choi, Jun-woo Kim, Nam-Uk Lim, Sung-Chang Kang, Jungwon Kim, Hui Yong Lee, Yung-Lyul Sejong Univ Comp Engn Dept Seoul South Korea ETRI Broadcasting Media Res Lab Daejeon South Korea

Video compression exploits statistical, spatial, and temporal redundancy, as well as transform and quantization. In particular, the transform in a frequency domain plays a major role in energy compaction of spatial domain data into frequency domain data. The high efficient video coding standard uses the type-II discrete cosine transform (DCT-II) and type-VII discrete sine transform (DST-VII) to improve the coding efficiency of residual data. However, the DST-VII is applied only to the Intra 4 x 4 residual block because it yields relatively small gains in the larger block than in the 4 x 4 block. In this study, after rearranging the data of the residual block, we apply the DST-VII to the inter-residual block to achieve coding gain. The rearrangement of the residual block data is similar to the arrangement of the basis vector with a the lowest frequency component of the DST-VII. Experimental results show that the proposed method reduces the luma-chroma (Cb+Cr) BD rates by approximately 0.23% to 0.22%, 0.44% to 0.58%, and 0.46% to 0.65% for the random access, low delay B, and low delay P configurations, respectively.

关键词： HEVC/H.265 inter-coding Residual data DST-VII DCT-II

来源：评论

学校读者我要写书评

暂无评论

inter-Frame Skip coding Mode For Point Cloud Geometry Compression in Solid G-PCC

Inter-Frame Skip Coding Mode For Point Cloud Geometry Compre...

引用

2025 IEEE international Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

作者： Huang, Ren Zhang, Wei Han, Jiangxue Yang, Fuzheng School of Telecommunications Engineering Xidian University Xi'an China Pengcheng Laboratory Shenzhen China

ISBN: (纸本)9798350368741

The rapid advancement of 3D sensing and rendering technologies has expanded the use of point clouds across various fields. To address the challenge of managing large point clouds, Point Cloud Compression (PCC) has gained significant research interest in recent years. The Moving Picture Experts Group (MPEG) has played an important role in standardizing PCC by designing frameworks such as Solid Geometry-based Point Cloud Compression (Solid G-PCC), using octree and Trisoup to compress the geometry of dynamic point clouds. This paper introduces a skip coding mode to improve the inter-frame coding by early terminating octree and Trisoup node generation based on a rate-distortion optimization consideration. Experiments show a 6.3% average improvement in geometry coding efficiency. © 2025 IEEE.

关键词： geometry compression inter-coding octree Point cloud Trisoup

来源：评论

学校读者我要写书评

暂无评论

Predictive Generalized Graph Fourier Transform for Attribute Compression of Dynamic Point Clouds

引用

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 2021年第5期31卷 1968-1982页

作者： Xu, Yiqun Hu, Wei Wang, Shanshe Zhang, Xinfeng Wang, Shiqi Ma, Siwei Guo, Zongming Gao, Wen Chinese Acad Sci Inst Comp Technol Beijing 100190 Peoples R China Univ Chinese Acad Sci Sch Comp Sci & Technol Beijing 100049 Peoples R China Peking Univ Natl Engn Lab Video Technol Beijing 100871 Peoples R China Peking Univ Wangxuan Inst Comp Technol Beijing 100871 Peoples R China City Univ Hong Kong Dept Comp Sci Hong Kong Peoples R China

As 3D scanning devices and depth sensors advance, dynamic point clouds have attracted increasing attention as a format for 3D objects in motion, with applications in various fields such as immersive telepresence, navigation for autonomous driving and gaming. Nevertheless, the tremendous amount of data in dynamic point clouds significantly burden transmission and storage. To this end, we propose a complete compression framework for attributes of 3D dynamic point clouds, focusing on optimal inter-coding. Firstly, we derive the optimal inter-prediction and predictive transform coding assuming the Gaussian Markov Random Field model with respect to a spatio-temporal graph underlying the attributes of dynamic point clouds. The optimal predictive transform proves to be the Generalized Graph Fourier Transform in terms of spatio-temporal decorrelation. Secondly, we propose refined motion estimation via efficient registration prior to inter-prediction, which searches the temporal correspondence between adjacent frames of irregular point clouds. Finally, we present a complete framework based on the optimal inter-coding and our previously proposed intra-coding, where we determine the optimal coding mode from rate-distortion optimization with the proposed offline-trained lambda-Q model. Experimental results show that we achieve around 17% bit rate reduction on average over competitive dynamic point cloud compression methods.

关键词： Dynamic point clouds attribute coding inter-coding generalized graph Fourier transform

来源：评论

学校读者我要写书评

暂无评论

An Advanced LiDAR Point Cloud Sequence coding Scheme for Autonomous Driving 20

An Advanced LiDAR Point Cloud Sequence Coding Scheme for Aut...

引用

28th ACM international Conference on Multimedia (MM)

作者： Sun, Xuebin Wang, Sukai Wang, Miaohui Cheng, Shing Shin Liu, Ming Shenzhen Univ Coll Elect & Informat Engn Shenzhen Peoples R China Hong Kong Univ Sci & Technol Dept Elect & Comp Engn Hong Kong Peoples R China Chinese Univ Hong Kong Stone Robot Inst Shatin Hong Kong Peoples R China Chinese Univ Hong Kong Dept Mech & Automat Engn Shatin Hong Kong Peoples R China Guangdong Key Lab Intelligent Informat Proc Shenzhen Peoples R China Shenzhen Inst Artificial Intelligence & Robot Soc Shenzhen Peoples R China

ISBN: (纸本)9781450379885

Due to the huge volume of point cloud data, storing or transmitting it is currently difficult and expensive in autonomous driving. Learning from the high efficiency video coding (HEVC) coding framework, we propose an advanced coding scheme for large-scale LiDAR point cloud sequences, in which several techniques have been developed to remove the spatial and temporal redundancy. The proposed strategy consists mainly of intra-coding and inter-coding. For intra-coding, we utilize a cluster-based prediction method to remove the spatial redundancy. For inter-coding, a predictive recurrent network is designed, which is capable of generating future frames according to the previously encoded frames. By calculating the residual error between the predicted and real point cloud data, the temporal redundancy can be removed. Finally, the residual data is quantized and encoded by lossless coding schemes. Experiments are conducted on the KITTI data set with four different scenes to verify the effectiveness and efficiency of the proposed method. Our approach can deal with multiple types of point cloud data from the simple to more complex, and yields better performance in terms of compression ratio compared with octree, Google Draco, MPEG TMC13 and other recently proposed methods.

关键词： LiDAR Point CLoud Sequence Intra-coding inter-coding Predictive Recurrent Network

来源：评论

学校读者我要写书评

暂无评论

Patch-Based Conditional Context coding of Stereo Disparity Images

引用

IEEE SIGNAL PROCESSING LETTERS 2014年第10期21卷 1220-1224页

作者： Tabus, Ioan Tampere Univ Technol Dept Signal Proc FIN-33101 Tampere Finland

This letter proposes a method for lossless coding the left disparity image, L, from a stereo disparity image pair (L, R), conditional on the right disparity image, R, by keeping track of the transformation of the constant patches from R to L. The disparities in R are used for predicting the disparities in L, and the locations of the pixels where the prediction is erroneous are encoded in a first stage, conditional on the patch-labels of R image, allowing the decoder to already reconstruct with certainty some elements of the L image, e.g., the disparity values at certain pixels and parts of the contours of left image patches. Second, the contours of the patches in L image that are still unknown after first stage are conditionally encoded using a mixed conditioning context: the usual causal current context from the contours of L and a noncausal context extracted from the contours in the correctly estimated part of L obtained in the first stage. The depth values in the patches of L image are finally encoded, if they are not already known from the prediction stage. The new algorithm, dubbed conditional crack-edge region value (C-CERV), is shown to perform significantly better than the non-conditional coding method CERV and than another existing conditional coding method, over the Middlebury corpus. C-CERV is shown to reach lossless compression ratios of 100-250 times for those images that have a high precision of the disparity map.

关键词： Arithmetic coding context tree coding inter-coding lossless disparity image compression

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：