Search results - Inner Mongolia University Library

Low-Complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, vol. 27, no. 5, pp. 1066-1076

Authors: Oliveira, Paulo A. M.; Cintra, Renato J.; Bayer, Fabio M.; Kulasekera, Sunera; Madanayake, Arjuna. Affiliations: Univ Fed Pernambuco Dept Estat Signal Proc Grp BR-50670901 Recife PE Brazil; Univ Erlangen Nurnberg Dept Multimedia Commun & Signal Proc D-91054 Erlangen Germany; Univ Rennes Inst Rech Informat & Syst Aleatoires Inst Res Comp Sci & Automat Equipe Cairn F-35000 Rennes France; Inst Natl Sci Appl Lab Informat Images & Syst Informat F-69365 Lyon France; Univ Fed Santa Maria Dept Estat BR-97105900 Santa Maria RS Brazil; Univ Fed Santa Maria Lab Ciencias Espaciais Santa Maria LACESM BR-97105900 Santa Maria RS Brazil; Univ Akron Dept Elect & Comp Engn Akron OH 44325 USA

Linear transformations are highly relevant to data decorrelation applications such as image and video compression. In that sense, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT transform kernel does not depend on the input data, and fast algorithms can be developed for real-time applications. However, the DTT fast algorithm presented in the literature possesses high computational complexity. In this paper, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards show good performance of the proposed transform. Regarding hardware resource consumption, the FPGA realization shows a 43.1% reduction in configurable logic blocks, and the ASIC place-and-route realization shows a 57.7% reduction in the area-time figure compared with the 2D version of the exact DTT.

Keywords: Approximate transforms; discrete Tchebichef transform (DTT); fast algorithms; image and video coding
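
As a rough illustration of the exact transform that the abstract above approximates, the sketch below builds an orthonormal 8-point DTT matrix by Gram-Schmidt orthonormalisation of the monomials on the grid 0..7. This is not the low-complexity approximation proposed in the paper (its matrix is not given in this listing); the names `dtt_matrix`, `forward`, and `inverse` and the sample `signal` values are hypothetical.

```python
import math


def dtt_matrix(n: int) -> list[list[float]]:
    """Orthonormal discrete Tchebichef basis on the grid 0..n-1.

    Row k is the degree-k polynomial obtained by Gram-Schmidt
    orthonormalisation of the monomials 1, x, x^2, ...
    """
    rows: list[list[float]] = []
    for k in range(n):
        # Centring the grid improves conditioning without changing the result.
        v = [(x - (n - 1) / 2) ** k for x in range(n)]
        for q in rows:
            # Remove the component along each lower-degree row.
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows


def forward(t: list[list[float]], x: list[float]) -> list[float]:
    """Transform coefficients y = T x."""
    return [sum(a * b for a, b in zip(row, x)) for row in t]


def inverse(t: list[list[float]], y: list[float]) -> list[float]:
    """Reconstruction x = T^T y (valid because T is orthonormal)."""
    n = len(t)
    return [sum(t[k][i] * y[k] for k in range(n)) for i in range(n)]


T = dtt_matrix(8)
signal = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # hypothetical samples
coeffs = forward(T, signal)
restored = inverse(T, coeffs)
```

The entries of this exact matrix are irrational, so applying it requires multiplications; a multiplication-free approximation, as described in the abstract, would replace them with values that need only additions and bit shifts.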

A Class of Low-Complexity DCT-Like Transforms for Image and Video Coding

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, vol. 32, no. 7, pp. 4364-4375

Authors: da Silveira, Thiago L. T.; Canterle, Diego Ramos; Coelho, Diego F. G.; Coutinho, Vitor A.; Bayer, Fabio M.; Cintra, Renato J. Affiliations: Univ Fed Rio Grande do Sul UFRGS Inst Informática BR-90040060 Porto Alegre RS Brazil; Univ Sao Paulo Inst Math & Stat BR-05508090 Sao Paulo Brazil; Recife Ctr Adv Studies & Syst CESAR BR-50030390 Recife PE Brazil; Univ Fed Santa Maria UFSM Dept Estat BR-97105900 Santa Maria RS Brazil; Univ Fed Santa Maria UFSM LACESM BR-97105900 Santa Maria RS Brazil; Univ Fed Pernambuco UFPE Dept Estat Signal Proc Grp BR-50670901 Recife PE Brazil

The discrete cosine transform (DCT) is a relevant tool in signal processing applications, mainly known for its good decorrelation properties. Current image and video coding standards, such as JPEG and HEVC, adopt the DCT as a fundamental building block for compression. Recent works have introduced low-complexity approximations for the DCT, which become paramount in applications demanding real-time computation and low power consumption. The design of DCT approximations involves a trade-off between computational complexity and performance. This paper introduces a new multiparametric transform class encompassing the round-off DCT (RDCT) and the modified RDCT (MRDCT), two relevant multiplierless 8-point approximate DCTs. The associated fast algorithm is provided. Four novel orthogonal low-complexity 8-point DCT approximations are obtained by solving a multicriteria optimization problem. The optimal 8-point transforms are scaled to lengths 16 and 32 while keeping the arithmetic complexity low. The proposed methods are assessed by proximity and coding measures with respect to the exact DCT. Image and video coding experiments and hardware realization are performed. The novel transforms perform close to or outperform the current state-of-the-art DCT approximations.

Keywords: Transforms; Discrete cosine transforms; Complexity theory; Video coding; Arithmetic; Hardware; Encoding; DCT approximation; low-complexity transforms; image and video coding
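
To make the idea of a multiplierless DCT approximation in the abstract above concrete, the sketch below rounds twice the exact 8-point DCT-II matrix to the nearest integer. It assumes that this round-off construction corresponds to the RDCT named in the abstract; the listing itself does not give the matrix, and the four new transforms proposed in the paper are not reproduced here. The names `dct_matrix` and `rounded_dct` are hypothetical.

```python
import math

N = 8


def dct_matrix(n: int = N) -> list[list[float]]:
    """Exact orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append(
            [scale * math.cos((2 * i + 1) * k * math.pi / (2 * n)) for i in range(n)]
        )
    return rows


def rounded_dct(n: int = N) -> list[list[int]]:
    """Elementwise round(2 * C): an assumed round-off approximation."""
    return [[round(2 * c) for c in row] for row in dct_matrix(n)]


T = rounded_dct()

# Gram matrix T T^T: a diagonal result means the rows are mutually orthogonal,
# so an orthonormal transform follows from a per-row scaling.
gram = [[sum(a * b for a, b in zip(r, s)) for s in T] for r in T]
```

For length 8, every entry of `T` is -1, 0, or 1, so applying it needs only additions and subtractions; the per-row scaling can typically be merged into the quantization step of a codec.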

Just Recognizable Distortion for Machine Vision Oriented Image and Video Coding

INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, vol. 129, no. 10, pp. 2889-2906

Authors: Zhang, Qi; Wang, Shanshe; Zhang, Xinfeng; Ma, Siwei; Gao, Wen. Affiliations: Peking Univ Natl Engn Lab Video Technol Beijing Peoples R China; Univ Chinese Acad Sci Sch Comp Sci & Technol Beijing Peoples R China; Peng Cheng Lab Shenzhen Guangdong Peoples R China

Machine visual intelligence has exploded in recent years. Large-scale, high-quality image and video datasets significantly empower learning-based machine vision models, especially deep-learning models. However, images and videos are usually compressed before being analyzed in practical situations where transmission or storage is limited, leading to a noticeable performance loss of vision models. In this work, we broadly investigate the impact on the performance of machine vision from image and video coding. Based on the investigation, we propose Just Recognizable Distortion (JRD) to present the maximum distortion caused by data compression that will reduce the machine vision model performance to an unacceptable level. A large-scale JRD-annotated dataset containing over 340,000 images is built for various machine vision tasks, where the factors for different JRDs are studied. Furthermore, an ensemble-learning-based framework is established to predict the JRDs for diverse vision tasks under few- and non-reference conditions, which consists of multiple binary classifiers to improve the prediction accuracy. Experiments prove the effectiveness of the proposed JRD-guided image and video coding to significantly improve compression and machine vision performance. Applying predicted JRD is able to achieve remarkably better machine vision task accuracy and save a large number of bits.

Keywords: Image and video coding; Machine vision; Deep learning; Just noticeable distortion

A Discrete Tchebichef Transform Approximation for Image and Video Coding

IEEE SIGNAL PROCESSING LETTERS, 2015, vol. 22, no. 8, pp. 1137-1141

Authors: Oliveira, Paulo A. M.; Cintra, Renato J.; Bayer, Fabio M.; Kulasekera, Sunera; Madanayake, Arjuna. Affiliations: Univ Fed Pernambuco Dept Estat Signal Proc Grp Recife PE Brazil; INSA Lyon LIRIS Lyon France; Univ Fed Santa Maria Dept Estat BR-97119900 Santa Maria RS Brazil; Univ Fed Santa Maria LACESM BR-97119900 Santa Maria RS Brazil; Univ Akron Dept Elect & Comp Engn Akron OH 44325 USA

In this letter, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, a Xilinx Virtex-6 FPGA-based hardware realization shows a 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.

Keywords: Approximate DTT; fast algorithms; image and video coding

Image restoration for frame- and object-based video coding using an adaptive constrained least-squares approach

SIGNAL PROCESSING, 2000, vol. 80, no. 11, pp. 2337-2345

Author: Kaup, A. Affiliation: Siemens Corp Technol Networks & Multimedia Commun D-81730 Munich Germany

Especially at low bit rates, current DCT-based video coding standards suffer from the disadvantage that coarse quantization and a rigid block structure result in noticeable blocking and ringing noise. In this paper, we propose a spatially adaptive method for reducing these coding artifacts based on the principle of constrained least-squares image restoration. A strictly local filter is developed which adapts to the spatial image characteristics as well as to the coding conditions. Due to the small filter kernel, the method can be applied to frame-based as well as object-based video coding and is also suited for quality improvement in arbitrarily shaped MPEG-4 coded video material. The proposal is numerically scalable and yields visually pleasing results for intraframe as well as interframe coded images. This is also reflected by consistent PSNR improvements between 0.2 and 1.3 dB. As a post-processing technique, it is compatible with all existing image and video coding standards. (C) 2000 Elsevier Science B.V. All rights reserved.

Keywords: deblocking; blocking artifacts; ringing; image enhancement; image and video coding; MPEG-4
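
The abstract above reports quality gains as PSNR improvements in dB. As a reminder of how such a figure is usually computed, here is a minimal sketch that assumes 8-bit samples (peak value 255) and flattened pixel lists; the function name `psnr` and all sample values are hypothetical and are not taken from the paper.

```python
import math


def psnr(reference: list[float], test: list[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


original = [100.0, 102.0, 98.0, 101.0]   # hypothetical source pixels
decoded = [101.0, 101.0, 99.0, 100.0]    # hypothetical decoder output
filtered = [100.0, 101.0, 98.0, 100.0]   # hypothetical post-processed output

# Gain attributable to post-processing, in dB.
gain_db = psnr(original, filtered) - psnr(original, decoded)
```

A positive `gain_db` means the post-processed output is closer to the original than the raw decoder output, which is the sense in which an improvement of 0.2 to 1.3 dB would be read.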

Seeking the optimal accuracy-rate equilibrium in face recognition

NEUROCOMPUTING, 2025, vol. 624

Authors: Tian, Yu; Ou, Fu-Zhao; Wang, Shiqi; Chen, Baoliang; Kwong, Sam. Affiliations: City Univ Hong Kong Dept Comp Sci Hong Kong Peoples R China; South China Normal Univ Dept Comp Sci Guangzhou Peoples R China; Lingnan Univ Sch Data Sci Hong Kong Peoples R China

Considerable facial data is often compressed prior to analysis to accommodate limitations in transmission or storage capacities. However, this compression may lead to the loss of crucial identity details, thereby diminishing the effectiveness of facial recognition (FR) systems. In this study, we aim to establish an optimal accuracy-rate equilibrium (ARE) to maximize the compression ratio without substantially compromising the performance of the FR system. We first investigate the effect of image compression in deep FR. Based on the definition of ARE in FR, we propose a method to locate the ARE values of face images using the true accept rate and false accept rate. Subsequently, we develop an ARE prediction method for the FR system (ARE-FR), which automatically infers ARE images of face images. The goal of the proposed ARE-FR is to maximize redundancy removal without impairing robust identity information. Considering that high-level semantic features effectively capture crucial identity information, we force the proposed ARE-FR to focus only on the features in identity-related regions when predicting the ARE images. These features are derived from the interactive relationships between deep and shallow features. Experimental results demonstrate that combining the proposed ARE-FR with the image coding algorithm saves more bits while maintaining the performance of the FR system.

Keywords: Just noticeable distortion; Face recognition; Deep neural network; Image and video coding

Deep linear autoencoder and patch clustering-based unified one-dimensional coding of image and video

JOURNAL OF ELECTRONIC IMAGING, 2017, vol. 26, no. 5, p. 053016

Author: Li, Honggui. Affiliation: Yangzhou Univ Phys Coll Sci & Technol Yangzhou Jiangsu Peoples R China

This paper proposes a unified one-dimensional (1-D) coding framework for image and video, which relies on a deep learning neural network and image patch clustering. First, an improved K-means clustering algorithm for image patches is employed to obtain compact inputs for the deep artificial neural network. Second, to best reconstruct the original image patches, a deep linear autoencoder (DLA), a linear version of the classical deep nonlinear autoencoder, is introduced to achieve the 1-D representation of image blocks. Under the 1-D representation, the DLA is capable of attaining zero reconstruction error, which is impossible for the classical nonlinear dimensionality reduction methods. Third, a unified 1-D coding infrastructure for image, intraframe, interframe, multiview video, three-dimensional (3-D) video, and multiview 3-D video is built by incorporating different categories of videos into the inputs of the patch clustering algorithm. Finally, simulation experiments show that the proposed methods simultaneously achieve a higher compression ratio and peak signal-to-noise ratio than state-of-the-art methods in low-bitrate transmission. (C) 2017 SPIE and IS&T

Keywords: deep linear autoencoder; nonlinear dimensionality reduction; image and video coding; patch clustering; united one-dimensional coding

Statistical motion learning for improved transform domain Wyner-Ziv video coding

IET IMAGE PROCESSING, 2010, vol. 4, no. 1, pp. 28-41

Authors: Martins, R.; Brites, C.; Ascenso, J.; Pereira, F. Affiliations: Univ Tecn Lisboa Inst Telecomunicacoes Inst Super Tecn P-1049001 Lisbon Portugal; Inst Super Engn Lisboa Inst Telecomunicacoes P-1959007 Lisbon Portugal

Wyner-Ziv (WZ) video coding is a particular case of distributed video coding (DVC), the recent video coding paradigm based on the Slepian-Wolf and Wyner-Ziv theorems, which exploits the source temporal correlation at the decoder and not at the encoder as in predictive video coding. Although some progress has been made in recent years, WZ video coding is still far from the compression performance of predictive video coding, especially for high and complex motion contents. The WZ video codec adopted in this study is based on a transform domain WZ video coding architecture with feedback channel-driven rate control, whose modules have been improved with some recent coding tools. This study proposes a novel motion learning approach to successively improve the rate-distortion (RD) performance of the WZ video codec as the decoding proceeds, making use of the already decoded transform bands to improve the decoding process for the remaining transform bands. The results obtained reveal gains of up to 2.3 dB in the RD curves against the performance of the same codec without the proposed motion learning approach for high motion sequences and long group of pictures (GOP) sizes.

Keywords: improved transform domain Wyner-Ziv video coding; Wyner-Ziv theorem; discrete cosine transforms; Slepian-Wolf theorem; rate-distortion performance; image motion analysis; motion sequences; Computer vision and image processing techniques; image and video coding; statistical motion learning; group of pictures sizes; video coding; image sequences; video signal processing; Integral transforms in numerical analysis; rate distortion theory; discrete cosine transform; WZ video codec; feedback channel-driven rate control

Residue-free video coding with pixelwise adaptive spatio-temporal prediction

IET IMAGE PROCESSING, 2008, vol. 2, no. 3, pp. 131-138

Authors: Day, M. G.; Robinson, J. A. Affiliation: Univ York Dept Elect Visual Syst Lab York YO10 5DD N Yorkshire England

The authors introduce residue-free video coding, in which motion-compensated predictions from surrounding frames and spatial predictions from the current frame are combined adaptively on a pixel-by-pixel basis. The consequence is that residue frames, blocks or regions are never explicitly formed. The authors describe a practical embodiment of a residue-free coder (temporal prediction trees) in which the local adaptation is conditioned frame to frame by a control parameter derived from global motion statistics. Using fixed-block-size motion compensation, the resulting coder is competitive with conventional residue-based compression, and at higher data rates is able to outperform H.264/AVC for high-activity sequences.

Keywords: spatiotemporal phenomena; global motion statistics; video sequence; video coding; statistical analysis; image sequences; H.264/AVC; Computer vision and image processing techniques; image and video coding; motion compensation; fixed-block-size motion-compensation; pixelwise adaptive spatio-temporal prediction; residue-free video coding; Combinatorial mathematics; residue-free coder-temporal prediction tree; Other topics in statistics; trees (mathematics); video signal processing

General framework of the construction of biorthogonal wavelets based on Bernstein bases: theory analysis and application in image compression

IET COMPUTER VISION, 2011, vol. 5, no. 1, pp. 50-67

Authors: Yang, X.; Shi, Y.; Yang, B. Affiliation: Beihang Univ Key Lab Math Informat & Behav Semant Minist Educ Sch Math & Syst Sci Beijing 100191 Peoples R China

The authors present a general framework for the construction of biorthogonal wavelets based on Bernstein bases, along with theory analysis and application. The presented framework possesses the largest possible regularity, the required vanishing moments and the passband flatness of the frequency response of the filters. Based on this concept, the authors establish explicit formulas for filters of biorthogonal wavelets with arbitrary odd lengths. Meanwhile, a new family of parametric filters with symmetry is constructed. The choice of filter bank in wavelet compression is a critical issue that affects image quality. In this study, an optimal model of finite impulse response (FIR) filters aimed at image compression is put forward, and the optimal FIR filters are obtained through sequential quadratic programming (SQP) and a genetic algorithm (GA). The authors demonstrate the performance of the new family of filters given in this study for image compression with very encouraging results.

Keywords: Optimisation techniques; theory analysis; image quality; wavelet transforms; biorthogonal wavelet construction; genetic algorithm; Bernstein base; image and video coding; Integral transforms; band-pass filters; genetic algorithms; FIR filters; passband flatness; filter bank; frequency response; quadratic programming; wavelet compression; Computer vision and image processing techniques; Filtering methods in signal processing; image coding; sequential quadratic programming; FIR filter; filter frequency response; image compression
