Search results - Inner Mongolia University Library

Understanding perceptual distortion in MPEG scalable audio coding

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, Vol. 13, No. 3, pp. 422-431

Author: Creusere, CD (New Mexico State Univ, Klipsch Sch Elect & Comp Engn, Las Cruces, NM 88003, USA)

In this paper, we study coding artifacts in MPEG-compressed scalable audio. Specifically, we consider the MPEG advanced audio coder (AAC) using bit slice scalable arithmetic coding (BSAC) as implemented in the MPEG-4 reference software. First we perform human subjective testing using the comparison category rating (CCR) approach, quantitatively comparing the performance of scalable BSAC with the nonscaled TwinVQ and AAC algorithms. This testing indicates that scalable BSAC performs very poorly relative to TwinVQ at the lowest bitrate considered (16 kb/s) largely because of an annoying and seemingly random mid-range tonal signal that is superimposed onto the desired output. In order to better understand and quantify the distortion introduced into compressed audio at low bit rates, we apply two analysis techniques: Reng bifrequency probing and time-frequency decomposition. Using Reng probing, we conclude that aliasing is most likely not the cause of the annoying tonal signal; instead, time-frequency or spectrogram analysis indicates that its cause is most likely suboptimal bit allocation. Finally, we describe the energy equalization quality metric (EEQM) for predicting the relative perceptual performance of the different coding algorithms and compare its predictive ability with that of ITU Recommendation ITU-R BS.1387-1.

Keywords: audio analysis; audio coding; audio quality metrics; objective quality assessment; perceptual distortion; scalable coding


Fixed Quality Layered Audio Based on Scalable Lossless Coding

IEEE TRANSACTIONS ON MULTIMEDIA, 2009, Vol. 11, No. 3, pp. 422-432

Authors: Li, Te; Rahardja, Susanto; Koh, Soo Ngee (Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore)

The paper addresses a bitstream scalable coder based on the MPEG-4 scalable lossless (SLS) coding system where, in contrast to SLS, the bitrate of the enhancement layer is not fixed but instead an attempt is made to create a quality-fixed enhancement layer. With a PCM audio input, the proposed structure is able to produce an audio version with near-transparent quality on top of the existing low-quality version. In particular, the proposed fixed quality enhancing process with checking procedures is able to provide the minimum amount of enhancement for the low-quality version to obtain a near-transparent quality that is almost indistinguishable from CD quality. In addition, a bitrate estimation model is proposed. The model enables the direct estimation of the enhancing bitrate from two parameters extracted from the encoding process of the low-quality version. Evaluation results indicate that a better-defined quality level is guaranteed compared to a fixed bitrate setting and that, on average, a lower bitrate (by approximately 20%) is attained. It is also shown that the proposed estimation model is able to accurately predict the necessary enhancing bitrate and, at the same time, reduce the complexity by around 17%.

Keywords: Audio coding; scalable coding; transparent quality


Learned scalable video coding for humans and machines

EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2024, Vol. 2024, No. 1, p. 41

Authors: Hadizadeh, Hadi; Bajic, Ivan V. (Simon Fraser Univ, Sch Engn Sci, 8888 Univ Dr, Burnaby, BC V5A 1S6, Canada)

Video coding has traditionally been developed to support services such as video streaming, videoconferencing, digital TV, and so on. The main intent was to enable human viewing of the encoded content. However, with the advances in deep neural networks (DNNs), encoded video is increasingly being used for automatic video analytics performed by machines. In applications such as automatic traffic monitoring, analytics such as vehicle detection, tracking and counting, would run continuously, while human viewing could be required occasionally to review potential incidents. To support such applications, a new paradigm for video coding is needed that will facilitate efficient representation and compression of video for both machine and human use in a scalable manner. In this manuscript, we introduce an end-to-end learnable video codec that supports a machine vision task in its base layer, while its enhancement layer, together with the base layer, supports input reconstruction for human viewing. The proposed system is constructed based on the concept of conditional coding to achieve better compression gains. Comprehensive experimental evaluations conducted on four standard video datasets demonstrate that our framework outperforms both state-of-the-art learned and conventional video codecs in its base layer, while maintaining comparable performance on the human vision task in its enhancement layer.

Keywords: Video compression; Video analytics; scalable coding; Deep learning; coding for machines


Parallel TCP and scalable video coding for jitter free video transmission over MIMO wireless networks

TELECOMMUNICATION SYSTEMS, 2016, Vol. 61, No. 4, pp. 733-753

Authors: Chaurasia, Avinash Kumar; Jagannatham, Aditya K. (IIT Kanpur, MWN Lab, Kanpur, Uttar Pradesh, India; IIT Kanpur, ACES 205D, Dept Elect Engn, Kanpur, Uttar Pradesh, India)

There is a significant rise in demand for video transmission over 3G and 4G wireless networks due to the rising popularity of video streaming websites such as YouTube. The market for video streaming over wireless networks is expected to increase sharply in the future. Neither of the two basic transport layer protocols is suited, without modification, for video transmission over wireless networks. UDP (user datagram protocol) suffers from inherent unreliability, resulting in corrupted video due to frequent corruption of packets. Inherent features of wireless networks such as noise and interference result in packet corruption. On the other hand, the performance of TCP (transmission control protocol) is worse than that of UDP (Thangaraj et al. in Telecommun Syst 45(4):303-312, 2010) because of frequently corrupted packets. Due to its reliable data transfer feature, TCP continuously retransmits the corrupted packet until successful reception at the receiver. This leads to jitter in video playback and poor end user quality of experience. Multiple TCP connections with appropriate optimization can lead to an increased efficiency of bandwidth utilization in comparison to single TCP based video transmission over wireless networks. It has been shown that multiple TCP connections enhance the video transmission and playback experience by providing reliable communication. The parallel TCP scheme proposed in this paper enhances the quality of video transmission and playback experience over MIMO wireless networks employing scalable hierarchical wavelet decomposition based video encoding with multiple TCP connections.

Keywords: MIMO; Parallel TCP; Video transmission; scalable coding; Wireless networks


Analysis of Prediction Algorithms for Residual Compression in a Lossy to Lossless Scalable Video Coding System Based on HEVC

Conference on Applications of Digital Image Processing XXXVII

Authors: Heindel, Andreas; Wige, Eugen; Kaup, Andre (FAU, D-91058 Erlangen, Germany)

ISBN: (print) 9781628412444

Lossless image and video compression is required in many professional applications. However, lossless coding results in a high data rate, which leads to a long wait for the user when the channel capacity is limited. To overcome this problem, scalable lossless coding is an elegant solution. It provides a fast accessible preview by a lossy compressed base layer, which can be refined to a lossless output when the enhancement layer is received. Therefore, this paper presents a lossy to lossless scalable coding system where the enhancement layer is coded by means of intra prediction and entropy coding. Several algorithms are evaluated for the prediction step in this paper. It turned out that Sample-based Weighted Prediction is a reasonable choice for usual consumer video sequences and the Median Edge Detection algorithm is better suited for medical content from computed tomography. For both types of sequences the efficiency may be further improved by the much more complex Edge-Directed Prediction algorithm. In the best case, in total only about 2.7% additional data rate has to be invested for scalable coding compared to single-layer JPEG-LS compression for usual consumer video sequences. For the case of the medical sequences scalable coding is even more efficient than JPEG-LS compression for certain values of QP.

Keywords: Enhancement layer; HEVC; lossy to lossless; scalable coding; SELC; SWP
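
The Median Edge Detection (MED) predictor named in the abstract above is the fixed predictor familiar from JPEG-LS. The following is a minimal Python sketch of that generic predictor only, not of the authors' HEVC-based system: the function names, the zero padding at image borders, and the toy sample values are assumptions made for illustration. It shows that a residual computed in raster order can be inverted exactly, which is the property a lossless enhancement layer relies on.

```python
def med_predict(a, b, c):
    """MED prediction from a = left, b = above, c = above-left neighbour."""
    if c >= max(a, b):
        return min(a, b)   # edge detected: fall back to the smaller neighbour
    if c <= min(a, b):
        return max(a, b)   # edge detected: fall back to the larger neighbour
    return a + b - c       # smooth region: planar prediction


def _neighbours(img, y, x):
    # Assumption: samples outside the image are treated as 0.
    a = img[y][x - 1] if x > 0 else 0
    b = img[y - 1][x] if y > 0 else 0
    c = img[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def med_residuals(img):
    """Return the prediction residual for every sample of a 2-D list."""
    h, w = len(img), len(img[0])
    return [[img[y][x] - med_predict(*_neighbours(img, y, x))
             for x in range(w)] for y in range(h)]


def med_reconstruct(res):
    """Invert med_residuals by decoding in the same raster order."""
    h, w = len(res), len(res[0])
    img = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # All three neighbours are already reconstructed at this point.
            img[y][x] = res[y][x] + med_predict(*_neighbours(img, y, x))
    return img
```

In a real codec the residuals would then be entropy coded; that stage is omitted here.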


TOWARDS CODING FOR HUMAN AND MACHINE VISION: A SCALABLE IMAGE CODING APPROACH

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Hu, Yueyu; Yang, Shuai; Yang, Wenhan; Duan, Ling-Yu; Liu, Jiaying (Peking Univ, Beijing, Peoples R China)

ISBN: (print) 9781728113319

The past decades have witnessed the rapid development of image and video coding techniques in the era of big data. However, the signal fidelity-driven coding pipeline design limits the capability of the existing image/video coding frameworks to fulfill the needs of both machine and human vision. In this paper, we come up with a novel image coding framework by leveraging both the compressive and the generative models, to support machine vision and human perception tasks jointly. Given an input image, the feature analysis is first applied, and then the generative model is employed to perform image reconstruction with features and additional reference pixels, in which compact edge maps are extracted in this work to connect both kinds of vision in a scalable way. The compact edge map serves as the basic layer for machine vision tasks, and the reference pixels act as a sort of enhanced layer to guarantee signal fidelity for human vision. By introducing advanced generative models, we train a flexible network to reconstruct images from compact feature representations and the reference pixels. Experimental results demonstrate the superiority of our framework in both human visual quality and facial landmark detection, which provide useful evidence on the emerging standardization efforts on MPEG VCM (Video Coding for Machine).

Keywords: Video coding for Machine; Image coding; scalable coding; Generative Compression


SCALABLE AUDIO CODING USING WATERMARKING

IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Authors: Movassagh, Mahmood; Kabal, Peter (McGill Univ, Dept Elect & Comp Engn, Montreal, PQ, Canada)

ISBN: (print) 9781479900152

A scalable audio coding method is proposed using a technique, Quantization Index Modulation, borrowed from watermarking. Some of the information of each layer output is embedded (watermarked) in the previous layer. This approach leads to a saving in bitrate while keeping the distortion almost unchanged. This makes the scalable coding system more efficient in terms of Rate-Distortion. The results show that the proposed method outperforms the scalable audio coding based on reconstruction error quantization which is used in practical systems such as MPEG-4 AAC.

Keywords: scalable coding; Quantization Index Modulation; Watermarking; Entropy coding
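
For readers unfamiliar with Quantization Index Modulation, the technique named in the abstract above, the following is a minimal Python sketch of generic scalar (dither) QIM: one bit selects between two uniform quantizers whose lattices are offset by half a step. It is not the authors' codec. The function names and the step size `delta` are illustrative assumptions, and how the paper maps layer information onto the embedded bits is not reproduced here.

```python
import math


def qim_embed(x, bit, delta):
    """Quantize x onto the lattice selected by bit (0 or 1).

    Lattice 0 is {k * delta}; lattice 1 is {k * delta + delta / 2}.
    """
    offset = bit * delta / 2.0
    # floor(v + 0.5) rounds to nearest without Python's banker's rounding.
    index = math.floor((x - offset) / delta + 0.5)
    return index * delta + offset


def qim_extract(y, delta):
    """Recover the embedded bit by finding the nearer of the two lattices."""
    distances = [abs(y - qim_embed(y, b, delta)) for b in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1
```

Embedding moves a sample by at most `delta / 2`, and the bit survives any later perturbation smaller than `delta / 4`, so `delta` trades distortion against robustness.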


FRACTIONAL COMPENSATION FOR SPATIAL SCALABLE VIDEO CODING

IEEE International Conference on Multimedia and Expo

Authors: Sun, Xiaoyan; Wu, Feng (Microsoft Research Asia, China)

ISBN: (print) 9781424442904

This paper proposes a novel fractional compensation approach for spatial scalable video coding. It simultaneously exploits inter layer correlation and intra layer correlation by learning-based mapping. Instead of using an enhancement layer reconstruction as an entire reference, a set of reference pairs is generated from high-frequency components of both the base layer and enhancement layer reconstructions of the previous frame. The reference set, which consists of low-resolution and high-resolution patches, can be generated in both encoder and decoder by on-line learning. During the encoding of the enhancement layer, a prediction is first obtained from the base layer, from which low-resolution patches are extracted. These patches are then used as indices to find the matched high-resolution patches from the reference set. Finally, the prediction enhanced by the high-resolution patches is used for coding. The proposed approach does not need any motion bits. With our proposed FC approach, the performance of H.264 SVC can be improved by up to 2.4 dB in spatial scalable coding.

Keywords: video coding; motion estimation; scalable coding; spatial scalability


SEMANTICALLY SCALABLE IMAGE CODING WITH COMPRESSION OF FEATURE MAPS

IEEE International Conference on Image Processing (ICIP)

Authors: Yan, Ning; Liu, Dong; Li, Houqiang; Wu, Feng (Univ Sci & Technol China, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei 230027, Peoples R China)

ISBN: (print) 9781728163956

In this paper, we consider a novel image coding paradigm, termed semantically scalable coding. In the new paradigm, coded bitstream serves for multiple different semantic analysis tasks, and different tasks require different semantic granularities of the image. Thus, the bitstream is designed to be scalable in the sense that progressive decoding of the bitstream provides coarse-to-fine semantic granularities. As a concrete example, we consider the task of coarse-grained and fine-grained image classification. We present a method to compress the multiple deep feature maps that are intermediate representations of an image passing a trained deep network. The deep-layer feature maps can serve for coarse-grained image classification while the shallow-layer feature maps can serve for fine-grained image classification. Experimental results demonstrate the feasibility of the proposed method, as well as the advantage of the semantically scalable coding paradigm.

Keywords: Convolutional neural network; feature map compression; scalable coding; image representation


Packet-Loss Robust Scalable Speech Coding Using the Discrete Wavelet Transform

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Seto, Koji; Ogunfunmi, Tokunbo (Santa Clara Univ, Dept Elect Engn, Santa Clara, CA 95053, USA)

ISBN: (print) 9781479934324

This paper presents a new scalable speech codec for IP networks using the discrete wavelet transform (DWT). A scalable narrowband speech coding scheme based on the internet low bitrate codec (iLBC) was previously presented and achieved speech quality equivalent to G.718 for narrowband signals. Whereas the performance of the core layer was satisfactory, higher speech quality was desired from the enhancement layer, which employed the modified discrete cosine transform (MDCT). We propose the utilization of the DWT instead of the MDCT to encode the core-layer coding error in the enhancement layer. The experimental simulation results show that the DWT is a promising technique to use for encoding highly non-stationary signals such as the coding error.

Keywords: Discrete wavelet transform (DWT); internet low bitrate codec (iLBC); packet loss; scalable coding; speech coding; voice over Internet protocol (VoIP)
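
As a rough illustration of what a DWT does to a short signal such as a coding error, the following minimal Python sketch implements one level of the orthonormal Haar transform and its inverse. The abstract above does not state which wavelet or how many decomposition levels the authors use, so the choice of Haar, the function names, and the even-length input are all assumptions made for illustration.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)


def haar_forward(x):
    """One-level Haar DWT of an even-length sequence.

    Returns (approx, detail): scaled pairwise sums and differences.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    approx = [(x[i] + x[i + 1]) * _SCALE for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * _SCALE for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _SCALE)
        out.append((a - d) * _SCALE)
    return out
```

Because the transform is orthonormal, signal energy is preserved across the two sub-bands, and abrupt local changes show up as a few large detail coefficients.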
