The multiple quantization (MQ) encoder is a basic entropy encoder widely used in multimedia coding systems, especially in JPEG2000 and JBIG2, where it plays a key role in digital imaging. However, because of its strictly serial operation, the MQ encoder has become a key bottleneck restricting the optimization of JPEG2000 encoding performance. Many single-ejection encoder architectures use pipelining to improve throughput, but they have reached their limits. A high-performance MQ encoder based on a novel lookup table is proposed to overcome this bottleneck; it improves throughput while using reasonable hardware resources at low power. Experimental results show that the proposed encoder architecture can encode two symbols per cycle with a throughput of 1204.82 MSymbol/s and a power consumption of only 11.05 mW. Compared with the conventional multi-context coder, hardware utilization is reduced by 26% and hardware utilization efficiency is improved by 16%. Compared with the conventional single-context coder, throughput is improved by 14.3% and power consumption is reduced by 40.1%.
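To make the serial bottleneck concrete, here is a minimal sketch of the per-symbol interval update in an MQ-style coder, written in Python purely for illustration: the probability-state table, context adaptation, bit-stuffing and byte-out are all omitted, and the register model is simplified. Each call needs the A and C registers produced by the previous call, which is exactly the dependency a two-symbol-per-cycle architecture has to work around.

```python
# Simplified MQ-style coding step (illustration only): A is the interval
# register, C the code register, Qe the LPS probability estimate of the
# current context. Probability adaptation and byte-out are omitted.

def renormalize(A, C):
    """Shift A back into [0x8000, 0x10000); real coders also emit bytes here."""
    while not (A & 0x8000):
        A <<= 1
        C <<= 1
    return A, C

def code_symbol(A, C, Qe, is_mps):
    """One serialized coding step; the next symbol needs the returned A and C."""
    A -= Qe
    if is_mps:
        if A & 0x8000:               # MPS, no renormalization needed
            C += Qe
        else:                        # conditional exchange, then renormalize
            if A < Qe:
                A = Qe
            else:
                C += Qe
            A, C = renormalize(A, C)
    else:                            # LPS path (always renormalizes)
        if A < Qe:
            C += Qe
        else:
            A = Qe
        A, C = renormalize(A, C)
    return A, C
```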
MQ arithmetic coding, an adaptive arithmetic coding scheme developed from Q coding, has become a major throughput bottleneck of JPEG2000 compression due to its inherent serial operations. To overcome this bottleneck, this brief proposes a high-performance hardware architecture for a multicontext MQ coder. The proposed architecture is capable of concurrently coding two adjacent more probable symbols (MPSs). Performance analysis shows that the proposed coder consumes 1.61 CXD pairs per cycle and achieves a throughput of 506.93 MSymbols/s at a bit rate of 0.5 bpp. The proposed architecture not only achieves high throughput but also maintains low hardware utilization and low power consumption. Compared with the state-of-the-art two-context coder, the figure of merit (FoM) is increased by 38%. Compared with the single-context coder, the power-delay product (PDP) is reduced by 49%.
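The concurrent coding of two adjacent MPSs can be pictured with a small sketch that reuses code_symbol from the sketch above (the merging condition shown here is an assumption for illustration; the abstract does not spell out the architecture's exact conditions). When neither of the two MPS updates crosses the renormalization threshold, the two serial steps collapse into a single combined update, which is what allows two symbols to be handled per cycle.

```python
def code_two_mps(A, C, Qe):
    """Illustrative merge of two consecutive MPSs coded with the same Qe."""
    if A - 2 * Qe >= 0x8000:          # fast path: no renormalization in either step
        return A - 2 * Qe, C + 2 * Qe
    # slow path: fall back to two serialized steps (renormalization may occur)
    A, C = code_symbol(A, C, Qe, is_mps=True)
    A, C = code_symbol(A, C, Qe, is_mps=True)
    return A, C
```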
ISBN (print): 9798350385885; 9798350385878
This paper presents a two-stage Multiple-Model Compression (MMC) approach for sampled electrical waveforms. To limit latency, the processing is window-based, with a window length commensurate with the electrical period. For each window, the first stage compares several parametric models to obtain a coarse representation of the samples. The second stage then compares different residual compression techniques to minimize the norm of the reconstruction error. The allocation of the rate budget between the two stages is optimized. The proposed MMC approach provides better signal-to-noise ratios than state-of-the-art solutions on periodic and transient waveforms.
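A minimal sketch of the window-based, two-stage idea (all names, the 50 Hz fundamental, the 6.4 kHz sampling rate and the single uniform quantizer standing in for the residual stage are assumptions for illustration; the paper additionally compares several residual coders and optimizes the rate split between the stages).

```python
import numpy as np

def fit_sinusoid(x, f0=50.0, fs=6400.0):
    """Stage-1 candidate: least-squares fit of one sinusoid at the fundamental f0."""
    t = np.arange(len(x)) / fs
    basis = np.column_stack([np.cos(2 * np.pi * f0 * t),
                             np.sin(2 * np.pi * f0 * t),
                             np.ones_like(t)])
    coef, *_ = np.linalg.lstsq(basis, x, rcond=None)
    return basis @ coef

def fit_polynomial(x, degree=3):
    """Stage-1 candidate: low-order polynomial trend."""
    n = np.arange(len(x))
    return np.polyval(np.polyfit(n, x, degree), n)

def quantize_residual(r, n_bits=4):
    """Stage-2 stand-in: uniform scalar quantization of the residual."""
    scale = (np.max(np.abs(r)) + 1e-12) / (2 ** (n_bits - 1))
    return np.round(r / scale) * scale

def compress_window(x):
    """Per window, keep the stage-1 model whose final reconstruction error is smallest."""
    best = None
    for coarse in (fit_sinusoid(x), fit_polynomial(x)):
        rec = coarse + quantize_residual(x - coarse)
        err = np.linalg.norm(x - rec)
        if best is None or err < best[0]:
            best = (err, rec)
    return best[1]
```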
ISBN (digital): 9786165904773
ISBN (print): 9786165904773
Context modeling plays a very important role in data compression. In arithmetic coding, the data to be encoded are usually classified into several contexts according to the characteristics of the causal part. Each context has a frequency table that reflects the probability distribution of the data belonging to that context. In practice, however, the characteristics of the causal part often fall between those of two contexts. For example, suppose the context is assigned according to whether the average gradient of the causal part is greater or smaller than 10. When the average gradient is around 10, the input is not typical of either context, so assigning it to one of them may not achieve the expected coding gain. In this work, we propose the idea of a mixed context: in the intermediate case, the frequency table is constructed from those of the two adjacent contexts. Moreover, after encoding the data, the frequency tables of both contexts are adjusted. Experiments on JPEG DC-term coding and lossless image encoding show that, with the proposed algorithm, the coding efficiency is much improved and the frequency tables converge quickly to the true probability distribution.
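A minimal sketch of the mixed-context idea (the blending weight, the width of the transition band around the threshold of 10, and the update rule are illustrative choices, not the paper's exact formulation): near the context boundary the symbol probabilities are built from both neighboring frequency tables, and both tables are updated after coding.

```python
import numpy as np

# Two adjacent contexts split by whether the average gradient is below or above 10.
# Each frequency table counts occurrences of the 256 possible symbol values.
freq = {0: np.ones(256), 1: np.ones(256)}
THRESHOLD, BAND = 10.0, 2.0            # "around 10" = within +/- BAND of the threshold

def symbol_probabilities(avg_gradient):
    """Blend the two frequency tables in the intermediate case."""
    if avg_gradient <= THRESHOLD - BAND:
        table = freq[0]
    elif avg_gradient >= THRESHOLD + BAND:
        table = freq[1]
    else:                               # mixed context
        w = (avg_gradient - (THRESHOLD - BAND)) / (2 * BAND)
        table = (1 - w) * freq[0] + w * freq[1]
    return table / table.sum()          # distribution handed to the arithmetic coder

def update_tables(avg_gradient, symbol):
    """After encoding a symbol, adjust both adjacent tables in the mixed case."""
    if abs(avg_gradient - THRESHOLD) < BAND:
        freq[0][symbol] += 1
        freq[1][symbol] += 1
    else:
        freq[0 if avg_gradient < THRESHOLD else 1][symbol] += 1
```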
This paper presents the intra prediction and mode coding of the Versatile Video Coding (VVC) standard. This standard was collaboratively developed by the Joint Video Experts Team (JVET). It follows the traditional architecture of a hybrid block-based codec that was also the basis of previous standards. Almost all intra prediction features of VVC either contain substantial modifications compared with its predecessor H.265/HEVC or were newly added. The key aspects of these tools are the following: 65 angular intra prediction modes with block-shape-adaptive directions and 4-tap interpolation filters are supported, as well as the DC and Planar modes; Position Dependent Prediction Combination is applied for most of these modes; Multiple Reference Line Prediction can be used; an intra block can be further subdivided by the Intra Subpartition mode; Matrix-based Intra Prediction is supported; and the chroma prediction signal can be generated by the Cross-Component Linear Model method. Finally, the intra prediction mode in VVC is coded separately for luma and chroma, with a Most Probable Mode list containing six modes applied for luma. The individual compression performance of the tools is reported in this paper. For the full VVC intra codec, an average bitrate saving of 25% over H.265/HEVC is reported using an objective metric. Significant subjective benefits are illustrated with specific examples.
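Of the listed tools, the six-entry Most Probable Mode list lends itself to a small sketch (this is a generic MPM-style scheme assumed for illustration, not the normative VVC list derivation or its binarization): modes that are likely given the left and above neighbors are signaled with a short index, all others with their full mode number.

```python
PLANAR, DC = 0, 1                       # VVC luma intra modes are numbered 0..66

def build_mpm_list(left_mode, above_mode):
    """Toy 6-entry MPM list from the two neighboring luma modes."""
    candidates = [PLANAR, left_mode, above_mode, DC,
                  max(left_mode, above_mode) - 1,
                  max(left_mode, above_mode) + 1]
    mpm = []
    for m in candidates:
        m = max(0, min(66, m))          # clamp to the valid mode range
        if m not in mpm:
            mpm.append(m)
    extra = 2
    while len(mpm) < 6:                 # pad with further angular modes
        if extra not in mpm:
            mpm.append(extra)
        extra += 1
    return mpm

def signal_mode(mode, mpm):
    """Cheap signaling when the mode is in the MPM list, full signaling otherwise."""
    return ("mpm_index", mpm.index(mode)) if mode in mpm else ("remainder", mode)
```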
Light field (LF) technology is considered a promising way to provide high-quality virtual reality (VR) content. However, this imaging technology produces a large amount of data, requiring efficient LF image compression solutions. In this paper, we propose an LF image coding method based on view synthesis and view quality enhancement techniques. Instead of transmitting all the LF views, only a sparse set of reference views is encoded and transmitted, while the remaining views are synthesized at the decoder side. The transmitted views are encoded using the Versatile Video Coding (VVC) standard and are used as reference views to synthesize the dropped views. The selection of non-reference (dropped) views is performed using a rate-distortion optimization based on VVC temporal scalability. The dropped views are reconstructed using the LF dual-discriminator GAN (LF-D2GAN) model. In addition, to ensure that the quality of the views is consistent, a quality enhancement procedure is performed on the reconstructed views at the decoder, allowing smooth navigation across views. Experimental results show that the proposed method provides high coding performance and outperforms state-of-the-art LF image compression methods by -36.22% in terms of BD-BR and 1.35 dB in terms of BD-PSNR. The web page of this work is available at https://***/***.
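As a toy illustration of the sparse-reference idea, the sketch below splits a view grid into transmitted and dropped views by uniform subsampling (the grid size and step are assumptions; the paper instead selects the dropped views with a rate-distortion optimization and reconstructs them with LF-D2GAN).

```python
def split_views(grid_u=9, grid_v=9, step=2):
    """Return (reference views, dropped views) for a grid_u x grid_v LF view grid."""
    all_views = {(u, v) for u in range(grid_u) for v in range(grid_v)}
    refs = {(u, v) for u in range(0, grid_u, step) for v in range(0, grid_v, step)}
    return refs, all_views - refs   # refs go to the VVC encoder, the rest are synthesized
```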
Joint Photographic Experts Group (JPEG) XS is a new International Standard from the JPEG Committee (formally known as ISO/International Electrotechnical Commission (IEC) JTC1/SC29/WG1). It defines an interoperable, visually lossless, low-latency, lightweight image coding system that can be used for mezzanine compression within any AV market. Among the targeted use cases are video transport over professional video links (serial digital interface (SDI), internet protocol (IP), and Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example, in cameras and the automotive industry). The core coding system is composed of an optional color transform, a wavelet transform, and a novel entropy encoder that processes groups of coefficients by coding their magnitude level and packing the magnitude refinement. This design allows for visually transparent quality at moderate compression ratios, scalable end-to-end latency ranging from less than one line to a maximum of 32 lines of the image, and low-complexity real-time implementations on application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), central processing units (CPUs), and graphics processing units (GPUs). This article details the key features of this new standard and the profiles and formats that have been defined so far for the various applications. It also gives a technical description of the core coding system. Finally, the latest performance evaluation results of recent implementations of the standard are presented, followed by the current status of the ongoing standardization process and future milestones.
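The entropy-coding step mentioned above, which groups coefficients, codes a magnitude level per group and packs the remaining magnitude bits, can be illustrated with a small sketch (the group size of 4, the raw bitplane packing and the gtli truncation parameter are assumptions; the actual JPEG XS coding of bitplane counts and significance is more elaborate).

```python
def encode_group(coeffs, gtli=0):
    """Toy magnitude-level coding for one group of 4 quantized wavelet coefficients:
    the magnitude level is the highest significant bitplane of the group, and the
    data bits above the truncation level gtli are packed verbatim, MSB plane first."""
    mags = [abs(c) for c in coeffs]
    level = max(mags).bit_length()                  # bitplanes needed for this group
    bits = []
    for plane in range(level - 1, gtli - 1, -1):
        bits.extend((m >> plane) & 1 for m in mags)
    signs = [1 if c < 0 else 0 for c in coeffs if abs(c) >> gtli]
    return level, bits, signs

# Example: one group of coefficients; level == 4 since max |coeff| = 9 needs 4 bitplanes
level, bits, signs = encode_group([5, -2, 0, 9])
```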
Due to the huge data volume of high-resolution remote sensing imagery (RSI) and the limited transmission bandwidth, RSIs are typically compressed for efficient transmission and storage. However, most existing compression algorithms are developed to optimize for human perception and are not suitable for remote sensing applications, where RSIs are usually used for machine interpretation tasks such as semantic segmentation for ground-object recognition. In this article, we propose an image coding for machines (ICM) paradigm based on contrastive learning in a fully supervised manner to boost semantic segmentation of compressed RSIs. Specifically, we build an end-to-end compression framework that makes full use of global semantic information by clustering intra-category projected embeddings and spacing inter-category embeddings apart, to compensate for the loss of feature discriminability during the compression process and to reconstruct the decision boundaries between different categories. Compared to state-of-the-art image compression methods, the proposed method significantly improves semantic segmentation performance on remote sensing labeling benchmark datasets.
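A minimal sketch of the fully supervised contrastive objective described above, which pulls projected embeddings of the same class together while spacing different classes apart (the temperature, normalization and averaging over positives follow a generic supervised contrastive loss and are assumptions, not the paper's exact formulation).

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic SupCon-style loss: embeddings is N x D, labels holds N class ids.
    Pairs with the same label are positives, all other pairs are negatives."""
    labels = np.asarray(labels)
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature                      # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)                   # exclude self-pairs
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    loss, count = 0.0, 0
    for i in range(len(labels)):
        pos = (labels == labels[i]) & (np.arange(len(labels)) != i)
        if pos.any():                                # average log-likelihood of positives
            loss += -log_prob[i, pos].mean()
            count += 1
    return loss / max(count, 1)
```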
Point clouds are a very rich 3D visual representation model, which has become increasingly appealing for multimedia applications with immersion, interaction and realism requirements. Due to different acquisition and creation conditions as well as target applications, the characteristics of point clouds may be very diverse, notably their density. While geographical information systems or autonomous driving applications may use rather sparse point clouds, cultural heritage or virtual reality applications typically use denser point clouds to represent objects and people more accurately. Naturally, to offer immersion and realism, point clouds need a rather large number of points, thus calling for the development of efficient coding solutions. The use of deep learning models for coding purposes has recently gained relevance, with the latest developments in image coding achieving state-of-the-art performance, making it natural to adopt this technology for point cloud coding as well. This paper presents a novel deep learning-based solution for point cloud geometry coding which is able to efficiently adapt to the content's characteristics. The proposed coding solution divides the point cloud into 3D blocks and selects the most suitable available deep learning coding model to code each block, thus maximizing the compression performance. In comparison with the state-of-the-art MPEG G-PCC Trisoup standard, the proposed coding solution offers average quality gains of up to 4.9 and 5.7 dB for PSNR D1 and PSNR D2, respectively.
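A minimal sketch of the per-block model selection described above (the block size, the Lagrangian rate-distortion cost and the placeholder model interface are assumptions; the actual solution measures geometry distortion with the D1/D2 metrics).

```python
import numpy as np

def split_into_blocks(points, block_size=64.0):
    """Group a point cloud (N x 3 array of coordinates) into cubic 3D blocks."""
    keys = np.floor(points / block_size).astype(int)
    blocks = {}
    for key, p in zip(map(tuple, keys), points):
        blocks.setdefault(key, []).append(p)
    return {k: np.array(v) for k, v in blocks.items()}

def code_point_cloud(points, models, lam=0.1):
    """For every block, pick the coding model with the lowest cost J = D + lambda * R.
    `models` are placeholder objects exposing distortion(), rate() and encode()."""
    coded = {}
    for key, block in split_into_blocks(points).items():
        best = min(models, key=lambda m: m.distortion(block) + lam * m.rate(block))
        coded[key] = best.encode(block)
    return coded
```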
The Joint Video Exploration Team (JVET) recently launched the standardization of the next-generation video coding standard named Versatile Video Coding (VVC), which inherits the technical framework of its predecessor, High Efficiency Video Coding (HEVC). The simplified Enhanced Multiple Transform (EMT) has been adopted as the primary residual coding transform solution, termed Multiple Transform Selection (MTS). In MTS, only the transform set consisting of DST-VII and DCT-VIII remains; the other transform sets and the dependency on intra prediction modes are removed. Significant coding gains are achieved by introducing the new DST/DCT transforms, but the full-matrix implementation is relatively costly compared with partial butterflies in terms of both software run-time and operation counts. In this work, we exploit the inherent features of DST-VII and DCT-VIII. Instead of repeating the element-wise additions and multiplications of the full-matrix operation, these features can be leveraged to derive identical results using only part of the elements. The existing transform matrices are further tuned to utilize these (anti-)symmetric features. A partial butterfly-type fast algorithm with dual-implementation support is proposed for the DST-VII/DCT-VIII transforms in VVC. A complexity analysis, including operation counts and software run-time, is conducted to validate the effectiveness, and the features are proven to hold in theory. The proposed fast methods achieve noticeable software run-time savings without compromising coding performance compared with the VVC Test Model VTM-3.0. It is shown that under the Common Test Condition (CTC) with inter MTS enabled, average decoding time savings of 9%, 0%, and 3% are achieved for All Intra (AI), Random Access (RA), and Low Delay B (LDB), respectively. Under the low-QP test condition with inter MTS enabled, the proposed fast methods achieve 1%, 2%, and 4% decoding time savings on average.
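One symmetry relating the two kernels can be checked numerically with the sketch below (it uses the textbook real-valued DST-VII/DCT-VIII definitions rather than VVC's integer matrices, and it is not necessarily the exact property the proposed fast algorithm exploits): DCT-VIII equals DST-VII with the input order reversed and alternating output signs, so one implementation can serve both transforms.

```python
import numpy as np

def dst7(N):
    """Real-valued DST-VII basis; rows are frequency index k, columns are sample n."""
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return np.sqrt(4.0 / (2 * N + 1)) * np.sin(np.pi * (2 * k + 1) * (n + 1) / (2 * N + 1))

def dct8(N):
    """Real-valued DCT-VIII basis."""
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return np.sqrt(4.0 / (2 * N + 1)) * np.cos(np.pi * (2 * k + 1) * (2 * n + 1) / (2 * (2 * N + 1)))

N = 8
S7, C8 = dst7(N), dct8(N)
signs = (-1.0) ** np.arange(N)
# DCT-VIII[k, n] == (-1)^k * DST-VII[k, N-1-n]: flip the columns, alternate the row signs
assert np.allclose(C8, signs[:, None] * S7[:, ::-1])
```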