Multichannel synthetic aperture radar (MC-SAR) allows for high-resolution imaging of a wide swath (HRWS), at the cost of acquiring and downlinking a significantly larger amount of data, compared with conventional SAR systems. In this letter, we discuss the potential of efficient data volume reduction (DVR) for MC-SAR. Specifically, we focus on methods based on transform coding (TC) and linear predictive coding (LPC), which exploit the redundancy introduced in the raw data by the finer azimuth sampling peculiar to the MC system. The proposed approaches, in combination with a variable-bit quantization, allow for the optimization of the resulting performance and data rate. We consider three exemplary yet realistic MC-SAR systems, and we conduct simulations and analyses on synthetic SAR data considering different radar backscatter distributions, which demonstrate the effectiveness of the proposed methods.
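As an illustration of the prediction-plus-quantization idea discussed in this abstract, the sketch below applies a first-order closed-loop DPCM along azimuth with a uniform quantizer to synthetic complex raw data. The array layout, prediction order, and bit budget are assumptions for illustration only, not the letter's actual DVR pipeline.

```python
import numpy as np

def dpcm_quantize_azimuth(raw, bits):
    """First-order closed-loop DPCM along azimuth with a uniform quantizer.

    raw  : 2-D complex array (azimuth x range) of raw samples for one channel
    bits : per-sample bit budget of the (variable-bit) quantizer
    Returns the reconstruction the decoder would obtain.
    """
    levels = 2 ** bits
    step = 2.0 * np.abs(raw).max() / levels + 1e-12
    quantize = lambda x: (np.round(x.real / step) + 1j * np.round(x.imag / step)) * step

    rec = np.zeros_like(raw)
    prev = np.zeros(raw.shape[1], dtype=raw.dtype)   # reconstructed previous azimuth line
    for n in range(raw.shape[0]):
        residual = raw[n, :] - prev                  # predict from the reconstruction (closed loop)
        rec[n, :] = prev + quantize(residual)        # the decoder reproduces exactly this state
        prev = rec[n, :]
    return rec

# Synthetic Gaussian raw data as a stand-in for a radar backscatter distribution.
raw = np.random.randn(256, 128) + 1j * np.random.randn(256, 128)
rec = dpcm_quantize_azimuth(raw, bits=4)
snr = 10 * np.log10(np.mean(np.abs(raw) ** 2) / np.mean(np.abs(raw - rec) ** 2))
print(f"reconstruction SNR: {snr:.1f} dB")
```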
Author: Bajpai, Shrish (Integral Univ, Fac Engn & Informat Technol, Dept Elect & Commun Engn, Lucknow, Uttar Pradesh, India)
The hyperspectral image provides rich spectral information content, which facilitates multiple applications. With the rapid advancement of the spatial and spectral resolution of optical instruments, the image data size has increased manyfold. This calls for a compression algorithm with low coding complexity, low coding memory demand, and high coding efficiency. In recent years, many coding algorithms have been proposed. The wavelet transform-based set-partitioned hyperspectral compression algorithms have superior coding performance. These algorithms employ linked lists or state tables to track the significance/insignificance of the partitioned sets/coefficients. The proposed algorithm uses the pyramid hierarchy property of the wavelet transform: markers are used to track the significance/insignificance of each pyramid level. A single pyramid level contains many sets. An insignificant pyramid level with multiple sets is represented by a single bit in the proposed compression algorithm, whereas in 3D Set Partitioned Embedded bloCK (3D-SPECK) and 3D-Listless SPECK (3D-LSK) a single bit represents only a single insignificant set. As a result, the proposed algorithm requires fewer bits than other wavelet transform compression algorithms at the high bit planes. The simulation results show that the proposed compression algorithm has high coding efficiency with very low coding complexity and a moderate coding memory requirement. The reduced coding complexity improves the performance of the image sensor and lowers power consumption. Thus, the proposed compression algorithm has great potential in low-resource onboard hyperspectral imaging systems.
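The pyramid-level marker idea can be illustrated with a short sketch: for each bit plane, one marker bit records whether any coefficient in an entire pyramid level is significant, so an insignificant level containing many sets still costs a single bit. The array layout and bit-plane loop below are assumptions for illustration, not the paper's exact codec.

```python
import numpy as np

def pyramid_level_markers(coeff_levels, max_bitplane):
    """One significance marker bit per wavelet pyramid level and bit plane.

    coeff_levels : list of arrays, one per pyramid level (all subbands of
                   that level flattened together)
    Returns a list ordered from the highest bit plane down; each entry holds
    one 0/1 marker per level (0 = every set in the level is insignificant at
    that threshold and the whole level is skipped with this single bit).
    """
    markers = []
    for n in range(max_bitplane, -1, -1):
        threshold = 1 << n
        markers.append([int(np.any(np.abs(lvl) >= threshold)) for lvl in coeff_levels])
    return markers
```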
Visual data coding is an enabling technology for various applications and is now ubiquitously adopted in modern image processing, communications, and computer vision systems. To enable interoperability between devices manufactured and services provided by different enterprises, a series of standards targeting visual data coding have been crafted in the past three decades. Several standardization organizations, such as ISO/IEC JTC 1/SC 29 (comprising the Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG)), the ITU-T SG 16 Video Coding Experts Group (VCEG), the IEEE Data Compression Standards Committee Audio Video Coding Working Group (1857 WG), and the MPAI Community, have been creating these standards from many contributions of academia and industry. While most of these visual coding standards have been successfully deployed in many applications, there are more challenges nowadays, especially to accommodate the large volume of visual data with limited storage and limited-bandwidth transmission links. Compression efficiency improvements are still needed, especially considering emerging data representation formats ranging from 8K/HDR image/video to rich plenoptic data.
In this paper we propose novel extensions to JPEG 2000 for the coding of discontinuous media, which includes piecewise smooth imagery such as depth maps and optical flows. These extensions use breakpoints to model discontinuity boundary geometry and apply a breakpoint-dependent Discrete Wavelet Transform (BP-DWT) to the input imagery. The highly scalable and accessible coding features provided by the JPEG 2000 compression framework are preserved by our proposed extensions, with the breakpoint and transform components encoded as independent bit streams that can be progressively decoded. Comparative rate-distortion results are provided along with corresponding visual examples, which highlight the advantages of using breakpoint representations with the accompanying BP-DWT and embedded bit-plane coding. Recently our proposed extensions have been adopted and are in the process of being published as a new Part 17 of the JPEG 2000 family of coding standards.
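A minimal 1-D sketch of the breakpoint-dependent idea, using a Haar-style lifting step as a stand-in for the actual BP-DWT: where a breakpoint separates a sample pair, prediction and update are skipped so the transform does not mix values across the discontinuity. The interface and the Haar kernel are assumptions; Part 17 uses a more general construction.

```python
import numpy as np

def bp_haar_analysis(x, bp):
    """Breakpoint-gated Haar lifting analysis of a 1-D signal.

    x  : 1-D signal (even length assumed here for brevity)
    bp : boolean array, bp[i] is True if a discontinuity lies between
         samples x[2*i] and x[2*i + 1]
    """
    even, odd = x[0::2].astype(float), x[1::2].astype(float)
    detail = np.where(bp, odd, odd - even)             # skip prediction across a breakpoint
    approx = np.where(bp, even, even + 0.5 * detail)   # update step, gated the same way
    return approx, detail

def bp_haar_synthesis(approx, detail, bp):
    """Exact inverse of the gated lifting steps above."""
    even = np.where(bp, approx, approx - 0.5 * detail)
    odd = np.where(bp, detail, detail + even)
    return np.stack([even, odd], axis=1).ravel()
```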
Light fields are one of the emerging 3D representation formats with an effective potential to offer very realistic and immersive visual experiences. This capability comes at the cost of a very large amount of acquired data, whose practical use requires efficient coding solutions. This need was already addressed by the JPEG Pleno Light Field coding standard for static light fields, which specifies two coding modes, named 4D-transform and 4D-Prediction. While the first offers better compression performance for smaller-baseline light fields, the second excels for larger-baseline light fields. This paper proposes a novel light field coding mode, the Slanted 4D-transform coding mode, which extends the 4D-transform coding mode based on the conventional 4D-DCT to offer better compression performance than both available JPEG Pleno coding modes, independently of the baseline. The key idea is to first apply to each 4D block in the light field an adaptive, hierarchical geometric transformation, which makes the data in the block more energy-compaction friendly for the following 4D-DCT. The rate-distortion performance results show that the proposed Slanted 4D-transform codec outperforms both already standardized JPEG Pleno coding modes, with BD-Rate gains of 31.03% and 28.30% over the 4D-transform and 4D-Prediction modes, respectively, thus implying that a single coding mode can efficiently code all types of light fields.
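The core mechanism, a geometric "slant" of each sub-aperture slice followed by a separable 4-D DCT, can be sketched as below. A crude integer shear parameterized by a single disparity slope per angular axis stands in for the paper's adaptive hierarchical geometric transformation; block layout and names are assumptions.

```python
import numpy as np
from scipy.fft import dctn

def slanted_4d_dct(block, slope_u, slope_v):
    """'Slanted' 4-D DCT of a light-field block indexed as (u, v, s, t).

    Each (s, t) sub-aperture slice is shifted by an integer disparity
    proportional to its angular offset before the separable 4-D DCT, which
    makes the block more energy-compaction friendly when the content follows
    that disparity slope.
    """
    U, V, _, _ = block.shape
    slanted = np.empty(block.shape, dtype=float)
    for u in range(U):
        for v in range(V):
            du = int(round(slope_u * (u - U // 2)))
            dv = int(round(slope_v * (v - V // 2)))
            slanted[u, v] = np.roll(block[u, v], shift=(du, dv), axis=(0, 1))
    return dctn(slanted, norm="ortho")   # separable DCT over all four axes

# Example: a 13 x 13 x 16 x 16 block with a disparity slope of ~1 pixel/view.
block = np.random.rand(13, 13, 16, 16)
coeffs = slanted_4d_dct(block, slope_u=1.0, slope_v=1.0)
```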
Multi-view video (MVV) data processed by three-dimensional (3D) video systems often suffer from compression artifacts, which can degrade the rendering quality of 3D spaces. In this paper, we focus on the task of artifact reduction in multi-view video compression using spatial and temporal motion priors. Previous MVV quality enhancement networks using a warping-and-fusion approach employed reference-to-target motion priors to exploit inter-view and temporal correlation among MVV frames. However, these motion priors were sensitive to quantization noise, and the warping accuracy degraded when the target frame used low-quality features in the correspondence search. To overcome these limitations, we propose a novel approach that utilizes bilateral spatial and temporal motion priors, leveraging the geometric relations of a structured MVV camera system, to exploit motion coherency. Our method involves a multi-view prior generation module that produces both unidirectional and bilateral warping vectors to exploit rich features in adjacent reference MVV frames and generate robust warping features. These features are further refined to account for unreliable alignments across MVV frames caused by occlusions. The performance of the proposed method is evaluated in comparison with state-of-the-art MVV quality enhancement networks, and a synthetic MVV dataset is used to train our network to produce the various motion priors. Experimental results demonstrate that the proposed method significantly improves the quality of the reconstructed MVV frames in recent video coding standards such as the multi-view extension of High Efficiency Video Coding and the MPEG immersive video standard.
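The bilateral warping step can be illustrated without the networks: each reference view is sampled at positions displaced by its motion/disparity field toward the target, and the two warped views are blended, giving the aligned features that the fusion and refinement stages then process. Nearest-neighbour sampling and the array shapes below are assumptions for brevity.

```python
import numpy as np

def bilateral_warp(ref_left, ref_right, mv_left, mv_right):
    """Warp two reference views toward the target with per-pixel motion
    vectors and blend them.

    ref_* : (H, W) or (H, W, C) reference frames or feature maps
    mv_*  : (H, W, 2) arrays of (dy, dx) offsets from target to reference
    """
    H, W = ref_left.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]

    def warp(ref, mv):
        y = np.clip(ys + np.round(mv[..., 0]).astype(int), 0, H - 1)
        x = np.clip(xs + np.round(mv[..., 1]).astype(int), 0, W - 1)
        return ref[y, x]

    return 0.5 * (warp(ref_left, mv_left) + warp(ref_right, mv_right))
```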
The Moving Picture Experts Group (MPEG) is responsible for standardizing MPEG immersive video (MIV) for immersive video coding and is involved in research and development focusing on providing six degrees of freedom through a reference software known as the test model for immersive video. To efficiently compress and transmit multiview videos with texture and depth pairings, the encoder part of the MIV codec framework reduces the pixel rate by removing redundancy between views and densely packing the remaining regions into an atlas as patches. The decoder part reconstructs multiview videos from the transmitted atlas to synthesize and render arbitrary viewports, and the depth information has a significant impact on the quality of the rendered viewport. However, the existing method of handling depth values in the MIV codec fails to adequately address the information loss that occurs during quantization or transmission. To preserve and transmit depth information more accurately, we propose a method for expanding the depth dynamic range using min-max linear scaling on a patch-by-patch basis. In addition, we efficiently encode the per-patch minimum and maximum values of depth required by the decoder to recover the original depth values and include them in the metadata. The experimental results indicate that for computer-generated sequences, the proposed method provides PSNR-based Bjontegaard delta-rate gains of 9.1% and 3.3% in the end-to-end performance for high- and low-bitrate cases, respectively. In addition, subjective quality improvements are observed by reducing the artifacts that primarily occur at the object boundaries in the rendered viewport.
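A sketch of the per-patch min-max depth range expansion and its inverse follows; the parameter names and the 10-bit target are assumptions, while the (d_min, d_max) pair is what the per-patch metadata would carry.

```python
import numpy as np

def expand_patch_depth(depth_patch, bit_depth=10):
    """Stretch one patch's depth values over the full quantizer range.
    Returns the rescaled patch and the (d_min, d_max) pair to signal as
    per-patch metadata so the decoder can invert the mapping."""
    d_min, d_max = float(depth_patch.min()), float(depth_patch.max())
    scale = (2 ** bit_depth - 1) / max(d_max - d_min, 1e-9)
    scaled = np.round((depth_patch - d_min) * scale).astype(np.uint16)
    return scaled, (d_min, d_max)

def restore_patch_depth(scaled, d_min, d_max, bit_depth=10):
    """Decoder-side inverse of the per-patch min-max scaling."""
    scale = (2 ** bit_depth - 1) / max(d_max - d_min, 1e-9)
    return scaled.astype(float) / scale + d_min
```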
This letter proposes a fast dual-layer lossless coding for high dynamic range images (HDRIs) in the Radiance format. The coding, which consists of a base layer and a lossless enhancement layer, provides a standard dynamic range image (SDRI) without requiring an additional algorithm at the decoder and can losslessly decode the HDRI by adding the residual signals (residuals) between the HDRI and the SDRI to the SDRI, if desired. To suppress the dynamic range of the residuals in the enhancement layer, the coding directly uses the mantissa and exponent information from the Radiance format. To further reduce the residual energy, each mantissa is modeled (estimated) as a linear function, i.e., a simple linear regression, of the encoded-decoded SDRI in each region with the same exponent. This is called the simple linear regressive mantissa estimator. Experimental results show that, compared with existing methods, our coding reduces the average bitrate by approximately 1.57-6.68% and significantly reduces the average encoder implementation time by approximately 87.13-98.96%.
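A sketch of the simple linear regressive mantissa estimator described above: within each region sharing the same Radiance exponent, the mantissa is predicted as a rounded affine function of the decoded SDRI, and only the integer residuals go to the enhancement layer. The function name and fitting call are assumptions, not the letter's implementation.

```python
import numpy as np

def mantissa_residuals(mantissa, exponent, sdr):
    """Per-exponent-region linear prediction of the Radiance mantissa from the
    encoded-decoded SDRI; returns integer residuals for the lossless
    enhancement layer plus the (slope, intercept) pair per exponent value."""
    residual = np.zeros(mantissa.shape, dtype=np.int32)
    params = {}
    for e in np.unique(exponent):
        mask = exponent == e
        x, y = sdr[mask].astype(float), mantissa[mask].astype(float)
        if x.size > 1 and x.max() > x.min():
            a, b = np.polyfit(x, y, 1)        # simple linear regression
        else:
            a, b = 0.0, float(y.mean())       # degenerate region: constant predictor
        params[int(e)] = (a, b)
        pred = np.round(a * x + b)            # integer prediction, reproducible at the decoder
        residual[mask] = (y - pred).astype(np.int32)
    return residual, params
```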
This paper presents a method to effectively compress the intermediate-layer feature map of a convolutional neural network for potential structures of Video Coding for Machines, an emerging technology for future machine consumption applications. Notably, most extant studies compress a single feature map and hence cannot fully consider both global and local information within the feature map, which makes it difficult to maintain performance in machine consumption tasks that analyze objects of various sizes in images/videos. To address this problem, a multiscale feature map compression method is proposed that consists of two major processes: receptive-block-based principal component analysis (RPCA) and uniform integer quantization. The RPCA derives the complete basis kernels of a feature map by selecting a set of major basis kernels that can represent a sufficient percentage of the global or local information according to the variable-size receptive blocks of each feature map. After transforming each feature map using the set of major basis kernels, a uniform integer quantizer converts the 32-bit floating-point values of the set of major basis kernels, the corresponding RPCA coefficients, and a mean vector to five-bit integer representation values. Experimental results reveal that the proposed method reduces the amount of feature map data by 99.30% with a loss of 8.30% in the average precision (AP) on the OpenImageV6 dataset and 0.77% in AP(M) and 0.47% in AP(L) on the MS COCO 2017 validation set, while outperforming previous PCA-based feature map compression methods even at higher compression rates.
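The two stages, receptive-block PCA and uniform integer quantization, can be sketched for a single feature map as follows. The block size, energy threshold, and helper names are assumptions; the paper selects basis kernels per variable-size receptive block rather than a fixed tiling.

```python
import numpy as np

def rpca_compress(fmap, block=8, energy=0.95, bits=5):
    """Receptive-block PCA plus uniform integer quantization of one feature map.

    fmap : (C, H, W) feature map, with H and W assumed divisible by `block`
    Keeps the principal components covering `energy` of the variance and maps
    basis kernels, coefficients, and the mean vector to `bits`-bit integers.
    """
    C, H, W = fmap.shape
    # Tile the map into block x block receptive blocks, one row per block.
    blocks = (fmap.reshape(C, H // block, block, W // block, block)
                  .transpose(0, 1, 3, 2, 4)
                  .reshape(-1, block * block))
    mean = blocks.mean(axis=0)
    centered = blocks - mean

    # PCA via SVD; keep the smallest k reaching the target energy fraction.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s ** 2) / np.sum(s ** 2), energy)) + 1
    basis = vt[:k]                # major basis kernels
    coeff = centered @ basis.T    # RPCA coefficients

    def quantize(x):              # uniform integer quantizer to `bits` bits
        lo, step = x.min(), (x.max() - x.min()) / (2 ** bits - 1) + 1e-12
        return np.round((x - lo) / step).astype(np.uint8), lo, step

    return quantize(basis), quantize(coeff), quantize(mean)
```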
ISBN (print): 9781450384636
The Discrete Tchebichef Transform (DTT) is a transform method based on discrete orthogonal Tchebichef polynomials, which have found applications in image compression and video coding. Our method constructs all DTT-related discrete orthogonal transforms at the required sizes (corresponding to the coding unit sizes supported by H.266/VVC). To exploit the properties of the Tchebichef polynomials, we make use of a novel discrete orthogonal matrix generation method based on determined DTT roots, and we scale and round the DTT depending on the quantization parameter instead of using an integer approximation, so that an accurate integer DTT matrix is obtained. Experimental results show that this method can improve video quality while requiring a lower bit rate.
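A small sketch of how an integer DTT matrix can be produced: the orthonormal discrete Tchebichef (Gram) polynomial basis on {0, ..., N-1} is obtained here by QR-orthonormalizing the Vandermonde basis and then scaling and rounding to integers. The QR construction and the fixed scaling factor are stand-ins for the paper's root-based generation and QP-dependent scaling, and the Vandermonde conditioning limits this sketch to small N.

```python
import numpy as np

def integer_dtt_matrix(N, scale_bits=6):
    """Illustrative N x N integer DTT matrix.

    The orthonormal discrete Tchebichef (Gram) polynomials on {0, ..., N-1}
    are obtained by QR-orthonormalizing the monomial (Vandermonde) basis,
    then scaled by 2**scale_bits and rounded to integers.
    """
    x = np.arange(N, dtype=float)
    vander = np.vander(x, N, increasing=True)     # columns 1, x, x^2, ...
    q, r = np.linalg.qr(vander)
    q *= np.sign(np.diag(r))                      # fix signs so leading values are positive
    dtt = q.T                                     # rows are the orthonormal basis vectors
    return np.round(dtt * (1 << scale_bits)).astype(int)

print(integer_dtt_matrix(8))
```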