A commonly encountered problem in the communication of predictively encoded video is that of predictive mismatch or drift. The problem of predictive mismatch manifests itself in numerous communication scenarios, including on-demand streaming, real-time streaming and multicast streaming. This paper proposes a state-free video encoding architecture that alleviates this problem. The main benefit of state-free encoding is that there is no need for the encoder and the decoder to maintain the same state, or equivalently, predict using the same predictor. This facilitates robust communication of causally encoded media. The proposed approach is based on the Wyner-Ziv theorem in information theory. Consequently, it leverages the superior performance of coset codes for the Wyner-Ziv problem for predictive coding. A video codec, with state-free functionality, based on the H.26L encoding standard is proposed. The performance of the proposed codec is within 1-2.5 dB of the H.26L encoder.
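The coset idea behind the state-free property can be illustrated with a toy scalar example. This is a minimal sketch, not the codec from the abstract: it assumes integer samples, a hypothetical coset size `modulus`, and a decoder-side predictor that lies within `modulus / 2` of the true value. The function names are illustrative only.

```python
def encode_coset(sample: int, modulus: int) -> int:
    """Send only the coset index of the sample; the encoder uses no predictor."""
    return sample % modulus


def decode_coset(index: int, side_info: int, modulus: int) -> int:
    """Return the member of coset `index` that is closest to the decoder's predictor."""
    # Largest value <= side_info that is congruent to index (mod modulus).
    below = side_info - ((side_info - index) % modulus)
    above = below + modulus
    return min((below, above), key=lambda c: abs(c - side_info))


# Two decoders holding different predictors recover the same sample,
# as long as each predictor is within modulus / 2 of it.
sample = 100
index = encode_coset(sample, 8)
decoded_a = decode_coset(index, side_info=103, modulus=8)
decoded_b = decode_coset(index, side_info=98, modulus=8)
```

Because the encoder never references a specific predictor, a decoder whose reference has drifted slightly still decodes correctly, which is the robustness the abstract describes. Decoding fails once the predictor error reaches half the coset size.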
In this paper, we investigate whether different scanning patterns affect the quality of images reconstructed after adaptive quantization. Different space-filling curves, such as the Peano curve, Hilbert curve, and Moore curve, are implemented and compared in terms of compression ratio and image quality expressed in PSNR. The investigated space-filling curves preserve the spatial neighborhood property of the pixel array, which is useful for predictive coding. With these space-filling curves, higher reconstructed image quality can be expected than with the traditional raster scanning scheme.
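A Hilbert scan of a square image can be sketched as follows. This is an illustrative sketch using the standard index-to-coordinate conversion for the Hilbert curve, not the implementation evaluated in the abstract; it assumes a square image whose side is a power of two, and the function names are hypothetical.

```python
def hilbert_d2xy(side: int, d: int) -> tuple[int, int]:
    """Map position d along the Hilbert curve to (x, y) in a side x side grid."""
    if side <= 0 or side & (side - 1):
        raise ValueError("side must be a power of two")
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate (and possibly flip) the quadrant so sub-curves join up.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(image: list[list[int]]) -> list[int]:
    """Read the pixels of a square image in Hilbert-curve order."""
    side = len(image)
    coords = (hilbert_d2xy(side, d) for d in range(side * side))
    return [image[y][x] for x, y in coords]
```

Consecutive samples in this scan are always spatial neighbors, whereas a raster scan jumps across the full image width at the end of every row. That is the neighborhood property the abstract refers to.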
In this paper we present a bit allocation approach based on motion vector analysis for improved rate-distortion performance. Bit allocation is done at the macroblock level such that macroblocks with high priority are coded finely and those with low priority are coded coarsely. To calculate macroblock priorities, reference counts for each pixel are first determined through motion vector analysis. The reference count of a pixel is defined as the total number of pixels in the remaining GOP that use that pixel as a reference. Macroblock-wise reference counts are then obtained by summing the pixel-wise reference counts, and these are scaled and prioritized. Based on the priority value, the fixed quantizer given for each frame is modulated at the macroblock level. The algorithm is applied to H.264/AVC encoding, and PSNR gains of up to 1.4 dB are achieved.
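The summation and quantizer modulation steps might look like the sketch below. The abstract does not give the scaling rule, so the linear mapping and the `max_delta` parameter are assumptions made purely for illustration. The clamp to 0..51 reflects the QP range of 8-bit H.264/AVC, and the per-pixel reference counts are taken as given.

```python
def macroblock_counts(pixel_counts: list[list[int]], mb_size: int) -> list[list[int]]:
    """Sum per-pixel reference counts over each mb_size x mb_size macroblock."""
    rows = len(pixel_counts) // mb_size
    cols = len(pixel_counts[0]) // mb_size
    return [
        [
            sum(
                pixel_counts[r * mb_size + j][c * mb_size + i]
                for j in range(mb_size)
                for i in range(mb_size)
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def modulate_qp(mb_counts: list[list[int]], frame_qp: int, max_delta: int = 2) -> list[list[int]]:
    """Assumed rule: heavily referenced macroblocks get a lower (finer) QP."""
    flat = [c for row in mb_counts for c in row]
    lo, hi = min(flat), max(flat)
    result = []
    for row in mb_counts:
        out = []
        for count in row:
            # Priority in [0, 1]; a flat frame keeps the frame-level QP.
            priority = 0.5 if hi == lo else (count - lo) / (hi - lo)
            qp = frame_qp + round(max_delta * (1 - 2 * priority))
            out.append(max(0, min(51, qp)))
        result.append(out)
    return result
```

With this assumed mapping, the most-referenced macroblock is coded at `frame_qp - max_delta` and the least-referenced at `frame_qp + max_delta`.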
Technological advances have made possible a number of new applications in the area of 3D video. One of the enabling technologies for many of these 3D applications is multiview video coding, which has received significant attention in the last several years. However, the fundamental need of multiview coding for applications like immersive tele-conferencing has not been addressed. In this paper we define the boundaries of the problem, and show how a simple algorithm can yield bitrate reductions of up to 2× with similar PSNR in the synthesized view. Our algorithm is based on using an estimate of the viewer position to compute the expected contribution of each pixel to the synthesized view, and encoding each macroblock of each camera view with quality proportional to the likelihood that its pixels will be used in the synthetic image.
This paper investigates the encoding of vehicular position information using predictive algorithms in inter-vehicle communications (IVC) from the source- and channel-coding viewpoints. Assuming the 15-mode vehicular driving model, three types of schemes are compared: (1) an ordinary pulse-code modulation (PCM) scheme that transmits position information every sampling period, (2) predictive coding schemes, and (3) a novel scheme using predicted information. This paper estimates the decoded errors caused by transmission errors, when position information obtained from a positioning system is transmitted. Simulation results show that the novel scheme is effective as a coding scheme in IVC.
Two different aspects of distributed source coding are discussed. First, a previously developed distributed uniform scalar quantization method is improved by adopting non-uniform quantization. It is observed that, compared to uniform quantization, the non-uniform scheme approaches the distortion-rate bound more closely by 0.5 bit. Turning then to sources with memory, the tradeoff between exploitation of time and space correlation is exposed. It is shown by example that optimal transforms deviate from their traditional counterparts. Specifically, optimal transforms in the distributed framework do not necessarily fully decorrelate each sequence in time.
Motion estimation is the most time-consuming module in any video coding standard. Existing block-matching algorithms (BMAs) reduce the search time of the motion estimation process with moderate loss in quality. In this paper, a novel block-matching algorithm for fast motion estimation, namely the direction-based block-matching algorithm (DBM), is proposed. Existing BMAs initiate their search from the center of the search area for every frame, which is time-consuming. The proposed algorithm utilizes the temporal correlation in the motion vectors between successive frames to reduce the search time. The motion vectors of the previous frame imply the direction of motion in the succeeding frames, thus minimizing the search domain. Experimental results show that DBM provides faster searching with moderate to low distortion compared to existing BMAs.
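The general idea of seeding the search with the previous frame's motion vector can be sketched as below. This is not the DBM algorithm itself, whose details the abstract does not give. It is a simplified illustration that assumes a sum-of-absolute-differences (SAD) cost, a small square window of hypothetical size `radius` around the predicted vector, and the zero vector as a fallback candidate.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the ref block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def directed_search(cur, ref, bx, by, n, predicted_mv, radius):
    """Search a small window centered on the motion vector predicted from the previous frame."""
    height, width = len(ref), len(ref[0])
    # The zero vector is always a valid candidate and serves as the starting best.
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    px, py = predicted_mv
    for dy in range(py - radius, py + radius + 1):
        for dx in range(px - radius, px + radius + 1):
            inside = 0 <= bx + dx <= width - n and 0 <= by + dy <= height - n
            if not inside:
                continue  # skip candidates that fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

With `radius = 1` the search evaluates at most 10 candidates per block, compared with 225 for an exhaustive search over a ±7 window. The price is that a poor prediction can miss the true displacement, which matches the speed-versus-distortion tradeoff the abstract reports.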
ISBN (digital): 9798350349399
ISBN (print): 9798350349405
In the era of big data, the sheer volume and complexity of datasets pose significant challenges in machine learning, particularly in image processing tasks. This paper introduces an innovative Autoencoder-based Dataset Condensation Model backed by Koopman operator theory that effectively packs large datasets into compact, information-rich representations. Inspired by the predictive coding mechanisms of the human brain, our model leverages a novel approach to encode and reconstruct data, maintaining essential features and label distributions. The condensation process utilizes an autoencoder neural network architecture, coupled with Optimal Transport theory and Wasserstein distance, to minimize the distributional discrepancies between the original and synthesized datasets. We present a two-stage implementation strategy: first, condensing the large dataset into a smaller synthesized subset; second, evaluating the synthesized data by training a classifier and comparing its performance with a classifier trained on an equivalent subset of the original data. Our experimental results demonstrate that the classifiers trained on condensed data exhibit comparable performance to those trained on the original datasets, thus affirming the efficacy of our condensation model. This work not only contributes to the reduction of computational resources but also paves the way for efficient data handling in constrained environments, marking a significant step forward in data-efficient machine learning.
Thanks to the generous support of ARO grant W911NF-23-2-0041
We propose a new scheme for compressing an image set by building its minimal-cost prediction structure. Existing prediction-based video coding methods can be easily extended and incorporated into this scheme to achieve higher compression efficiency. Based on this prediction structure, we also develop a progressive transmission approach for interactive object movie (OM) browsing.