ISBN: 0819423564 (print)
To overcome some of the well-known artefacts stemming from block-based motion compensation, grid-based techniques have been proposed in the past as a promising alternative to block matching. While the latter is restricted to simple translational motion, grid-based compensation employs, for example, an affine model when triangular meshes are assumed. The theoretically superior model, however, performs worse at object boundaries, where the connectivity constraint of the meshes smooths the underlying discontinuous motion vector field. This effect can be diminished by providing an individual grid for each object in a scene. The crucial part of grid-based motion prediction is the technique for tracking mesh vertices. In contrast to block matching, the motion of vertices cannot be estimated independently without sacrificing prediction gain. This paper discusses different algorithms for vertex tracking. The issue of tracking at object boundaries and the influence of the resampling algorithm on the prediction gain are addressed in detail. Object grids, carrying both shape and motion information, are evaluated further in terms of shape coding efficiency and temporal scalability. Both aspects become important when aiming at functional coding for low bit rates, as is currently being investigated in the framework of MPEG-4.
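The affine model mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the tracked positions of a triangle's three vertices are already known and derives the six affine parameters that map any point inside the triangle. The function names (`solve3`, `affine_from_triangle`, `warp`) and the sample coordinates are hypothetical and are not taken from the paper.

```python
def solve3(m, r):
    """Solve the 3x3 linear system m * p = r with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate triangle: vertices are collinear")
    params = []
    for col in range(3):
        mc = [row[:] for row in m]      # copy, then replace one column by r
        for i in range(3):
            mc[i][col] = r[i]
        params.append(det(mc) / d)
    return params


def affine_from_triangle(src, dst):
    """Return (ax, ay) such that x' = ax[0]*x + ax[1]*y + ax[2]
    and y' = ay[0]*x + ay[1]*y + ay[2] for the three vertex pairs."""
    m = [[x, y, 1.0] for x, y in src]
    ax = solve3(m, [x for x, _ in dst])
    ay = solve3(m, [y for _, y in dst])
    return ax, ay


def warp(point, ax, ay):
    """Map a point of the current triangle into the reference frame."""
    x, y = point
    return (ax[0] * x + ax[1] * y + ax[2],
            ay[0] * x + ay[1] * y + ay[2])


# Hypothetical vertex positions before and after tracking.
src = [(0, 0), (4, 0), (0, 4)]
dst = [(1, 1), (5, 1), (1, 6)]
ax, ay = affine_from_triangle(src, dst)
print(warp((2, 2), ax, ay))
```

Because neighbouring triangles share vertices, moving one vertex changes the affine parameters of every triangle attached to it, which is why the vertices cannot be estimated independently.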
ISBN: 0819420425 (print)
Embedded captions in TV programs such as news broadcasts, documentaries and coverage of sports events provide important information on the underlying events. In digital video libraries, such captions represent a highly condensed form of key information on the contents of the video. In this paper we propose a scheme to automatically detect the presence of captions embedded in video frames. The proposed method operates on reduced image sequences which are efficiently reconstructed from compressed MPEG video, and thus does not require full frame decompression. The detection, extraction and analysis of embedded captions help to capture the highlights of visual contents in video documents for better organization of video, to present succinctly the important messages embedded in the images, and to facilitate browsing, searching and retrieval of relevant clips.
Digital video traffic is inherently bursty for two reasons: the motion of objects and cameras, and the artifacts of compression algorithms. Because digital video playback requires bandwidth guarantees from the underlying I/O and network systems, the bursty nature of video traffic forces bandwidth reservations to be made at the level of peak data rates rather than average data rates. This work addresses the burstiness problem in digital video traffic by proposing changes to the MPEG compression/decompression algorithm. The resulting algorithm, block-by-block (BBB) difference coding, successfully reduces the difference between peak and average bit rates by a factor of 2 to 3 on average, without compromising compression efficiency, coding speed or video quality.
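The burstiness argument can be made concrete with a small sketch that computes the peak-to-average rate ratio of a frame-size trace. It does not implement BBB difference coding, which the abstract does not specify. The function name `peak_to_average`, the `window` parameter and the frame sizes are illustrative assumptions.

```python
def peak_to_average(frame_bits, window=1):
    """Ratio of the peak rate to the mean rate of a frame-size trace.

    The peak is taken over a sliding window of `window` consecutive frames,
    which models a small smoothing buffer (an assumption of this sketch).
    """
    if not 1 <= window <= len(frame_bits):
        raise ValueError("window must be between 1 and len(frame_bits)")
    window_sums = [sum(frame_bits[i:i + window])
                   for i in range(len(frame_bits) - window + 1)]
    peak_rate = max(window_sums) / window
    mean_rate = sum(frame_bits) / len(frame_bits)
    return peak_rate / mean_rate


# Hypothetical trace in kbit per frame: one large intra-coded frame
# followed by small predicted frames.
trace = [100, 20, 20, 20]
print(peak_to_average(trace))             # unsmoothed peak vs. mean
print(peak_to_average(trace, window=2))   # peak after two-frame smoothing
```

A ratio well above 1 means that a reservation made at the peak rate leaves much of the reserved bandwidth idle, which is the inefficiency the abstract sets out to reduce.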
ISBN: 0819423564 (print)
Traditional objective measurements are of limited effectiveness in predicting the quality of compressed images. Subjective assessment is currently the most reliable method of evaluating the performance of compression systems, but it is time consuming. It is therefore important to simplify the procedure of subjective measurement, and one way to do so is the method of paired comparisons. This paper describes applications of the method of paired comparisons to the assessment of concatenated compression systems.
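As a rough illustration of the method of paired comparisons, the sketch below turns a table of observer preferences into one score per system. Scoring by the fraction of comparisons won is an assumption chosen for simplicity and is not necessarily the analysis used in the paper. The function name `preference_scores` and the vote counts are hypothetical.

```python
def preference_scores(wins):
    """Score each system by the fraction of its pairwise comparisons it won.

    wins[i][j] is the number of observers who preferred system i over
    system j; diagonal entries are ignored.
    """
    n = len(wins)
    scores = []
    for i in range(n):
        won = sum(wins[i][j] for j in range(n) if j != i)
        total = sum(wins[i][j] + wins[j][i] for j in range(n) if j != i)
        scores.append(won / total if total else 0.0)
    return scores


# Hypothetical votes from 10 observers comparing three codec chains.
votes = [
    [0, 8, 9],
    [2, 0, 6],
    [1, 4, 0],
]
print(preference_scores(votes))
```

Observers only ever answer "which of these two looks better", which is a simpler task than assigning an absolute grade to each picture.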
ISBN: 0819423564 (print)
Embedding information into multimedia data is a topic that has gained increasing attention recently. For video broadcast applications, watermarking of video, and especially of already encoded video, is interesting. We present a scheme for robust interoperable watermarking of MPEG-2 encoded video. The watermark is embedded either into the uncoded video or into the MPEG-2 bitstream, and can be retrieved from the decoded video. The scheme working on encoded video is of much lower complexity than a complete decoding process followed by watermarking in the pixel domain and re-encoding. Although an existing MPEG-2 bitstream is partly altered, the scheme avoids drift problems. The scheme has been implemented and practical results show that a robust watermark can be embedded into MPEG encoded video which can be used to transmit arbitrary binary information at a data rate of several bytes/second.
ISBN: 0819420425 (print)
Block-based motion estimation is an efficient interframe predictor, making it an important component in video coding schemes. A significant portion of a video codec's computational budget, however, is allocated to the task of computing motion vectors. For low bit-rate video coding applications such as teleconferencing, motion vector information occupies a substantial percentage of the available channel bandwidth. In this paper we present a method that accelerates motion vector computation by using spatio-temporal prediction, based on object trajectories from previously computed frames, to bias the search (in a statistical sense) towards the most probable direction of motion. Furthermore, since the motion vectors are linearly predicted, they can be coded efficiently. Linear predictive motion vector coding compares favorably to other motion estimation methods and can be incorporated within existing video compression standards.
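The idea of a predicted, biased search can be sketched as follows. The sketch assumes that the predictor is a weighted average of neighbouring motion vectors and that the search is a small full search centred on the prediction. Both choices, and the names `predict_mv`, `search_around` and `residual`, are illustrative and are not the paper's exact method.

```python
def predict_mv(candidates, weights=None):
    """Linear prediction of a motion vector from already-known vectors
    (for example spatial neighbours and the co-located block of the
    previous frame). Equal weights are assumed if none are given."""
    if weights is None:
        weights = [1.0 / len(candidates)] * len(candidates)
    px = sum(w * vx for w, (vx, _) in zip(weights, candidates))
    py = sum(w * vy for w, (_, vy) in zip(weights, candidates))
    return (round(px), round(py))


def search_around(pred, cost, radius=1):
    """Evaluate only the candidates within `radius` of the prediction and
    return the one with the lowest matching cost."""
    best_cost, best_mv = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (pred[0] + dx, pred[1] + dy)
            c = cost(mv)
            if best_cost is None or c < best_cost:
                best_cost, best_mv = c, mv
    return best_mv


def residual(mv, pred):
    """Only this difference from the prediction needs to be coded."""
    return (mv[0] - pred[0], mv[1] - pred[1])


# Hypothetical neighbours and a stand-in cost with its minimum at (4, 1);
# a real encoder would use a block-matching error such as SAD.
neighbours = [(2, 1), (3, 1), (4, 1)]
pred = predict_mv(neighbours)
mv = search_around(pred, lambda v: abs(v[0] - 4) + abs(v[1] - 1))
print(pred, mv, residual(mv, pred))
```

With a good predictor the search window can stay small, and the residuals cluster around zero, which is what makes them cheap to entropy code.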
ISBN: 0819420425 (print)
In this paper, we present an approach to characterizing video sequences using information-theoretic measures. This characterization is then used to efficiently represent a volume of video. In a typical video sequence, structure is sometimes revealed by texture and in other cases by motion. In addition, the temporal and spatial extents are variables. This work attempts to build this structure by examining a given region over a multiplicity of frames and scales using entropy measures. We then present a hierarchically structured class of coders that efficiently represent this volume of video. The structure built in the analysis stage is used to control and select amongst this class of coders.
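One plausible entropy measure for a region is the Shannon entropy of its sample histogram, applied once to the pixel values (texture) and once to the frame difference (motion). The abstract does not define its measures, so the sketch below is an assumption, and the names `entropy_bits` and `temporal_entropy_bits` and the sample regions are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy, in bits per sample, of the empirical distribution."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def temporal_entropy_bits(region_prev, region_cur):
    """Entropy of the sample-wise difference between the same region in
    two consecutive frames (regions given as flat lists of equal length)."""
    diff = [b - a for a, b in zip(region_prev, region_cur)]
    return entropy_bits(diff)


# Hypothetical flattened 2x2 regions.
textured = [0, 1, 2, 3]   # four distinct values
shifted = [5, 6, 7, 8]    # same texture, uniformly brighter

print(entropy_bits(textured))                    # high: texture carries structure
print(temporal_entropy_bits(textured, shifted))  # low: the change is uniform
```

Comparing the two numbers for the same region across scales and frame spans gives a simple cue for whether a spatial or a temporal coder is the better fit there.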
ISBN: 0819423564 (print)
To exploit the temporal redundancy of an input video sequence, the H.261 standard specifies the use of displaced frame differences with respect to reference frames and, optionally, motion compensation. Usually, encoding is performed in a loop that reconstructs a reference frame in the spatial domain, computes the difference between the next input video frame and its reference, and then applies DCT compression to that displaced frame difference. In this paper we show that the reference frames can instead be reconstructed in the transform domain, with no impact on computational accuracy or on the output bitstream. All reference frames are then represented in the transform domain as dequantized DCT coefficients, so the next inter-frame is the difference between the next DCT-transformed input picture and the current reference frame. This inter-frame is then quantized and entropy encoded as usual. The output bitstream remains H.261-compliant, while the compression ratio and the quality of the decompressed video are the same as in a conventional implementation. We also present performance results for the software codec running on a Pentium PC.
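The transform-domain formulation rests on the linearity of the DCT: transforming a difference gives the same coefficients as differencing two transforms. The sketch below checks this numerically with a 1-D orthonormal DCT-II. It is a simplification that leaves out motion compensation and quantization and uses one dimension instead of the 8x8 2-D blocks of H.261. The function name `dct_1d` and the sample rows are hypothetical.

```python
import math


def dct_1d(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


# Hypothetical 8-sample rows from the current picture and its reference.
current = [52, 55, 61, 66, 70, 61, 64, 73]
reference = [50, 54, 60, 67, 69, 60, 66, 70]

# Conventional loop: difference in the spatial domain, then transform.
spatial_path = dct_1d([c - r for c, r in zip(current, reference)])

# Transform-domain loop: transform the input, subtract the stored
# reference coefficients.
ref_coeffs = dct_1d(reference)
transform_path = [c - r for c, r in zip(dct_1d(current), ref_coeffs)]

max_gap = max(abs(a - b) for a, b in zip(spatial_path, transform_path))
print(max_gap < 1e-9)
```

Keeping the reference as coefficients removes the inverse DCT from the encoder loop in this simplified setting, since the reference never has to be brought back to the spatial domain.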
ISBN: 0819423564 (print)
This work presents the architecture of an MPEG-1 stream transmission system suitable for point-to-point transfer of live video and audio over TCP/IP local area networks. The hardware and software modules of the system are also presented. Experimental results on the statistical behavior of the generated and transmitted MPEG-1 stream are reported.
ISBN: 0819423564 (print)
An efficient algorithm for dynamically multiplexing MPEG2 encoded video sources is presented. Sources are grouped into classes according to their combined levels of spatial detail and amount of movement. Simulations were performed using different combinations of sources belonging to distinct classes, different bit rates and different GOP structures. The implications of a real implementation are analyzed and a modular architecture is proposed. Simulation results are presented and discussed, showing that sequences with higher spatial detail and motion exhibit the greatest quality improvements. These results are hardly affected by misalignment, at GOP level, between video sequences.