ISBN: (print) 0819444111
This paper presents a region-based scalable coding technique that can be used for interactive transmission of images over networks. The method supports near-lossless coding of a specific region of interest (ROI), while the rest of the image is coded with a high-quality lossy codec. The enhancement layers refine the quality of the images reconstructed from the base layer of the intra frame. The proposed coding technique uses multiple quantizers with thresholds (QT) for layering and creates a bit plane for each layer of both the intra and residual frames. Each bit plane is then partitioned into sets of small areas that are coded independently. Run-length and entropy coding are applied to each set to provide scalability across the entire image set, resulting in high picture quality in the ROI specified by the end user. We tested this technique on various test image sequences and consistently achieved a high level of performance.
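The abstract above describes forming a bit plane per layer and run-length coding each small area independently. The following is a minimal sketch of those two generic building blocks only, not the authors' codec: the function names, the sample values, and the choice of plain (bit, run length) pairs are assumptions made for illustration, and the quantizer thresholds and entropy coder are omitted.

```python
def bit_plane(values, k):
    """Return bit k (0 = least significant) of each non-negative integer."""
    return [(v >> k) & 1 for v in values]


def run_length_encode(bits):
    """Encode a bit sequence as a list of (bit, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            # Same bit as the current run: extend it.
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            # Bit changed (or first bit): start a new run.
            runs.append((b, 1))
    return runs


def run_length_decode(runs):
    """Invert run_length_encode."""
    out = []
    for b, n in runs:
        out.extend([b] * n)
    return out


# Hypothetical quantized samples from one small area of a layer.
samples = [0, 1, 1, 4, 5, 5, 7, 0]
plane2 = bit_plane(samples, 2)      # [0, 0, 0, 1, 1, 1, 1, 0]
runs = run_length_encode(plane2)    # [(0, 3), (1, 4), (0, 1)]
```

Because each area is encoded on its own, a decoder could in principle fetch and decode only the areas that overlap the requested ROI.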
ISBN: (print) 0819444111
Current wavelet-based image coders obtain high performance by identifying and exploiting the statistical properties of natural images in the transform domain. Zerotree-based algorithms, such as "Embedded Zerotree Wavelets" (EZW) and "Set Partitioning In Hierarchical Trees" (SPIHT), offer high Rate-Distortion (RD) coding performance and low computational complexity by exploiting statistical dependencies among insignificant coefficients on hierarchical subband structures. Another possible approach tries to predict the clusters of significant coefficients by means of some form of morphological dilation. An example of a morphology-based coder is "Significance-Linked Connected Component Analysis" (SLCCA), which has shown performance comparable to that of the zerotree-based coders but is not embedded. A new embedded bit-plane coder is proposed here, based on morphological dilation of significant coefficients and context-based arithmetic coding. The algorithm is able to exploit both intra-band and inter-band statistical dependencies among significant wavelet coefficients. Moreover, the same approach is used for both two- and three-dimensional wavelet-based image compression. Finally, the algorithms are tested on some 2D images and on a medical volume, and the RD results are compared with those obtained with state-of-the-art wavelet-based coders.
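The idea of predicting clusters of significant coefficients by morphological dilation can be illustrated with a small sketch. This is not the proposed coder: the 3x3 structuring element, the function names, and the toy coefficient array are all assumptions, and the arithmetic coding and inter-band links are left out. It only shows how dilating the current significance map yields the neighbours a bit-plane coder might test first.

```python
def significance_map(coeffs, threshold):
    """Mark coefficients whose magnitude reaches the bit-plane threshold."""
    return [[abs(c) >= threshold for c in row] for row in coeffs]


def dilate(mask):
    """Binary dilation with a 3x3 (8-connected) structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = True
    return out


def candidates(mask):
    """Positions reached by dilation that are not yet significant."""
    grown = dilate(mask)
    return [(y, x)
            for y in range(len(mask))
            for x in range(len(mask[0]))
            if grown[y][x] and not mask[y][x]]


# Hypothetical 4x4 subband; only the coefficient 40 is significant at 32.
coeffs = [[1, -2, 0, 1],
          [3, 40, -5, 0],
          [0, 6, 2, 1],
          [1, 0, 0, -1]]
mask = significance_map(coeffs, 32)
to_test = candidates(mask)  # the eight neighbours of position (1, 1)
```

Testing these candidates before the rest of the subband is what lets a morphology-based coder spend bits where significant coefficients tend to cluster.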
ISBN: (print) 0819444111
This paper classifies the forms of parallelism in systems based on general-purpose processors into three main categories. The first category is intra-processor parallelism, which includes multimedia instructions and superscalar and VLIW architectures. The former take advantage of data parallelism; the latter benefit from instruction-level parallelism. The second category is inter-processor parallelism. We consider the parallelism between processors inside shared-memory symmetric multiprocessor systems and in distributed-memory clusters of workstations. Finally, in the last category, the main features of system-level parallelism are studied, including input/output operations, the memory hierarchy and the exploitation of external processing. The potential gain is studied for each type of parallelism available in such systems, both from a theoretical point of view and for existing image and video applications. The results in this paper show that exploiting the different levels of parallelism available in PC workstations can lead to considerable gains in speed when optimizing a multimedia application. Finally, the results of this work can be used to influence the design of new multimedia systems and media processors.
ISBN: (print) 0819444111
This paper proposes an algorithm for extracting the boundary of an object. In order to take full advantage of global shape, our approach uses global shape parameters derived from the Point Distribution Model (PDM). Unlike PDM, the proposed method models global shape using curvature as well as edge information. The objective function for applying the shape model is formulated using Bayes' rule. The method extracts the boundary of an object by iteratively evaluating the solution that maximizes the objective function. Experimental results show that the proposed method requires less computation than PDM and is robust to noise, pose variation, and some occlusion.
ISBN: (print) 0819444111
This paper proposes a novel low-complexity region-based video-coding algorithm that automatically identifies moving foreground objects, compresses them with higher quality than the background and efficiently encodes the video in an H.263+ compliant bitstream. Global motion estimation is first performed using the MSE algorithm. The original sequence is then segmented into foreground and background regions by using global and local motion information predicted from the previous frame. This enables the separation of moving objects from a static background, even in the presence of camera motion. A modified TMN8 rate control algorithm is proposed to assign more bits to the foreground region, and the segmented video is then encoded into an H.263+ compliant bitstream. As block-matching motion estimation is used to obtain the local motion field and foreground/background identification is also block-based, the proposed algorithm has lower complexity than previously proposed pixel-based algorithms. Hence it can be easily implemented in software or in ASIC-based real-time applications. It is also particularly useful for mobile applications where bandwidth is highly constrained and low power requirements restrict processing complexity.
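Block-matching motion estimation, which the abstract uses to obtain the local motion field, can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD) between a block of the current frame and displaced blocks of the reference frame. The sketch below is a generic full search, not the paper's implementation: the SAD criterion, the function names, the frame contents and the search range are assumptions chosen for illustration.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (by, bx) and the block of `ref` displaced by (dy, dx)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, by, bx, n, search):
    """Full search over displacements in [-search, search] on both axes."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, by, bx, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would read outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= h
                    and 0 <= bx + dx and bx + dx + n <= w):
                continue
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, best_cost


def make_frame(height, width, top, left, patch):
    """Black frame with `patch` pasted at (top, left); test data only."""
    frame = [[0] * width for _ in range(height)]
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            frame[top + y][left + x] = value
    return frame


patch = [[10, 20], [30, 40]]
ref = make_frame(8, 8, 2, 3, patch)   # object position in the previous frame
cur = make_frame(8, 8, 3, 5, patch)   # object moved down 1, right 2
mv, cost = best_motion_vector(cur, ref, by=3, bx=5, n=2, search=2)
# mv == (-1, -2): the block's content came from one row up, two columns left.
```

One vector per block is far cheaper to compute and compare than a per-pixel motion field, which is the complexity argument the abstract makes for block-based segmentation.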
ISBN: (print) 0780376226
Blind image quality assessment refers to the problem of evaluating the visual quality of an image without any reference. It addresses a fundamental distinction between fidelity and quality, i.e. the human visual system usually does not need any reference to determine the subjective quality of a target image. In this paper, we propose to appraise image quality by three objective measures: edge sharpness level, random noise level and structural noise level. Together they provide a heuristic way of characterizing the most important aspects of visual quality. We investigate various mathematical tools (analytical, statistical and PDE-based) for accurately and robustly estimating those three levels. Extensive experimental results are used to justify the validity of our approach.
ISBN: (print) 0819447145
This paper introduces our recent research work on the development of a scalable foveated visual information coding and communication system, which follows two emerging trends in visual communication research. One is to design rate-scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. The other is to incorporate human visual system models to improve the state of the art of image and video coding techniques by better exploiting the properties of the intended receiver. The central idea of the proposed system is to organize the encoded bitstream to provide the best decoded visual information at an arbitrary bit rate in terms of foveated visual quality measurement. Such a scalable foveated visual information processing system has many potential applications in the field of visual communications. Significant examples include network image browsing, network videoconferencing, robust visual communication over noisy channels, and visual communication over active networks.
ISBN: (print) 0780374029
We propose a method for extracting object symmetries from a digital image. To achieve this we examine the way in which the human visual system processes and organises visual information. Psychological evidence is combined with physiological processing. The evidence is based on image structure and the processing is based on the Hierarchical Cluster Model (HCM) which is used to model the human brain.
ISBN: (print) 0819444111
MPEG-2 and MPEG-4 are the two most popular international video compression standards. They will coexist in different systems and networks for a long time. To provide compatibility between these two video standards, this paper puts forward three transcoding algorithms on the basis of a detailed analysis of computational complexity. The paper emphasizes the principle and implementation of two efficient transcoding algorithms, both of which trade off computational complexity against reconstructed video quality. One is a motion-information reusing algorithm, which meets the highest video quality requirements of studio and post-processing environments. The other is a real-time, low-complexity transcoding algorithm, which suits the low-delay requirements of network servers and user terminals with limited computational performance. Simulation results on many test sequences verify the high performance of the algorithms.
ISBN: (print) 0819444111
Video decoding at reduced resolution with resizing embedded in the decoding loop saves computational resources such as memory, memory bandwidth and CPU cycles. Key to such embedded resizing is proper filtering/scaling of DCT data, and motion compensation at the reduced resolution. Although MPEG-2 video decoding with embedded resizing has been investigated in the past, little work has been reported on solving problems associated with interlaced video undergoing decoding with embedded resizing. In particular, annoying artifacts may occur in moving areas of interlaced video due to improper scaling or motion compensation. In this paper, we introduce the notion of the Local Interlacing Property for interlaced moving areas and propose algorithms to detect and properly process data with the Local Interlacing Property in the context of decoding with embedded resizing. Specifically, we demonstrate that 1) vertical high frequency in interlaced moving areas should be preserved during downscaling, and 2) a phase shift must be added for motion compensation in interlaced moving areas under certain circumstances. Experimental results show that our method effectively removes artifacts in interlaced moving areas, making MPEG-2 video decoding with embedded resizing a practical tradeoff for interlaced video.
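To make "scaling of DCT data" concrete, the sketch below shows one common baseline for DCT-domain downscaling: keep the low-frequency half of the coefficients of a block, rescale them, and apply a shorter inverse DCT. It is a one-dimensional illustration under stated assumptions (orthonormal DCT-II, halving by truncation, hypothetical function names) and is not the method of the paper. Truncation discards exactly the high frequencies that, per the abstract, should be preserved in interlaced moving areas, which is why those areas need separate handling.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above (a DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out


def downscale_by_two(x):
    """Halve the length of x by keeping only its low-frequency DCT half."""
    n = len(x)
    m = n // 2
    low = dct(x)[:m]
    gain = math.sqrt(m / n)  # keeps the mean level unchanged
    return idct([gain * c for c in low])


flat = downscale_by_two([7.0] * 8)               # four samples, each about 7.0
ramp = downscale_by_two([0, 1, 2, 3, 4, 5, 6, 7])  # four samples, mean 3.5
```

In a decoder with embedded resizing, the same idea would be applied to the 8x8 blocks in both directions before motion compensation, so the reference frames are stored at the reduced resolution.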