ISBN: 0819431273 (print)
In this paper we present a new, computationally efficient and effective technique for detecting abrupt scene changes in MPEG-4/2 compressed video sequences. We combine the DC image based approach of Yeo and Liu (1) with the bit allocation change based approach of Feng, Lo and Mehrpour (2). The bit allocation based approach has the advantage of computational simplicity, since it requires only entropy decoding of the sequence. Since extraction of DC images from I-frames/objects is simple, the DC image based technique of Yeo and Liu is a good alternative for comparing I-frames/objects. For P-frames/objects, however, their algorithm requires additional computation. We find that the bit allocation change based approach is prone to false detections when comparing intra-coded objects in MPEG-4 sequences. However, if a suspected scene/object change has been accurately located within a group of consecutive frames/objects, the bit allocation based technique quickly and accurately locates the cut point within that group. This motivates us to use DC image based detection between successive I-frames/objects to identify the sub-sequences containing scene/object changes, and then to use bit allocation based detection to find the cut point within each sub-sequence. Our technique thus has only marginally greater complexity than the purely bit allocation based technique, but greater accuracy. It is applicable to both MPEG-2 sequences and MPEG-4 multiple-object sequences. In the MPEG-4 multiple-object case, we use a weighted sum of the change in each object of the frame, with the area of the object as the weight.
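The two-stage idea and the area-weighted combination described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so everything concrete here is an assumption made for illustration: the function names `area_weighted_change` and `locate_cuts`, the normalization by total area, the fixed distance `gop_size` between I-frames, and the single `threshold` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def area_weighted_change(changes: Sequence[float], areas: Sequence[float]) -> float:
    """Combine per-object change measures into one frame-level score.

    Each object's change is weighted by its area. Dividing by the total
    area is an assumption; the abstract only says "weighted sum".
    """
    if len(changes) != len(areas):
        raise ValueError("changes and areas must have the same length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total object area must be positive")
    return sum(c * a for c, a in zip(changes, areas)) / total_area


def locate_cuts(
    iframe_diffs: Sequence[float],
    frame_bit_changes: Sequence[float],
    gop_size: int,
    threshold: float,
) -> List[int]:
    """Two-stage cut detection (hypothetical interface).

    iframe_diffs[k] is a DC-image difference between I-frame k and I-frame
    k + 1, assumed to sit at frame indices k * gop_size and (k + 1) * gop_size.
    frame_bit_changes[i] is a bit-allocation change measure for frame i.
    """
    cuts = []
    last = len(frame_bit_changes) - 1
    for k, diff in enumerate(iframe_diffs):
        if diff <= threshold:
            continue  # coarse stage: no change suspected in this sub-sequence
        start = k * gop_size + 1
        end = min((k + 1) * gop_size, last)
        if start > end:
            continue  # no frames available to search
        # fine stage: pick the frame with the largest bit-allocation change
        cuts.append(max(range(start, end + 1), key=lambda i: frame_bit_changes[i]))
    return cuts
```

The coarse stage touches only I-frames, so the more expensive per-frame search runs only inside sub-sequences that are already suspected to contain a change, which is consistent with the abstract's claim of marginal extra complexity.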
The MPEG-2 video standards are targeted for high-quality video broadcast and distribution, and are optimized for efficient storage and transmission. However, it is difficult to process MPEG-2 for video browsing and database applications without first decompressing the video. Yeo and Liu [1] have proposed fast algorithms for the direct extraction of spatially reduced images from MPEG-1 video. Reduced images have been demonstrated to be effective for shot detection, shot browsing and editing, and temporal processing of video for video presentation and content annotation. In this paper, we develop new tools to handle the extra complexity in MPEG-2 video for extracting spatially reduced images. In particular, we propose new classes of Discrete Cosine Transform (DCT) domain and DCT inverse motion compensation operations for handling the interlaced modes in the different frame types of MPEG-2, and design new and efficient algorithms for generating spatially reduced images of an MPEG-2 video. We also describe key video applications on the extracted reduced images.
The Scene Transition Graph (STG) [1] is a directed graph structure that compactly captures both the image content and the temporal flow of video. An STG offers a condensed view of the story content, serves as a summary of the clip it represents, and allows nonlinear access to its story elements. It can serve as a valuable tool both for analyzing video structure and for presenting high-level visual summaries in video browsing applications. In this paper, we study new techniques for classification and simplification of the STG, and present better means of visualizing the graph through dynamic visual display and simplified structures. In other words, our techniques significantly improve the existing graph structure to enable a more succinct presentation of the graphs, which leads to more efficient use of screen space. In addition, we describe a technique that captures and visually presents the temporal dynamics of the video sequence. We have tested the graph visualization techniques on various types of programming, and the new tools are found to handle a wider variety of video effectively than the existing STG structure.
ISBN: 0819429880 (print)
The rapidly growing interest in building multimedia applications has created a need to apply database technology to multimedia systems, in order to support efficient access, query, and retrieval of complex multimedia information. In this paper, an Object Relational Multimedia Data Model (ORMD) is proposed for modeling multimedia objects. The ORMD model supports not only various relationships but also hyperlinks among the multimedia objects, which facilitates querying and navigating them. Based on this data model, an infrastructure design for a multimedia database system is presented. Multimedia extensions of the data types and the query language have been designed to facilitate various queries on multimedia objects. We are currently implementing a prototype, called the Multimedia Information Retrieval (MIR) System, to support efficient querying and navigating of multimedia objects. The development of the MIR System shows its potential for extension to diverse multimedia applications, such as digital libraries and Internet video servers.
This paper describes a new fade and dissolve detection methodology that uses the wavelet transform. The approach takes advantage of the production aspects of video and also mimics human perception. Each frame of the video is first decomposed into a low-resolution component and a high-resolution component using the wavelet transform. Possible gradual changes are first detected with the edge spectrum average (ESA) feature, which is obtained from the high-resolution component. Meanwhile, the changing statistics of the ESA are studied to identify fades. The double chromatic difference is then applied to the low-resolution component to identify dissolve transitions.
Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Many visual feature representations have been explored and many systems built. While these research efforts establish the basis of CBIR, the usefulness of the proposed approaches is limited. Specifically, these efforts have largely ignored two distinct characteristics of CBIR systems: (1) the gap between high-level concepts and low-level features; (2) the subjectivity of human perception of visual content. This paper proposes an interactive retrieval approach based on relevance feedback, which effectively takes both characteristics into account. During the retrieval process, the user's high-level query and perception subjectivity are captured by weights that are dynamically updated based on the user's relevance feedback. The experimental results show that the proposed approach greatly reduces the user's effort in composing a query and captures the user's information need more precisely.
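As a rough illustration of what "dynamically updated weights" can mean, the Python sketch below re-weights feature dimensions from the images a user marks as relevant. The abstract does not state the update rule, so the inverse-standard-deviation heuristic used here is an assumption, as are the names `update_weights` and `weighted_distance` and the smoothing constant `eps`.

```python
from statistics import pstdev
from typing import List, Sequence


def update_weights(relevant_features: Sequence[Sequence[float]], eps: float = 1e-6) -> List[float]:
    """Derive per-dimension weights from the images marked relevant.

    Assumed heuristic (not from the abstract): a dimension on which the
    relevant images agree (low spread) is treated as important, so its
    weight is the inverse of its standard deviation. Weights sum to 1.
    """
    if not relevant_features:
        raise ValueError("at least one relevant example is required")
    dims = len(relevant_features[0])
    raw = []
    for d in range(dims):
        values = [features[d] for features in relevant_features]
        raw.append(1.0 / (pstdev(values) + eps))  # eps avoids division by zero
    total = sum(raw)
    return [w / total for w in raw]


def weighted_distance(query: Sequence[float], candidate: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted L1 distance between a query and a candidate feature vector."""
    return sum(w * abs(q - c) for w, q, c in zip(weights, query, candidate))
```

In an interactive loop under these assumptions, `update_weights` would be called after each round of feedback and the database re-ranked with `weighted_distance`, so the user refines results by marking images instead of hand-tuning a query.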
This paper studies the file caching issue in video-on-demand (VOD) servers. Because the characteristics of video files are very different from those of conventional files, different types of caching algorithms must be developed. For VOD servers, the goal is to optimize resource allocation and the tradeoff between memory and disk bandwidth. This paper first proves that resource allocation and the tradeoff between memory and disk bandwidth is an NP-complete problem. Then, a heuristic algorithm, called the generalized relay mechanism, is introduced, and a simulation-based optimization procedure is conducted to evaluate the effects of applying it.
In this paper, we propose a new content-based indexing algorithm that uses pixel-wise entropy and extracts features such as color and entropy from an image as indices. We propose a technique that supports both global and regional searching. The global searching scheme uses entropy features at multiple levels and multiple resolutions. As the resolution of the image is reduced, additional information about the image is revealed. As the number of gray levels of the image is reduced, we see how large the gray-level differences are between neighboring pixels. Regional searching uses color features extracted from regions separated by entropy measures. Our algorithm provides not only automated extraction of entropy-based regions but also a representation of their color contents. Thus, we can classify images using entropy and multi-resolution, multi-level features. Various experiments show that the proposed algorithm is promising.
We propose a new image feature called the color correlogram as a generic color-spatial indexing tool to tackle various problems that arise in content-based image retrieval and video browsing. Informally speaking, a correlogram represents the spatial correlation of colors in an image. While the computing and storage costs of correlograms match those of histograms, the presence of spatial information makes the correlogram more tolerant of large changes in image appearance than the histogram. This makes the correlogram very attractive for applications such as content-based image retrieval and cut detection. To validate this, we first show that the correlogram, used as an image feature, is scalable for image retrieval on very large image databases. Our experimental results on a database of over 200,000 images suggest that the color correlogram is much more effective than the color histogram (and its variants) with the same amount of information for these applications. We also propose a new distance metric, called the relative distance metric, for comparing image feature vectors. It outperforms other distance functions in most cases and improves the performance of color histograms and histogram-based features. To further enhance the quality of retrieval, we then present two supervised learning methods (learning the query and learning the metric) and combine them with color correlograms. Our experiments show that these learning methods are quite effective even with little effort from users. We also adapt the correlogram to handle the problems of image subregion querying, object localization and tracking. We propose correlogram intersection for object detection and correlogram correction for object localization. These simple methods perform better than methods based on color histograms. Finally, we propose a method for hierarchical classification of images via supervised learning. This scheme uses the correlogram as the low-level feature and performs feature-space reconfig
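To make "spatial correlation of colors" concrete, here is a minimal, unoptimized Python sketch of a color autocorrelogram, the same-color restriction of the correlogram. The abstract gives no formula, so the definition used here is an assumption: for each color c and distance k, the fraction of in-bounds pixels at L-infinity distance exactly k from a pixel of color c that also have color c. The function name `autocorrelogram` and the input format (a 2D list of already quantized color indices) are hypothetical.

```python
from typing import Dict, List, Sequence


def autocorrelogram(
    image: Sequence[Sequence[int]],
    num_colors: int,
    distances: Sequence[int],
) -> Dict[int, List[float]]:
    """Return {k: [p_0, ..., p_{num_colors-1}]} for each distance k.

    p_c estimates the probability that a pixel at L-infinity distance k
    from a pixel of color c also has color c. Brute force, for clarity.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = {}
    for k in distances:
        same = [0] * num_colors   # neighbor pairs sharing color c
        total = [0] * num_colors  # all in-bounds neighbor pairs for color c
        for y in range(height):
            for x in range(width):
                c = image[y][x]
                for dy in range(-k, k + 1):
                    for dx in range(-k, k + 1):
                        if max(abs(dx), abs(dy)) != k:
                            continue  # keep only the ring at distance exactly k
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            total[c] += 1
                            if image[ny][nx] == c:
                                same[c] += 1
        result[k] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return result
```

Two images with identical color histograms, such as one large red patch versus scattered red pixels, produce different values under this definition, which is the kind of spatial information a histogram alone cannot capture.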
Similarity retrieval of images based on texture and color features has generated a lot of interest recently. Most of these similarity retrieval methods compute the Euclidean distance between the target feature vector and the feature vectors in the database. Euclidean distance, however, does not necessarily reflect the relative similarity required by the user. In this paper, a method based on nonlinear multidimensional scaling is proposed to provide a mechanism for the user to dynamically adjust the similarity measure. The results show a significant improvement in the precision versus recall curve.