This paper introduces a spatio-temporal database model that we are developing to query interesting events in video sequences. The database pushes the state of the art in a number of fields, and many issues still await a satisfactory solution. We present our (still partial) answers to some of these problems and the future directions of our work. Our design is divided into two layers: a Logbook, which operates as a short-term repository of unsummarized and unprocessed data, and a long-term spatio-temporal database, which stores and queries summarized data.
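The two-layer split can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Logbook` class, the `(time, object_id, x, y)` record layout, the `horizon` parameter, and the choice of a time span plus bounding box as the summary. Observations are assumed to arrive in time order.

```python
from collections import deque


class Logbook:
    """Hypothetical short-term store of raw (time, object_id, x, y) observations."""

    def __init__(self, horizon):
        self.horizon = horizon  # how long raw observations are retained
        self.raw = deque()      # assumed to be appended in time order

    def append(self, t, obj, x, y):
        self.raw.append((t, obj, x, y))

    def expire(self, now):
        """Remove and return the observations older than the horizon."""
        old = []
        while self.raw and now - self.raw[0][0] > self.horizon:
            old.append(self.raw.popleft())
        return old


def summarize(observations):
    """Collapse raw observations into one record per object:
    its time span and spatial bounding box (an illustrative summary only)."""
    summary = {}
    for t, obj, x, y in observations:
        s = summary.setdefault(
            obj, {"t0": t, "t1": t, "xmin": x, "xmax": x, "ymin": y, "ymax": y}
        )
        s["t0"], s["t1"] = min(s["t0"], t), max(s["t1"], t)
        s["xmin"], s["xmax"] = min(s["xmin"], x), max(s["xmax"], x)
        s["ymin"], s["ymax"] = min(s["ymin"], y), max(s["ymax"], y)
    return summary


log = Logbook(horizon=10)
log.append(0, "car", 1, 1)
log.append(5, "car", 3, 2)
log.append(20, "car", 9, 9)
long_term = summarize(log.expire(now=20))  # records handed to the long-term layer
```

In this sketch only the summarized records would reach the long-term layer, while the most recent raw observation stays in the Logbook.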
ISBN: 0819424331 (print)
We propose a new, simple image coder based on the Discrete Wavelet Transform (DWT). The DWT coefficients are coded in bitplanes, using a variable-order Markovian model. An earlier version of this method used 65 contexts (7). In this paper, the number of contexts is reduced to 34. We present experimental results, in terms of both distortion measurement and visual comparison, and compare them with well-known methods.
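As a rough illustration of what coding coefficients "in bitplanes" means, the sketch below applies one level of an integer Haar transform to a 1-D signal and splits the resulting coefficients into a sign plane and magnitude bitplanes, most significant first. The choice of the Haar filter, the 1-D setting, and all function names are assumptions for illustration; the abstract does not specify the wavelet, and the variable-order Markovian context model that would drive the entropy coder is not shown.

```python
def haar_1d(signal):
    """One level of the integer Haar (S) transform: (averages, differences)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    avgs, diffs = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        d = a - b
        avgs.append(b + (d >> 1))  # equals floor((a + b) / 2)
        diffs.append(d)
    return avgs, diffs


def inverse_haar_1d(avgs, diffs):
    """Exact inverse of haar_1d."""
    out = []
    for s, d in zip(avgs, diffs):
        b = s - (d >> 1)
        out.extend([b + d, b])
    return out


def to_bitplanes(coeffs):
    """Split integer coefficients into a sign plane and magnitude bitplanes (MSB first)."""
    mags = [abs(c) for c in coeffs]
    signs = [1 if c < 0 else 0 for c in coeffs]
    nplanes = max(max(mags, default=0).bit_length(), 1)
    planes = [[(m >> p) & 1 for m in mags] for p in range(nplanes - 1, -1, -1)]
    return signs, planes


def from_bitplanes(signs, planes):
    """Rebuild the coefficients from the sign plane and the magnitude bitplanes."""
    mags = [0] * len(signs)
    for plane in planes:
        mags = [(m << 1) | bit for m, bit in zip(mags, plane)]
    return [-m if s else m for m, s in zip(mags, signs)]


avgs, diffs = haar_1d([5, 2, 2, 5])
signs, planes = to_bitplanes(avgs + diffs)
```

A context-based coder would then code each bit of each plane with a probability conditioned on already-coded neighbouring bits; reducing the number of such contexts (from 65 to 34 in the abstract) shrinks the model that has to be learned.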
ISBN: 0819420441 (print)
To store, manage, and present video data streams flexibly and efficiently, continuous video data must be segmented into video objects and stored in a database. This paper investigates systematic strategies for supporting continuous and synchronized presentation of video data streams in multimedia database systems. Compressed video data streams are segmented and stored as sets of video objects coupled with specified synchronization requirements. We present strategies for efficiently scheduling and buffering video objects that guarantee hiccup-free presentation of video streams; these strategies take delay effects into account. We propose to extend existing object-oriented database system (OODBS) techniques to include the proposed video presentation mechanisms. We are currently designing and implementing a multimedia presentation tool (termed MediaShow) on top of O2, a well-known OODBS, as a basis for our implementation. However, the design strategies can be used in any OODBS environment that supports a C++ interface.
ISBN: 0819439932 (print)
With the advance of multimedia technologies and the explosive expansion of the World Wide Web, the volume of image and video data is increasing rapidly, and an efficient and effective multimedia data retrieval technique is needed. In this paper, we propose a feature-point-based approach to content-based image retrieval. The feature points extracted from the multiresolution representations of the query image and the database image are first matched to determine the matching pairs. The matching pairs are then classified into groups. Finally, two similarity measurements, based on different similarity requirements, are proposed to compute the degree of similarity. We perform a series of experiments to study the characteristics of this approach and compare it with the region-based approach on similar-shot sequence retrieval. The comparison shows the superiority of this approach.
ISBN: 0819439932 (print)
We present a generic model that describes image and video content by a combination of semantic entities and low-level features for semantically meaningful and fast retrieval. The proposed model includes semantic entities such as Object, Event, and Actors, the last of which expresses relations between the first two. The use of the Actors entity increases the efficiency of certain types of search, while the use of semantic and linguistic roles increases the expressive capability of the model. The model also contains links to high-level media segments such as actions and interactions, and low-level media segments such as elementary motion and reaction units, as well as low-level features such as motion parameters and trajectories. Based on this model, we propose image and video retrieval that combines semantic and low-level information. The retrieval performance of our system is tested using query-by-annotation, query-by-example, query-by-sketch, and combinations of them.
ISBN: 0819424331 (print)
Video content characterization is a challenging problem in video databases. The aim of such characterization is to generate indices that describe a video clip in terms of the objects in the clip and their actions. Generally, such indices are extracted by performing image analysis on video clips. Many such indices can also be generated by analyzing the embedded audio information of video clips. Indices pertaining to context, scene emotion, and the actors or characters present in a video clip appear especially suitable for generation via the audio analysis techniques of keyword spotting and speech and speaker recognition. In this paper, we examine the potential of speaker identification techniques for characterizing video clips in terms of the actors present in them. We describe a three-stage processing system, consisting of a shot boundary detection stage, an audio classification stage, and a speaker identification stage, that determines the presence of different actors in isolated shots. Experimental results using the movie A Few Good Men are presented to show the efficacy of speaker identification for labeling video clips in terms of the persons present in them.
ISBN: 0819439932 (print)
In this paper we describe a framework for analyzing programs belonging to different TV program genres using Hidden Markov Models and pseudo-semantic features derived from video shots. Clustering with Gaussian mixture models is used to determine the order of the models. We give results for initial genre classification experiments using two simple features derived from video shots.
ISBN: 0819429880 (print)
This paper investigates clustering techniques as a method of organizing image databases to support popular visual management functions such as searching, browsing, and navigation. Different types of hierarchical agglomerative clustering techniques are studied as a method of organizing feature spaces and of summarizing image groups by selecting a few appropriate representatives. We evaluate retrieval performance using both single-level and multiple-level hierarchies, and the algorithms show an interesting relationship between the top k correct retrievals and the number of comparisons required. Some arguments are given to support the use of such cluster-based techniques for managing distributed image databases.
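The trade-off between retrieval quality and number of comparisons can be seen in a small sketch of a single-level hierarchy. The specifics are assumptions chosen for illustration, not the paper's configuration: average linkage, Euclidean distance on toy 2-D feature vectors, medoids as cluster representatives, and the function names `agglomerative`, `representative`, and `search`.

```python
from math import dist


def agglomerative(points, k):
    """Merge the two closest clusters (average linkage) until k clusters remain.
    Clusters are lists of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def representative(points, cluster):
    """Medoid: the member with the smallest total distance to the other members."""
    return min(cluster, key=lambda i: sum(dist(points[i], points[j]) for j in cluster))


def search(points, clusters, query):
    """Compare the query with the representatives first, then only with the
    members of the best cluster. Returns (best index, comparisons made)."""
    reps = [representative(points, c) for c in clusters]
    best_c = min(range(len(clusters)), key=lambda c: dist(points[reps[c]], query))
    members = clusters[best_c]
    best = min(members, key=lambda i: dist(points[i], query))
    return best, len(reps) + len(members)


features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = agglomerative(features, k=2)
match, comparisons = search(features, clusters, query=(9, 9))
```

On this toy set the cluster-based search makes 5 comparisons instead of the 6 an exhaustive scan would need; the saving grows with database size, at the risk of missing a true nearest neighbour that lies in a different cluster.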
Authors: Luo, M; Bai, XS; Xu, GY
Tsinghua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
ISBN: 0819444162 (print)
An explosion of on-line image and video data in digital form is already well underway. With the exponential rise in interactive information exploration and dissemination through the World Wide Web, the major inhibitors of rapid access to on-line video data are the management of capture and storage, and content-based intelligent search and indexing techniques. This paper proposes an approach for content-based analysis and event-based indexing of sports video. It includes a novel method of organizing shots by classifying them as close shots or far shots, an original idea of event detection based on blur extent, and an innovative local mutation-based algorithm for caption detection and retrieval. Results on an extensive set of real TV programs demonstrate the applicability of our approach.
ISBN: 081941767X (print)
Similarity between images is used for storage and retrieval in image databases. In the literature, several similarity measures have been proposed that may be broadly categorized as (1) metric-based, (2) set-theoretic, and (3) decision-theoretic measures. In each category, measures based on crisp logic as well as fuzzy logic are available. In some applications, such as image databases, measures based on fuzzy logic would appear to be naturally better suited, although so far no comprehensive experimental study has been undertaken. In this paper, we report the results of some experiments designed to compare various similarity measures for application to image databases. We are currently working with texture images and intend to work with face images in the near future. As a first step in the comparison, the similarity matrices for each of the similarity measures are computed over a set of selected textures and presented as visual images. Comparative analysis of these images reveals the relative characteristics of each measure. Further experiments are needed to study the sensitivity of the measures to small changes in images, such as changes in illumination, magnification, and orientation. We describe these experiments (sensitivity analysis, transition analysis, etc.), which are currently in progress. The results of these experiments offer assistance in choosing the appropriate measure for image database applications.
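To make the categories concrete, the sketch below computes similarity matrices for one metric-based measure and for crisp and fuzzy versions of one set-theoretic measure. The particular formulas are common textbook choices picked for illustration and are not necessarily those evaluated in the paper; the feature vectors are made-up membership values in [0, 1], and the 0.5 threshold in the crisp version is an assumption.

```python
from math import dist


def metric_similarity(x, y):
    """Metric-based: map Euclidean distance into (0, 1], with 1 meaning identical."""
    return 1.0 / (1.0 + dist(x, y))


def fuzzy_jaccard(x, y):
    """Set-theoretic, fuzzy logic: |x AND y| / |x OR y| with min as AND, max as OR."""
    union = sum(max(a, b) for a, b in zip(x, y))
    if union == 0:
        return 1.0  # two empty fuzzy sets are treated as identical
    return sum(min(a, b) for a, b in zip(x, y)) / union


def crisp_jaccard(x, y, threshold=0.5):
    """Set-theoretic, crisp logic: threshold the memberships to {0, 1} first."""
    cx = [1 if a >= threshold else 0 for a in x]
    cy = [1 if b >= threshold else 0 for b in y]
    return fuzzy_jaccard(cx, cy)


def similarity_matrix(features, measure):
    """Pairwise similarities; entry [i][j] compares features[i] with features[j]."""
    return [[measure(x, y) for y in features] for x in features]


# Hypothetical texture descriptors, one membership value per feature.
textures = [[0.5, 1.0], [1.0, 0.5], [0.0, 0.25]]
fuzzy = similarity_matrix(textures, fuzzy_jaccard)
crisp = similarity_matrix(textures, crisp_jaccard)
metric = similarity_matrix(textures, metric_similarity)
```

Rendering each matrix as a grayscale image, as the abstract describes, makes differences between measures visible at a glance: here the crisp measure judges the first two textures identical after thresholding, while the fuzzy measure still separates them.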