ISBN: 0819424331 (print)
This paper describes the development of a prototype video database system, called VLdIO, that takes account of the importance of different perspectives in video retrieval. Text-based hierarchical structures are used to represent the contents of a video. These structures support the functionality required for organizing personalized video materials. In addition to indexing original video materials, the system provides tools for re-indexing and maintaining the results of video retrieval. In other words, it tries to fulfill the requirement of personalized video information management. The paper defines the requirement, outlines the key considerations in providing such support, and describes the implemented system.
With the abstraction of digital video as the corresponding binary video (a process which, in numerous subjective experiments, seems to preserve most of the intelligibility of video content), we can pursue a precise and analytic approach to the design of digital video storage and retrieval algorithms based upon geometrical (morphological) intuition. The foremost and most tangible general benefit of such abstraction, however, is the immediate reduction of both the data and the computational complexity involved in implementing various algorithms and databases. The general paradigm presented may be utilized to address all issues pertaining to video library construction, including visualization, optimum feedback query generation, object recognition, etc., but the primary focus of this paper is on the detection of fast scene changes (including in the presence of flashlights) and gradual scene changes (such as dissolves, fades, and special effects such as wipes). In simulation we observed that we can achieve performance comparable to that of others with drastic reductions in both storage and computational complexity. Furthermore, since the conversion from grayscale to binary video can be performed directly (with minimal additional computation) in the compressed domain by thresholding the DCT DC coefficients themselves (or by using the contour information attached to MPEG4 formats), the algorithms presented herein are ideally suited for fast (on-the-fly) determination of scene changes, object recognition and/or tracking, and other more intelligent tasks that traditionally place heavy demands on computation and/or storage. These fast determinations may then be used on their own merits or in conjunction with other higher-layer information in the future.
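The idea of thresholding frames to binary and looking for abrupt changes can be illustrated with a minimal sketch. It is not the abstract's morphological algorithm: it assumes uncompressed grayscale frames given as nested lists, a fixed threshold of 128, and a simple "fraction of flipped pixels" test, all of which are illustrative choices. The function names and the `cut_level` parameter are hypothetical. Gradual transitions and flashlight handling are not covered.

```python
def binarize(frame, threshold=128):
    """Map a grayscale frame (list of rows of 0-255 ints) to 0/1 pixels."""
    return [[1 if p >= threshold else 0 for p in row] for row in frame]


def changed_fraction(a, b):
    """Fraction of pixel positions whose binary value differs between a and b."""
    total = sum(len(row) for row in a)
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def detect_cuts(frames, threshold=128, cut_level=0.5):
    """Return indices i where binary frame i differs sharply from frame i-1.

    cut_level is an assumed tuning parameter, not a value from the paper.
    """
    binary = [binarize(f, threshold) for f in frames]
    return [
        i
        for i in range(1, len(binary))
        if changed_fraction(binary[i - 1], binary[i]) > cut_level
    ]


# Toy sequence: two dark frames followed by two bright frames.
dark = [[10, 20], [30, 40]]
bright = [[200, 210], [220, 230]]
frames = [dark, dark, bright, bright]
cuts = detect_cuts(frames)
print(cuts)
```

In the compressed-domain setting the abstract mentions, the per-pixel values would be replaced by per-block DC coefficients, which is where the reduction in data volume would come from.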
ISBN: 0819427527 (print)
In this paper we describe a novel interactive image viewer incorporating a range of image processing techniques that allows inexperienced users to quickly and easily delineate objects or shapes in a wide range of real-world images. The viewer is specifically designed to be easily extensible, and this extensibility is demonstrated by the implementation of an iterative, user-guided segmentation tool. Using this tool, objects can be efficiently extracted from images and used as the basis for navigation and retrieval within MAVIS, the Multimedia Architecture for Video, Image, and Sound.
ISBN: 0819424331 (print)
The large amount of available multimedia information (e.g. videos, audio, images) requires efficient and effective annotation and retrieval methods. As videos start to play a more important role in multimedia, we want to make them available for content-based retrieval. The imageMiner-System, which was developed at the University of Bremen in the AI group, is designed for content-based retrieval of single images by a new combination of techniques and methods from computer vision and artificial intelligence. Our approach to making videos available for retrieval in a large database of videos and images has two necessary steps: first, the detection and extraction of shots from a video, which is done by a histogram-based method, and second, the combination of the separate frames of a shot into a single still image, which is performed by a mosaicing technique. The resulting mosaiced image gives a one-image visualization of the shot and can be analyzed by the imageMiner-System. imageMiner has been tested on several domains (e.g. landscape images, technical drawings), which cover a wide range of applications.
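The first step, histogram-based shot detection, can be sketched as follows. The abstract does not specify the histogram or the decision rule, so this sketch assumes grayscale frames given as flat lists, a 4-bin histogram, an L1 distance, and a fixed cut threshold. The names `gray_histogram`, `split_into_shots`, and `cut_level` are hypothetical, and the input is assumed to contain at least one frame. The mosaicing step is not shown.

```python
def gray_histogram(frame, bins=4):
    """Normalized histogram of a frame given as a flat list of 0-255 ints."""
    counts = [0] * bins
    for p in frame:
        counts[min(p * bins // 256, bins - 1)] += 1
    n = len(frame)
    return [c / n for c in counts]


def l1_distance(h1, h2):
    """Sum of absolute bin-wise differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def split_into_shots(frames, bins=4, cut_level=0.5):
    """Group frame indices into shots.

    A new shot starts whenever the histogram distance between consecutive
    frames exceeds cut_level (an assumed tuning parameter).
    """
    shots = [[0]]
    prev = gray_histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = gray_histogram(frames[i], bins)
        if l1_distance(prev, cur) > cut_level:
            shots.append([])
        shots[-1].append(i)
        prev = cur
    return shots


# Toy sequence: two dark frames, then two bright frames.
frames = [
    [10, 20, 30, 40],
    [12, 22, 28, 41],
    [200, 210, 220, 230],
    [205, 215, 225, 235],
]
shots = split_into_shots(frames)
print(shots)
```

Each resulting group of frame indices is what a mosaicing stage would then merge into one still image.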
The color histogram of an image has been widely used as a feature descriptor for the image in content-based retrieval applications. In this paper, some results from our investigation into its usage are reported. We outline three typical color space quantization schemes used in our experiments and introduce the soft-decision histogramming method to eliminate the discontinuity problem in the traditional color histogram population process. Then, to improve the effectiveness of color-histogram-based retrieval algorithms, several similarity metrics are proposed for comparing color histograms, including three special forms of the Kantorovich metric.
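A small sketch may help show why soft binning removes the discontinuity and what a Kantorovich-type comparison looks like. The abstract does not give the authors' exact definitions, so both parts are assumptions: soft assignment is done here by linear interpolation between the two nearest bin centers on a single channel, and the metric is the one-dimensional Kantorovich (Wasserstein-1) distance with adjacent bins one unit apart, which for equal-mass histograms equals the L1 distance between cumulative sums. The function names are hypothetical.

```python
def soft_histogram(values, bins=4, lo=0.0, hi=256.0):
    """Soft-binned, normalized histogram of scalar values in [lo, hi).

    Each value splits its unit weight linearly between the two nearest
    bin centers instead of falling entirely into one bin.
    """
    width = (hi - lo) / bins
    hist = [0.0] * bins
    for v in values:
        # Position measured in bin widths from the center of bin 0,
        # clamped so values beyond the outer centers go to the end bins.
        pos = min(max((v - lo) / width - 0.5, 0.0), bins - 1.0)
        left = int(pos)
        frac = pos - left
        hist[left] += 1.0 - frac
        if frac > 0.0:
            hist[left + 1] += frac
    n = len(values)
    return [h / n for h in hist]


def kantorovich_1d(h1, h2):
    """1-D Kantorovich distance between equal-mass histograms.

    Computed as the sum of absolute differences of the cumulative sums.
    """
    total, c1, c2 = 0.0, 0.0, 0.0
    for a, b in zip(h1, h2):
        c1 += a
        c2 += b
        total += abs(c1 - c2)
    return total


# Values 63 and 65 straddle a hard bin boundary at 64, yet their soft
# histograms are nearly identical.
near = kantorovich_1d(soft_histogram([63]), soft_histogram([65]))
far = kantorovich_1d([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0])
print(near, far)
```

With hard binning the two values 63 and 65 would land in different bins; with soft assignment the distance between their histograms stays small, while moving all mass across three bins costs three units.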
We present a fast algorithm for computing the singular value decomposition (SVD) of a matrix consisting of the frames from a video sequence. The computational efficiency of this algorithm derives from the observation that portions of a video sequence will consist of sets of correlated frames. We then show that the information obtained from the SVD can be used to analyze video sequences to obtain information such as scene breaks, scene query, reduced-order shot representation and key frame determination. We illustrate this approach on several video sequences.
ISBN: 0819424331 (print)
The technique of symbolic projection has been widely studied in the area of image database systems as a first step towards content-based indexing and retrieval of images. In this paper we have extended the idea of symbolic projections to video and audio data as well as to multimedia documents containing combinations of these data types. Formal definitions of symbolic video sequence, symbolic audio sequence and symbolic multimedia documents are given as are definitions of their symbolic projections. An indexing methodology based on these symbolic projections is presented. Operators which allow multimedia documents to be constructed from the basic multimedia data types are also presented. The main contribution of this paper is to provide a basis for the development of content-based retrieval of multimedia documents via extended symbolic projections.
ISBN: 0819414808 (print)
Most color indexing techniques proposed in the literature are similar: images are represented by color histograms, and a metric on the color histogram space is used to determine the similarity of images. In this paper we determine the limits of these color indexing techniques. We propose two functions to measure the discrimination power of indexing techniques: the capacity (how many distinguishable histograms can be stored) and the maximal match number (the maximal number of retrieved images). We derive bounds for these functions. These bounds have two practical aspects. First, they help a user to decide whether color histograms effectively index database images from a given domain. Second, they facilitate the choice of a good threshold for the distance below which histograms are considered similar. Our arguments are based on an analysis of the metrical properties of the histogram space and results from coding theory. The results show that over a large range of reasonable parameters the capacity is very large. Thus, the set of parameters for which color indexing works well can be described as the set of parameters for which the maximal match number is below an application-dependent maximum.
Organizing video shots into hierarchical structures is very important for efficient browsing and retrieval in large video databases, and many shot organizing methods have been proposed. Most algorithms are based on automatic clustering schemes, which usually fail to give satisfactory results in real applications. In this paper, we propose a preprocessing technique for interactive shot organizing: the similarity sequence. It differs from traditional shot organizing methods in that it does not classify shots. Instead, it only reorders the shot sequence so that similar shots appear near each other, which provides an effective interface for interactive shot organizing and leaves the classification work to the user. A measure called similarity length is introduced to evaluate the similarity between adjacent shots in a shot sequence, and an improved genetic algorithm is developed to calculate the similarity sequence. The basic ideas and implementation details are provided, along with experimental results on real videos and their analysis.
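The reordering idea can be sketched in a few lines. Two substitutions are made here and should be read as assumptions: the abstract does not define similarity length, so the sketch uses the sum of feature distances between adjacent shots as a stand-in, and it replaces the paper's improved genetic algorithm with a plain greedy nearest-neighbour ordering. Each shot is represented by one hypothetical scalar feature, and at least one shot is assumed.

```python
def sequence_length(order, features):
    """Sum of feature distances between adjacent shots in the given order.

    Assumed stand-in for the paper's similarity length; lower means that
    neighbouring shots are more alike.
    """
    return sum(abs(features[a] - features[b]) for a, b in zip(order, order[1:]))


def greedy_similarity_sequence(features):
    """Reorder shot indices so similar shots sit next to each other.

    Starts at shot 0 and repeatedly appends the closest unvisited shot.
    This greedy heuristic is NOT the genetic algorithm from the paper.
    """
    remaining = set(range(1, len(features)))
    order = [0]
    while remaining:
        last = order[-1]
        # Ties are broken by the lower shot index for determinism.
        nxt = min(remaining, key=lambda j: (abs(features[last] - features[j]), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# One hypothetical scalar feature per shot (e.g. mean brightness).
features = [0.1, 0.9, 0.15, 0.85, 0.5]
original = list(range(len(features)))
order = greedy_similarity_sequence(features)
print(order, sequence_length(original, features), sequence_length(order, features))
```

No shot is classified at any point: the output is only a permutation, and grouping the reordered shots is left to the user, as the abstract describes.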
ISBN: 081941767X (print)
This paper describes the videoSTAR experimental database system that is being designed to support video applications in sharing and reusing video data and meta-data. videoSTAR provides four different repositories: for media files, virtual documents, video structures, and video annotations/user indexes. It also provides a generic video data model relating data in the different repositories to each other, and it offers a powerful application interface. videoSTAR concepts have been evaluated by developing a number of experimental video tools, such as a video player, a video annotator, a video authoring tool, a video structure and contents browser, and a video query tool.