A new system, MUVIS, is introduced for content-based indexing and retrieval in image database management systems. In addition to traditional keyword indexing, MUVIS allows objects and images to be indexed by color, texture, shape, and the layout of objects inside them. Because the feature vectors are large, pyramid trees are used to build the index structure.
ISBN: (print) 0819448214
The focus of this paper is on similarity modeling. In the first part we revisit the underlying concepts of similarity modeling and sketch the most widely used VIR similarity model, Linear Weighted Merging (LWM). Motivated by its drawbacks, we introduce a new general similarity model called Logical Retrieval (LR) that offers more flexibility than LWM. In the second part we integrate the Feature Contrast Model (FCM), developed by psychologists to explain human peculiarities in similarity perception, into this environment as a general method for distance measurement. The results show that, in the LR context, FCM performs better than metric-based distance measurement. Euclidean distance is used for comparison because it is used in many VIR systems and is based on the questionable metric axioms. FCM minimizes the number of clusters in distance space, which makes it the ideal distance measure for LR. FCM allows a number of different parameterizations. The tests reveal that, on average, a symmetric, non-subtractive configuration that emphasizes common properties of visual objects performs best. Its major drawback in comparison to Euclidean distance is its longer query execution time.
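The abstract does not state the FCM formula. As a rough illustration, the sketch below uses the commonly cited form of Tversky's contrast model, S(a, b) = θ·f(A ∩ B) − α·f(A − B) − β·f(B − A), with set size standing in for the salience function f. The function name, the binary feature sets, and the reading of "non-subtractive" as α = β = 0 are assumptions made for this example, not details taken from the paper.

```python
def tversky_similarity(a, b, theta=1.0, alpha=0.0, beta=0.0):
    """Contrast-model similarity on feature sets.

    Assumption: the salience function f is plain set size.
    theta weights common features; alpha and beta weight the
    features found only in a or only in b.
    """
    a, b = set(a), set(b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    return theta * common - alpha * only_a - beta * only_b


# Hypothetical binary features for two visual objects.
query = {"red", "round", "smooth"}
candidate = {"red", "round", "striped"}

# Only common features count (alpha = beta = 0).
print(tversky_similarity(query, candidate))  # 2.0

# A subtractive, symmetric variant for comparison.
print(tversky_similarity(query, candidate, alpha=0.5, beta=0.5))  # 1.0
```

With α = β the score is symmetric in its arguments; unequal values make it asymmetric, which is one of the ways the model departs from the metric axioms.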
ISBN: (print) 9781479965458
Digital video is an emerging force in the computer and telecommunication industries because of the large volume of data it carries. Video segmentation and key frame extraction have become crucial for the development of advanced digital video systems. Key frame extraction is a very useful technique for providing concise access to video content and is the first step towards efficient browsing and retrieval in video databases. Existing approaches are either computationally expensive or ineffective in capturing salient visual content. The proposed system extracts key frames from input videos using two distinct, cost-effective algorithms, namely reference-based key frame extraction and clustering. It uses multiple characteristics such as correlation, optical flow, and mutual information to identify and extract key frames. The proposed system is able to extract key frames efficiently for any video format, and the extracted key frames satisfactorily represent the salient content of the video. Storage is reduced by one-eighth of the total space required by the original video, and the original content can be represented in one-fourth the time of the input video, achieving very high compression efficiency; the system can hence be used in any video retrieval application.
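The abstract names reference-based key frame extraction with correlation as one cue but gives no procedure. The sketch below shows one plausible reading: keep a reference frame, and declare a new key frame whenever the Pearson correlation between the reference and the current frame falls below a threshold. The function names, the flat-list frame representation, and the threshold of 0.9 are all assumptions for illustration; optical flow, mutual information, and the clustering algorithm are not covered.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation between two equally sized frames (flat pixel lists)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant frame; fall back to equality.
        return 1.0 if x == y else 0.0
    return cov / sqrt(var_x * var_y)


def extract_key_frames(frames, threshold=0.9):
    """Return indices of key frames; the first frame is always a key frame."""
    if not frames:
        return []
    keys = [0]
    reference = frames[0]
    for index, frame in enumerate(frames[1:], start=1):
        if correlation(reference, frame) < threshold:
            keys.append(index)
            reference = frame  # the new key frame becomes the reference
    return keys


# Toy "video": two identical frames, then two frames with reversed content.
frames = [[0, 1, 2, 3], [0, 1, 2, 3], [3, 2, 1, 0], [3, 2, 1, 0]]
print(extract_key_frames(frames))  # [0, 2]
```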
ISBN: (print) 0819448214
The explosive growth of images and videos on the World Wide Web (WWW) is turning the Web into a huge resource of visual information. Among the various types of multimedia information, still images and dynamic images (video clips) in compressed format are the most widely accepted on the WWW. It is therefore essential to achieve maximum efficiency in transmitting and decoding those compressed images on the Internet. Progressive coding provides a mode in which a coarse version of an image is transmitted at a lower bit rate and then gradually refined by subsequent transmissions. Compared with conventional coding, it is more suitable for interactive applications such as those involving JPEG images on the Internet. In this paper, we first give an approximation of the cosine function used in the IDCT for various orders. Based on this approximation and a series analysis, we then develop a progressive decoding scheme that encompasses successive approximation and spectral selection. The analysis and experiments establish that our proposed method significantly reduces computational cost in comparison with the existing spectral-selection-based progressive decoding proposed by JPEG. Extensive experiments are carried out to evaluate the proposed algorithm; they reveal that the reconstructed images, even at the lowest bit rate and with a lower-order approximation, still achieve encouraging PSNR values.
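To make the idea of refinement by spectral selection concrete, the sketch below reconstructs a 1-D signal from only its first few DCT coefficients and shows that the error shrinks as more coefficients arrive. It uses the exact cosine and an orthonormal 1-D DCT-II, so it does not reproduce the cosine approximation or the decoding scheme proposed in the paper; the function names and the sample values are assumptions for this example.

```python
from math import cos, pi, sqrt


def dct(signal):
    """Orthonormal 1-D DCT-II."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = sqrt(1 / n) if k == 0 else sqrt(2 / n)
        total = sum(signal[i] * cos(pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        coeffs.append(scale * total)
    return coeffs


def idct_partial(coeffs, keep):
    """Inverse DCT that uses only the first `keep` (lowest-frequency) coefficients."""
    n = len(coeffs)
    output = []
    for i in range(n):
        total = 0.0
        for k in range(min(keep, n)):
            scale = sqrt(1 / n) if k == 0 else sqrt(2 / n)
            total += scale * coeffs[k] * cos(pi * (2 * i + 1) * k / (2 * n))
        output.append(total)
    return output


def reconstruction_error(signal, keep):
    """Sum of squared differences between the signal and its partial reconstruction."""
    approx = idct_partial(dct(signal), keep)
    return sum((a - b) ** 2 for a, b in zip(signal, approx))


# Hypothetical row of eight pixel values.
row = [52, 55, 61, 66, 70, 61, 64, 73]
for keep in (1, 2, 4, 8):
    print(keep, round(reconstruction_error(row, keep), 3))
```

Because the transform is orthonormal, the squared error equals the energy of the coefficients not yet received, so each additional coefficient can only lower it.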
ISBN: (print) 0780332598
This paper presents a general method to retrieve images from large databases using images as queries. The method is based on local characteristics which are robust to the group of similarity transformations in the image. Images can be retrieved even if they are translated, rotated, or scaled. Due to the locality of the characterization, images can be retrieved even if only a small part of the image is given, as well as in the presence of occlusions. A voting algorithm, following the idea of a Hough transform, and semi-local constraints allow us to develop a new method which is robust to noise, to scene clutter, and to small perspective deformations. Experiments show efficient recognition for different types of images. The approach has been validated on an image database containing 1020 images, some of which are very similar in structure, texture, or shape.
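The voting step can be pictured as follows: every local descriptor of the query casts a vote for each database image that holds a sufficiently close descriptor, and images are ranked by vote count. This is only a minimal sketch of that idea. The descriptor format, the distance threshold, and the function names are assumptions, and the semi-local constraints mentioned in the abstract are left out.

```python
from collections import Counter


def squared_distance(u, v):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def vote_for_images(query_descriptors, database, max_distance=0.5):
    """Rank database images by the number of query descriptors they match.

    `database` is a list of (image_id, descriptor) pairs. Each query
    descriptor gives at most one vote to any single image.
    """
    votes = Counter()
    limit = max_distance ** 2
    for query in query_descriptors:
        already_voted = set()
        for image_id, descriptor in database:
            if image_id in already_voted:
                continue
            if squared_distance(query, descriptor) <= limit:
                votes[image_id] += 1
                already_voted.add(image_id)
    return votes.most_common()


# Hypothetical 2-D descriptors for two database images.
database = [("A", (0.0, 0.0)), ("A", (1.0, 1.0)), ("B", (5.0, 5.0))]
query = [(0.1, 0.0), (1.0, 1.1)]
print(vote_for_images(query, database))  # [('A', 2)]
```

Since each descriptor votes independently, a query that shows only part of an image, or an occluded one, can still accumulate enough votes for the correct match.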
The detection of shot boundaries in video sequences is an important task for generating indexed video databases. This paper provides a comprehensive quantitative comparison of the metrics that have been applied to shot boundary detection. In addition, several standardized statistical tests that have not previously been applied to this problem, and three new metrics, are considered. A mathematical framework for quantitatively comparing metrics is supplied. Experimental results based on a video database containing 39,000 frames are included.
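The abstract does not list the metrics it compares. As one widely used example of the kind of metric involved, the sketch below computes a normalized gray-level histogram difference between consecutive frames and flags a boundary when it exceeds a threshold. The bin count, the threshold, and the function names are assumptions chosen for this illustration.

```python
def histogram(frame, bins=4, max_value=256):
    """Gray-level histogram of a frame given as a flat list of integer pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // max_value, bins - 1)] += 1
    return counts


def histogram_difference(frame_a, frame_b, bins=4):
    """Bin-wise absolute histogram difference, normalized to the range [0, 1]."""
    hist_a = histogram(frame_a, bins)
    hist_b = histogram(frame_b, bins)
    total = sum(abs(a - b) for a, b in zip(hist_a, hist_b))
    return total / (len(frame_a) + len(frame_b))


def detect_shot_boundaries(frames, threshold=0.5, bins=4):
    """Return each index i where frame i starts a new shot."""
    boundaries = []
    for i in range(1, len(frames)):
        if histogram_difference(frames[i - 1], frames[i], bins) > threshold:
            boundaries.append(i)
    return boundaries


# Toy sequence: two dark frames followed by two bright frames.
frames = [[10] * 4, [12] * 4, [200] * 4, [210] * 4]
print(detect_shot_boundaries(frames))  # [2]
```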
This paper is concerned with estimating a probability density function of human skin color using a finite Gaussian mixture model whose parameters are estimated through the EM algorithm. Hawkins' statistical test of the normality and homoscedasticity (common covariance matrix) of the estimated Gaussian mixture models is performed, and McLachlan's bootstrap method is used to test the number of components in a mixture. Experimental results show that the estimated Gaussian mixture model fits skin images from a large database. Applications of the estimated density function in image and video databases are presented.
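For readers unfamiliar with EM for mixtures, the sketch below fits a one-dimensional Gaussian mixture by alternating the E-step (component responsibilities) and the M-step (weights, means, variances). It is a simplified stand-in: skin color would normally be modeled in a multi-dimensional color space, and the statistical tests named in the abstract are not implemented. The function names, the fixed iteration count, the variance floor, and the sample data are assumptions.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D normal distribution."""
    return exp(-((x - mean) ** 2) / (2 * variance)) / sqrt(2 * pi * variance)


def fit_gmm_1d(data, initial_means, iterations=50, min_variance=1e-6):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(initial_means)
    n = len(data)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        responsibilities = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            responsibilities.append([value / total for value in joint])
        # M-step: re-estimate weight, mean, and variance per component.
        for j in range(k):
            mass = sum(r[j] for r in responsibilities)
            if mass == 0:
                continue  # leave an empty component unchanged
            weights[j] = mass / n
            means[j] = sum(r[j] * x for r, x in zip(responsibilities, data)) / mass
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(responsibilities, data)) / mass
            variances[j] = max(spread, min_variance)
    return weights, means, variances


# Hypothetical 1-D samples drawn around two well-separated centers.
samples = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
weights, means, variances = fit_gmm_1d(samples, initial_means=[1.0, 4.0])
print([round(m, 2) for m in means])
```

EM converges only to a local optimum, so the starting means matter; the number of components is fixed here, whereas the paper tests it with a bootstrap procedure.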
A different approach to content-based retrieval and a novel framework for the classification of visual information are proposed. The Visual Apprentice, an implementation of the framework for still images and video, is introduced; it uses a combination of lazy learning, decision trees, and evolution programs for classification and grouping. Examples and results are given to demonstrate the applicability of the proposed approach to visual classification and detection.
ISBN: (print) 0819426628
Modern interactive multimedia services, such as video-on-demand (VoD) and electronic libraries, tend to involve large-scale media archives of audio records, video clips, image banks, and text documents. These services therefore pose many challenges for the design and implementation of new-generation database systems. In this paper, we first introduce a new multimedia data model that can accommodate sophisticated media types as well as complex relationships among different media entities. Thereafter, an object-relational database infrastructure is proposed to support applications of the data model developed in our project. The infrastructure is intended both as a framework for designing and implementing multimedia databases and as a reference model for comparing and evaluating different database systems. Features of the proposed infrastructure, as well as its implementation in a prototype multimedia database system, are also discussed.