ISBN: 0819439932 (print)
We present a generic model to describe image and video content by a combination of semantic entities and low-level features for semantically meaningful and fast retrieval. The proposed model includes semantic entities such as Object, Event and Actors to express relations between the first two. The use of the Actors entity increases the efficiency of certain types of search, while the use of semantic and linguistic roles increases the expressive capability of the model. The model also contains links to high-level media segments such as actions and interactions, and low-level media segments such as elementary motion and reaction units, as well as low-level features such as motion parameters and trajectories. Based on this model, we propose image and video retrieval combining semantic and low-level information. The retrieval performance of our system is tested by using query-by-annotation, query-by-example, query-by-sketch, and a combination of them.
ISBN: 0819439932 (print)
This paper proposes an integrated system for supporting content-based video retrieval and browsing over networks. An automatic semantic video object extraction technique for providing more compact video representation is developed. The video images are first partitioned into a set of homogeneous regions with accurate boundaries by integrating the results of color edge detection and region growing procedures. The object seeds, which are the intuitive and representative parts of the semantic objects, are detected from these homogeneous image regions. The semantic objects are then generated by a seeded region aggregation or a human interaction procedure. These semantic objects are tracked along the time axis to exploit their temporal correspondences among frames. Given the semantic video objects represented by a set of visual features, a seeded semantic video content clustering technique is developed for providing more effective video indexing, retrieval and browsing.
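The region growing step mentioned above can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a single-channel image stored as a list of rows, 4-connectivity, and a fixed intensity tolerance relative to the seed, and it leaves out the color edge detection that the abstract combines with region growing. The names `grow_region`, `seed` and `tolerance` are hypothetical.

```python
from collections import deque


def grow_region(image, seed, tolerance):
    """Grow a 4-connected region from `seed` (row, col).

    A pixel joins the region when its value differs from the seed
    value by at most `tolerance`. This is an illustrative assumption,
    not the homogeneity criterion used in the paper.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        # Visit the four direct neighbours of the current pixel.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


if __name__ == "__main__":
    toy_image = [
        [10, 11, 50],
        [12, 10, 52],
        [49, 51, 50],
    ]
    print(sorted(grow_region(toy_image, (0, 0), 5)))
```

Comparing against the seed value rather than the running region mean keeps the sketch short; a running mean would be more tolerant of gradual shading.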
ISBN: 0819439932 (print)
In this paper we describe a framework for analyzing programs belonging to different TV program genres using Hidden Markov Models and pseudo-semantic features derived from video shots. Clustering using Gaussian mixture models is used to determine the order of the models. Results for initial genre classification experiments using two simple features derived from video shots are given.
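One common way to classify a sequence with Hidden Markov Models is to train one model per class and pick the model under which the sequence is most likely. The sketch below shows that idea with the forward algorithm. It rests on assumptions not stated in the abstract: shot features are quantized into discrete symbols, the model parameters are already known, and the observation sequence is non-empty. The genre names, symbols and probabilities are made up for illustration.

```python
def sequence_likelihood(observations, start, transition, emission):
    """Forward algorithm: P(observations | model) for a discrete HMM.

    start[s]          probability of starting in state s
    transition[p][s]  probability of moving from state p to state s
    emission[s][o]    probability of emitting symbol o in state s
    """
    n_states = len(start)
    alpha = [start[s] * emission[s][observations[0]] for s in range(n_states)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[p] * transition[p][s] for p in range(n_states))
            * emission[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify_genre(observations, models):
    """Return the key of the model that gives the highest likelihood."""
    return max(
        models,
        key=lambda genre: sequence_likelihood(observations, *models[genre]),
    )


if __name__ == "__main__":
    # Hypothetical symbols: 0 = long shot, 1 = short shot.
    models = {
        "news": ([1.0], [[1.0]], [[0.8, 0.2]]),
        "commercial": ([1.0], [[1.0]], [[0.2, 0.8]]),
    }
    print(classify_genre([0, 0, 1, 0], models))
```

For long sequences the raw probabilities underflow, so a practical version would work with scaled or logarithmic values.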
ISBN: 0819439932 (print)
With the advance of multimedia technologies and the explosive expansion of the World Wide Web, the volume of image and video data increases rapidly. An efficient and effective multimedia data retrieval technique is needed. In this paper, we propose an approach based on feature points for content-based image retrieval. The feature points extracted from the multiresolution representations of the query image and the database image are first matched to determine the matching pairs. Then, the matching pairs are classified into groups. Finally, two similarity measurements based on different similarity requirements are proposed to compute the similarity degree. We perform a series of experiments to study the characteristics of this approach, and compare it with the region-based approach on similar-shot sequence retrieval. The comparison shows the superiority of this approach.
ISBN: 9628576623 (print)
We introduce a simple image coding method, the block truncation coding (BTC) technique, as a novel approach to the construction of colour image databases. It is shown that BTC can not only be used to compress the images, thus achieving storage efficiency, but the BTC codes can also be used directly to construct image features for effective image retrieval. From the BTC code we have developed an image feature termed the BTC colour co-occurrence matrix (BCCM) as an effective measure of image contents. Experimental results are presented to show that BCCM is comparable to state-of-the-art techniques, such as the color correlogram, in image retrieval.
ISBN: 0819439932 (print)
The need to retrieve visual information from large image and video collections is shared by many application domains [1, 11, 13, 16]. This paper describes the main features of Quicklook(2), a system that combines in a single framework the alphanumeric relational query, the content-based image query exploiting automatically computed low-level image features (such as color and texture), and the textual similarity query exploiting any textual annotations attached to image database items (such as figure captions or textual cards...).
Volume 4315 of the conference proceedings contains 620 papers. Topics discussed include search and retrieval of image databases, indexing, querying and learning, media information systems, multimodal retrieval, feature evaluation, video processing, video sequences, video retrieval systems and MPEG.
Author: Power, GJ
USAF Res Lab, Target Recognit Branch SNAT, Wright Patterson AFB, OH 45433 USA
ISBN: 0819441848 (print)
Appropriate segmentation of video is a key step for applications such as video surveillance, video composing, video compression, storage and retrieval, and automated target recognition. Video segmentation algorithms involve dissecting the video into scenes based on shot boundaries, as well as into local objects and events based on spatial shape and regional motions. Many algorithmic approaches to video segmentation have been reported recently, but many lack measures to quantify the success of the segmentation, especially in comparison to other algorithms. This paper suggests multiple bench-top measures for evaluating video segmentation. The paper suggests that the measures are most useful when "truth" data about the video is available, such as precise frame-by-frame object shape. When precise "truth" data is unavailable, this paper suggests using hand-segmented "truth" data to measure the success of the video segmentation. Thereby, the ability of the video segmentation algorithm to achieve the same quality of segmentation as a human is obtained in the form of a variance in multiple measures. The paper introduces a suite of measures, each scaled from zero to one. A score of one on a particular measure is a perfect score for that measure. Measures are introduced to evaluate the ability of a segmentation algorithm to correctly detect shot boundaries, to correctly determine spatial shape and to correctly determine temporal shape. The usefulness of the measures is demonstrated on a simple segmenter designed to detect and segment a ping pong ball from a table tennis image sequence.
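To make the idea of measures scaled from zero to one concrete, here is a minimal sketch of two such scores computed against "truth" data. These are stand-ins chosen for illustration, not the measures defined in the paper: the spatial score is the Jaccard overlap of two binary masks, and the shot boundary score is the F1 value of detected boundary frames against true boundary frames with exact frame matching. Both function names are hypothetical.

```python
def spatial_overlap_score(segmented, truth):
    """Jaccard overlap of two binary masks (equal-sized lists of 0/1 rows).

    Returns 1.0 for identical masks and 0.0 for masks with no
    common foreground pixel.
    """
    intersection = 0
    union = 0
    for seg_row, truth_row in zip(segmented, truth):
        for s, t in zip(seg_row, truth_row):
            if s and t:
                intersection += 1
            if s or t:
                union += 1
    # Two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


def shot_boundary_score(detected, truth):
    """F1 score of detected boundary frame indices against true ones."""
    detected, truth = set(detected), set(truth)
    if not detected and not truth:
        return 1.0
    hits = len(detected & truth)
    if hits == 0:
        return 0.0
    precision = hits / len(detected)
    recall = hits / len(truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth_mask = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
    algo_mask = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
    print(spatial_overlap_score(algo_mask, truth_mask))
    print(shot_boundary_score([10, 50, 90], [10, 50, 120]))
```

A tolerance window of a few frames around each true boundary would be a natural refinement of the second score.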
ISBN: 0819439932 (print)
This paper addresses the issues involved in designing a classifier for multimedia indexing, which is representative of a domain of tasks involving high dimensionality of the feature space and large dissimilarity between features in range and variation, and requiring a strong inference mechanism. We consider decision tree, Bayesian network, neural network and support vector approaches. The Modified Bayesian Network (MBN), as designed by us, offers significant advantages over the other approaches. The application of Bayesian networks has generally been restricted to domains having discrete variable values (such as binary), or to domains with continuous variable values that approximate a Gaussian distribution. However, MBN can form a sound representation of non-Gaussian multimodal continuous distributions, as is the case with the feature space in multimedia indexing. This is accomplished by intelligent partitioning and data clique association. The structure of MBN and its functionality on real video are also presented in the paper. MBN can perform optimal classification even with partially specified queries given by the user. The strategy automatically gives more weight to the relevant features amongst the hundreds of features present in a multimedia indexing system. The inference mechanism is based on iteratively comparing the posterior probability of one class with those of all other classes. In the comparison of one class with another, each feature takes a different importance measure corresponding to the discerning capacity of the feature. A label is assigned to the multimedia data corresponding to the winning class. The comparison with other classification tools shows that MBN classification performance is consistently better than that of the other tools.
ISBN: 9628576623 (print)
We introduce a simple image coding method, the block truncation coding (BTC) technique, as a novel approach to the construction of colour image databases. It is shown that BTC can not only be used to compress images, thus achieving storage efficiency, but the BTC codes can also be used directly to construct image features for effective image retrieval. From the BTC code we have developed an image feature termed the BTC colour co-occurrence matrix (BCCM) as an effective measure of image contents. Experimental results are presented to show that BCCM is comparable to state-of-the-art techniques, such as the color correlogram, in image retrieval.
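For readers unfamiliar with BTC, the following is a minimal sketch of the classic moment-preserving form applied to one block. It assumes a single-channel block given as a flat list of pixel values; for colour images the same step could be applied per channel, which is an assumption here and not a detail taken from the abstract. The construction of the BCCM feature is not specified above and is not attempted. The function names are hypothetical.

```python
from statistics import fmean, pstdev


def btc_encode_block(block):
    """Encode one block as (mean, standard deviation, bitmap).

    The bitmap holds 1 for pixels at or above the block mean and
    0 for pixels below it.
    """
    mean = fmean(block)
    std = pstdev(block)
    bitmap = [1 if pixel >= mean else 0 for pixel in block]
    return mean, std, bitmap


def btc_decode_block(mean, std, bitmap):
    """Rebuild a two-level block that keeps the original mean and variance."""
    n = len(bitmap)
    q = sum(bitmap)  # number of pixels at or above the mean
    if q == 0 or q == n:
        # Flat block: every pixel is reproduced at the mean.
        return [mean] * n
    low = mean - std * (q / (n - q)) ** 0.5
    high = mean + std * ((n - q) / q) ** 0.5
    return [high if bit else low for bit in bitmap]


if __name__ == "__main__":
    block = [2, 2, 10, 10]
    code = btc_encode_block(block)
    print(code)
    print(btc_decode_block(*code))
```

Each block is thus stored as two numbers and one bit per pixel, and the bitmap together with the two reconstruction levels is the kind of code from which retrieval features can be derived.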