Multimedia information systems are experiencing tremendous growth as a direct consequence of the popularity and pervasive use of the World Wide Web. As a result, it is becoming increasingly important to provide efficient and flexible solutions for accessing and retrieving multimedia data. Images and video are emerging as significant data types in multimedia systems. Yet most commercial systems are still text- and keyword-based and do not fully exploit the image content they hold. We believe that there is an opportunity to build a novel interactive multimedia system for some specific applications in electronic commerce. In this paper we present an overview of our approach, the rationale behind it, and the problems inherent in building such a system. We address some of the technical issues in representing and analysing primitive image features. These are the building blocks of any such system, and they can be generalized to a much broader range of applications.
In this paper we address the problem of choosing appropriate features to describe the content of still pictures or video sequences, including audio. As the computational analysis of these features is often time-consuming, it is useful to identify a minimal set that allows automatic classification of a given class or genre. Furthermore, it can be shown that eliminating the coherence among the features that characterize a class does not guarantee an optimal classification result. The central question of the paper is thus which features should be selected and how they should be weighted to optimize a classification problem.
Illumination invariance is of paramount importance for consistently annotating video sequences stored in large video databases. Yet popular texture analysis methods such as multichannel filtering techniques do not yield illumination-invariant texture representations. In this paper, we assess the effectiveness of three illumination normalisation schemes for texture representations derived from Gabor filter outputs. The schemes aim at overcoming intensity scaling effects due to changes in illumination conditions. A theoretical analysis and experimental results enable us to select one scheme as the most promising. In this scheme, a normalising factor is derived at each pixel by combining the energy responses of different filters at that pixel. The scheme overcomes illumination variations well while still preserving discriminatory textural information. Further statistical analysis may shed light on other interesting properties or limitations of the scheme.
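The per-pixel normalisation described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the filter energies are combined, so the sum across filters used here is an assumption, as are the function name `normalise_energies`, the `eps` guard, and the array layout. The sketch only shows why dividing by a per-pixel combination of energies cancels a global intensity scaling; it is not the authors' scheme.

```python
import numpy as np

def normalise_energies(energies, eps=1e-12):
    """Normalise filter energy responses pixel by pixel.

    energies: array of shape (n_filters, height, width) holding non-negative
    energy responses (for example, squared Gabor filter magnitudes).
    ASSUMPTION: the normalising factor is the sum of all filter energies at
    each pixel; the abstract only says the responses are "combined".
    """
    energies = np.asarray(energies, dtype=float)
    # Per-pixel normalising factor, kept as shape (1, H, W) for broadcasting.
    total = energies.sum(axis=0, keepdims=True)
    # eps avoids division by zero where every filter response is zero.
    return energies / (total + eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    energies = rng.uniform(0.1, 1.0, size=(4, 8, 8))
    # Scaling image intensity by c scales linear filter energies by c**2.
    brighter = 9.0 * energies
    print(np.allclose(normalise_energies(energies),
                      normalise_energies(brighter)))
```

Because every filter energy at a pixel is multiplied by the same factor under a global intensity change, the factor cancels in the ratio, while the relative distribution of energy across filters (the textural information) is kept.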
Various methods of automatic shot boundary detection have been proposed and claimed to perform reliably. Although the detection of edits is fundamental to any kind of video analysis, since it segments a video into its basic components, the shots, only a few comparative investigations of early shot boundary detection algorithms have been published. These investigations mainly concentrate on measuring edit detection performance but do not consider the algorithms' ability to classify the types and locate the boundaries of the edits correctly. This paper extends these comparative investigations. More recent algorithms designed explicitly to detect specific complex editing operations such as fades and dissolves are taken into account, and their ability to classify the types and locate the boundaries of such edits is examined. The algorithms' performance is measured in terms of hit rate, number of false hits, and miss rate for hard cuts, fades, and dissolves over a large and diverse set of video sequences. The experiments show that while hard cuts and fades can be detected reliably, dissolves are still an open research issue. The false hit rate for dissolves is usually unacceptably high, ranging from 50% up to over 400%. Moreover, all algorithms seem to fail under roughly the same conditions.
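The three measures named in this abstract can be sketched as follows. The abstract does not define them, so the definitions here are assumptions: hits and misses are counted against the true edits, and the false hit rate is taken relative to the number of true edits, which is one reading consistent with reported values above 100%. The function name `edit_detection_scores`, the frame-index representation, and the greedy matching with a `tolerance` window are hypothetical choices for illustration.

```python
def edit_detection_scores(true_frames, detected_frames, tolerance=0):
    """Score detected edit positions against ground-truth edit positions.

    true_frames, detected_frames: iterables of frame indices.
    tolerance: maximum frame distance for a detection to count as a hit.
    ASSUMPTION: all rates are expressed relative to the number of true edits.
    """
    unmatched = sorted(true_frames)
    n_true = len(unmatched)
    if n_true == 0:
        raise ValueError("at least one true edit is required")

    hits = 0
    false_hits = 0
    for d in sorted(detected_frames):
        # Greedily match the detection to the first unmatched true edit
        # that lies within the tolerance window.
        match = next((t for t in unmatched if abs(t - d) <= tolerance), None)
        if match is None:
            false_hits += 1
        else:
            unmatched.remove(match)
            hits += 1

    return {
        "hit_rate": hits / n_true,
        "miss_rate": len(unmatched) / n_true,
        "false_hits": false_hits,
        "false_hit_rate": false_hits / n_true,
    }

if __name__ == "__main__":
    scores = edit_detection_scores([10, 50, 90],
                                   [10, 51, 200, 300, 400, 500],
                                   tolerance=1)
    print(scores)
```

In this made-up example, two of the three true edits are found and four detections match nothing, so the false hit rate under the assumed definition is 4/3, or about 133%.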
ISBN: (print) 0819434396
An effective analysis of Visual Objects appearing in still images and video frames is required in order to offer fine-grained access to multimedia and audiovisual content. In previous papers, we showed how our method for segmenting still images into visual objects could improve content-based image retrieval and video analysis methods. Visual Objects are used in particular for extracting semantic knowledge about the content. However, low-level segmentation methods for still images are not likely to extract a complex object as a whole, but rather as a set of several sub-objects. For example, a person would be segmented into three visual objects: a face, hair, and a body. In this paper, we introduce the concept of a Composite Visual Object. Such an object is hierarchically composed of sub-objects called Component Objects. Production rules implementing some common-sense knowledge are used to extract and label composite visual objects based on the output of our still image segmentation method, and to label the component objects with their semantic values. Composite visual objects in the database (e.g., "persons") can then be searched for, possibly with constraints on some of their components (e.g., "only with a blue suit!").
A different approach to content-based retrieval and a novel framework for classification of visual information are proposed. The Visual Apprentice, an implementation of the framework for still images and video that uses a combination of lazy learning, decision trees, and evolution programs for classification and grouping, is introduced. Examples and results are given to demonstrate the applicability of the proposed approach to visual classification and detection.
ISBN: (print) 0819431273
This paper introduces a model of a spatio-temporal database that we are developing to query interesting events in video sequences. The database that we are designing pushes the state of the art in a number of fields, and many issues are still awaiting a satisfactory solution. In this paper we present our (albeit still partial) answer to some of these problems, and the future directions of our work. Our design is divided into two layers: a Logbook, which operates as a short-term repository of unsummarized and unprocessed data, and a long-term spatio-temporal database, which stores and queries summarized data.
ISBN: (print) 0819431273
Content-based retrieval on large multimedia databases attracts the interest of many researchers, but the database architecture needed for content-based retrieval is still an open problem. Traditional relational database systems do not support high-dimensional feature-based content description and indexing, and are therefore limited in their content-based retrieval functions. Some systems do support high-dimensional feature-based content description and indexing, but lack descriptions and query expressions for media object content and relations. In this paper, we present the results of our study on query mechanisms and propose CbExpr, a powerful and flexible query expression mechanism for media objects. Based on CbExpr, we propose GMA (general mediabase architecture), a general architecture for management and content-based retrieval on large media databases, and present videoBase, a content-based video retrieval system, as an example of GMA. Basic ideas, considerations, and definitions are presented in the paper, along with some implementation details.