In this paper, we present an approach to clustering video sequences and images for efficient retrieval using relative entropy as our cost criterion. In addition, our experiments indicate that relative entropy is a good similarity measure for content-based retrieval. In our clustering work, we treat images and video as probability density functions over the extracted features. This leads us to formulate a general algorithm for clustering densities. In this context, it can be seen that a Euclidean distance between features and the Kullback-Leibler (KL) divergence give equivalent clusterings. In addition, the asymmetry of the KL divergence leads to another clustering. Our experiments indicate that this clustering is more robust to noise and distortions than the one resulting from the Euclidean norm.
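As a rough illustration of the relative-entropy criterion described above, the following sketch computes the KL divergence between two normalized feature histograms and shows that it is asymmetric. The function names, the `eps` guard, and the sample histograms are assumptions made for this example and are not taken from the paper.

```python
import math


def normalize(hist):
    """Turn raw feature counts into a probability mass function."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            # eps (an assumption of this sketch) avoids division by zero
            # when q has an empty bin.
            total += pi * math.log(pi / max(qi, eps))
    return total


def assign_to_cluster(query, centroids):
    """Index of the centroid with the smallest D(query || centroid)."""
    return min(range(len(centroids)),
               key=lambda k: kl_divergence(query, centroids[k]))


# Hypothetical feature histograms for two images.
p = normalize([8, 1, 1])
q = normalize([4, 3, 3])
forward = kl_divergence(p, q)   # D(p || q)
backward = kl_divergence(q, p)  # D(q || p), generally a different value
```

Swapping the argument order in `assign_to_cluster` would give the second, asymmetric variant of the assignment step; which direction corresponds to the more robust clustering reported in the abstract is not stated there.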
A prototype content-based image retrieval system is implemented based on the algorithms introduced in this paper. High-level image content is extracted. A fuzzy C-means classifier is employed to compute the object clusters and to provide useful information about overlapping clusters. Automatic image segmentation and categorisation are achieved. To obtain the context for image retrieval, the subjective context and the objective context are modelled by means of fuzzy set theory. The system is able to trace the users' interactions during retrieval. Retrieval results can be refined as users submit queries stating their specific requirements.
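To make the role of fuzzy C-means in handling overlapping clusters concrete, here is a minimal sketch of the standard membership formula, in which a point receives a graded membership in every cluster instead of a single hard label. The function name, the fuzzifier default `m=2.0`, and the sample points are assumptions for illustration; the abstract does not give the system's actual parameters.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one feature vector in every cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)),
    where d_i is the Euclidean distance from the point to center i.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]


# Hypothetical 2-D feature vector and two cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
# A point midway between the centers is shared equally by both clusters.
shared = fcm_memberships((1.5, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

The memberships of a point always sum to one, so values near 0.5 flag objects that sit in the overlap between two clusters.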
A different approach to content-based retrieval and a novel framework for classification of visual information are proposed. The Visual Apprentice, an implementation of the framework for still images and video that uses a combination of lazy learning, decision trees, and evolution programs for classification and grouping, is introduced. Examples and results are given to demonstrate the applicability of the proposed approach to visual classification and detection.
The Web-based Medical Information Retrieval System (WebMIRS) allows Internet access to databases containing 17,000 digitized x-ray spine images and associated text data from the National Health and Nutrition Examination Surveys (NHANES). WebMIRS allows SQL queries of the text, and viewing of the returned text records and images using a standard browser. We are now working (1) to determine the utility of data directly derived from the images in our databases and (2) to investigate the feasibility of computer-assisted or automated indexing of the images to support retrieval of images of interest to biomedical researchers in the field of osteoarthritis. To build an initial database based on image data, we are manually segmenting a subset of the vertebrae, using techniques from vertebral morphometry. From this, we will derive vertebral features and add them to the database. This image-derived data will enhance the user's data access capability by enabling the creation of combined SQL/image-content queries.
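A combined SQL/image-content query of the kind described above might look like the following sketch, which joins survey text data with image-derived vertebral measurements. The table names, column names, the height-ratio threshold, and all rows are invented for illustration and do not reflect the actual WebMIRS schema or NHANES data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table of survey text data and one table of
# features derived from manually segmented vertebrae.
conn.executescript("""
CREATE TABLE survey (person_id INTEGER PRIMARY KEY, age INTEGER);
CREATE TABLE vertebra (
    person_id INTEGER,
    level TEXT,
    anterior_height_mm REAL,
    posterior_height_mm REAL
);
""")

# Made-up rows, used only to exercise the query.
conn.executemany("INSERT INTO survey VALUES (?, ?)",
                 [(1, 64), (2, 71), (3, 45)])
conn.executemany("INSERT INTO vertebra VALUES (?, ?, ?, ?)",
                 [(1, "L3", 24.0, 30.0),
                  (2, "L3", 29.0, 30.0),
                  (3, "L3", 22.0, 30.0)])

# Text condition (age) combined with an image-derived condition
# (anterior/posterior height ratio below an assumed threshold).
rows = conn.execute("""
SELECT s.person_id
FROM survey AS s
JOIN vertebra AS v ON v.person_id = s.person_id
WHERE s.age >= 60
  AND v.anterior_height_mm / v.posterior_height_mm < 0.85
ORDER BY s.person_id
""").fetchall()
```

Storing the image-derived features as ordinary columns is one simple design choice: it lets the existing SQL interface express both kinds of condition in a single statement.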
By abstracting digital video as the corresponding binary video (a process which, in numerous subjective experiments, appears to preserve most of the intelligibility of the video content), we can pursue a precise and analytic approach to the design of digital video storage and retrieval algorithms that is based on geometrical (morphological) intuition. The foremost tangible benefit of this abstraction, however, is the immediate reduction in both the data and the computational complexity involved in implementing various algorithms and databases. The general paradigm presented may be used to address all issues pertaining to video library construction, including visualization, optimum feedback query generation, object recognition, etc., but the primary focus of this paper is the detection of fast scene changes (including the presence of flashlights) and gradual scene changes (such as dissolves, fades, and special effects such as wipes). In simulation we observed that we can achieve performance comparable to that of other methods with drastic reductions in both storage and computational complexity. Furthermore, since the conversion from grayscale to binary video can be performed directly, with minimal additional computation, in the compressed domain by thresholding the DCT DC coefficients themselves (or by using the contour information attached to MPEG-4 formats), the algorithms presented here are well suited to fast, on-the-fly determination of scene changes, object recognition and/or tracking, and other more intelligent tasks that traditionally place heavy demands on computation and/or storage. These fast determinations may be used on their own merits, or in conjunction with, or as a complement to, other higher-layer information in the future.
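The binarization step described above can be sketched as follows: each frame is represented by a grid of DC values (one per block), thresholded to a binary frame, and consecutive binary frames are compared. The Hamming-style comparison, the threshold of 128, the `cut_ratio` of 0.5, and all names are assumptions chosen for illustration; they stand in for, and are not, the paper's morphological detection method.

```python
def binarize(dc_frame, threshold=128):
    """Threshold a grid of per-block DC values into a binary frame."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in dc_frame]


def changed_fraction(frame_a, frame_b):
    """Fraction of binary pixels that differ between two frames."""
    total = sum(len(row) for row in frame_a)
    differing = sum(x != y
                    for row_a, row_b in zip(frame_a, frame_b)
                    for x, y in zip(row_a, row_b))
    return differing / total


def detect_cuts(dc_frames, threshold=128, cut_ratio=0.5):
    """Indices of frames whose binary image differs sharply from the previous one."""
    binary = [binarize(frame, threshold) for frame in dc_frames]
    return [i for i in range(1, len(binary))
            if changed_fraction(binary[i - 1], binary[i]) > cut_ratio]


# Hypothetical 2x2 DC grids: two dark frames followed by two bright frames.
dark = [[10, 20], [30, 40]]
bright = [[200, 210], [220, 230]]
cuts = detect_cuts([dark, dark, bright, bright])
```

Because each pixel is a single bit, both the stored frames and the comparison are far cheaper than their grayscale counterparts, which is the kind of saving the abstract points to.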
Video parsing is an important step in content-based indexing techniques, where the input video is decomposed into segments with uniform content. In video parsing, detection of scene changes is one of the approaches widely used for extracting key frames from the video sequence. In this paper, an algorithm based on motion vectors is proposed to detect sudden scene changes and gradual scene changes (camera movements such as panning, tilting and zooming). Unlike some of the existing schemes, the proposed scheme is capable of detecting both sudden and gradual changes in uncompressed as well as compressed-domain video. It is shown that the resultant motion vector can be used to identify and classify gradual changes due to camera movements. Results show that the algorithm performed as well as the histogram-based schemes with uncompressed video. The performance of the algorithm was also investigated with H.263 compressed video. The detection and classification of both sudden and gradual scene changes were successfully demonstrated.
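One way to picture how a resultant motion vector can separate camera movements is the sketch below, which sums the block motion vectors of a frame and labels the frame by the dominant component of the sum. The labels, the magnitude threshold, and the decision rule are assumptions for this example; the abstract does not describe the paper's actual classification rule.

```python
import math


def resultant_vector(motion_vectors):
    """Component-wise sum of the (dx, dy) block motion vectors of one frame."""
    sum_x = sum(dx for dx, _ in motion_vectors)
    sum_y = sum(dy for _, dy in motion_vectors)
    return sum_x, sum_y


def classify_camera_motion(motion_vectors, min_mean_magnitude=1.0):
    """Label a frame 'pan', 'tilt', or 'none' from its resultant vector.

    Zoom is deliberately not handled: symmetric zoom vectors point
    radially and cancel in the sum, so this sketch reports 'none'.
    """
    count = len(motion_vectors)
    if count == 0:
        return "none"
    sum_x, sum_y = resultant_vector(motion_vectors)
    # Normalize by the block count so the threshold is per-block motion.
    if math.hypot(sum_x, sum_y) / count < min_mean_magnitude:
        return "none"
    return "pan" if abs(sum_x) >= abs(sum_y) else "tilt"


# Hypothetical block motion vectors for three frames.
pan_label = classify_camera_motion([(3, 0)] * 4)
tilt_label = classify_camera_motion([(0, -2)] * 4)
still_label = classify_camera_motion([(1, 0), (-1, 0)])
```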
This paper describes an API for image searching. The aim was to isolate the functionality of the GUI from the functionality of the image search engine. The GUI would then make calls to the image search API and could be used with any image search engine implementing that API. Different methods of specifying the initial search image are also discussed, as well as different methods of displaying the results, including 3D display using VRML.
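The separation between GUI and search engine described above can be sketched as an abstract interface that any engine implements and that the GUI code depends on exclusively. Every name here (`ImageSearchEngine`, `SearchHit`, `HistogramEngine`, `render_results`) is hypothetical, and the toy engine uses a plain Euclidean distance as a placeholder; none of this is the API defined in the paper.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    image_id: str
    score: float  # larger means more similar


class ImageSearchEngine(ABC):
    """Hypothetical search API: the only thing the GUI is allowed to call."""

    @abstractmethod
    def search(self, query_features, max_results=10):
        """Return a list of SearchHit ordered from best to worst."""


class HistogramEngine(ImageSearchEngine):
    """Toy engine that ranks images by Euclidean distance between features."""

    def __init__(self, index):
        self._index = dict(index)

    def search(self, query_features, max_results=10):
        hits = []
        for image_id, features in self._index.items():
            distance = sum((a - b) ** 2
                           for a, b in zip(query_features, features)) ** 0.5
            hits.append(SearchHit(image_id, -distance))
        hits.sort(key=lambda hit: hit.score, reverse=True)
        return hits[:max_results]


def render_results(engine, query_features):
    """Stand-in for the GUI: it sees only the ImageSearchEngine interface."""
    return [hit.image_id for hit in engine.search(query_features, max_results=3)]


engine = HistogramEngine({"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]})
ranked = render_results(engine, [1.0, 0.0])
```

Because `render_results` never touches engine internals, a different engine can be substituted without changing the display code, which is the decoupling the abstract aims for.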
Recent research on image databases has been aimed at the development of content-based retrieval techniques for the management of visual information. Compared with such visual information as color, texture, and spatial constraints, shape is so important a feature of the image objects of interest that shape alone may be sufficient to identify and classify an object completely and accurately. This paper presents a novel method based on feature point histogram indexing for object shape representation in image databases. In this scheme, the feature point histogram is obtained by discretizing the angles produced by the Delaunay triangulation of a set of unique feature points which characterize object shape in context, and then counting the number of times each discrete angle occurs in the resulting triangulation. The proposed shape representation technique is translation, scale, and rotation independent. Our experiments show that the Euclidean distance performs very well as the similarity measure in combination with the feature point histogram computed by counting the two largest angles of each individual Delaunay triangle. Through further experiments, we also found evidence that an image object representation using a feature point histogram provides an effective cue for image object discrimination.
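The histogram construction described above can be sketched as follows, assuming the Delaunay triangulation has already been computed elsewhere and is passed in as a list of triangles. The function names, the bin width, and the sample triangle are assumptions for illustration; the paper's actual discretization is not given in the abstract.

```python
import math


def triangle_angles(a, b, c):
    """Interior angles in degrees at vertices a, b, c of a non-degenerate triangle."""
    def angle_at(p, q, r):
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cos_t = ((v1[0] * v2[0] + v1[1] * v2[1])
                 / (math.hypot(*v1) * math.hypot(*v2)))
        # Clamp to [-1, 1] to absorb floating-point rounding before acos.
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    return [angle_at(a, b, c), angle_at(b, c, a), angle_at(c, a, b)]


def feature_point_histogram(triangles, bin_width=10):
    """Count the two largest angles of each triangle in bins of bin_width degrees.

    bin_width is assumed to divide 180 evenly.
    """
    n_bins = 180 // bin_width
    hist = [0] * n_bins
    for a, b, c in triangles:
        two_largest = sorted(triangle_angles(a, b, c))[1:]
        for angle in two_largest:
            hist[min(int(angle // bin_width), n_bins - 1)] += 1
    return hist


# Hypothetical triangle with angles of roughly 90, 53.1 and 36.9 degrees.
triangle = ((0.0, 0.0), (4.0, 0.0), (0.0, 3.0))
hist = feature_point_histogram([triangle], bin_width=20)
```

Since the histogram is built only from angles, translating, uniformly scaling, or rotating the feature points leaves the angles unchanged, which matches the invariance claimed in the abstract. Two such histograms can then be compared with a Euclidean distance.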
While current approaches to video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e. speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based on morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based on time-frequency analysis of audio signals and the use of a hidden Markov model (HMM) as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90% for the coarse-level classification, and higher than 85% for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
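The abstract does not name the short-term features it uses, so the sketch below assumes two common ones, short-time energy and zero-crossing rate, and shows only the simplest coarse decision (silence versus sound). The frame size, the energy floor, and all names are assumptions for illustration and not the paper's settings.

```python
def split_frames(signal, size, hop):
    """Cut a sample sequence into fixed-length frames taken every hop samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(sample * sample for sample in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_silence(signal, size=4, hop=4, energy_floor=1e-4):
    """Label each frame 'silence' or 'sound' from its short-time energy."""
    return ["silence" if short_time_energy(frame) < energy_floor else "sound"
            for frame in split_frames(signal, size, hop)]


# Hypothetical signal: four silent samples followed by a rapid oscillation.
signal = [0.0, 0.0, 0.0, 0.0, 0.5, -0.5, 0.5, -0.5]
labels = label_silence(signal)
zcr = zero_crossing_rate(signal[4:])
```

Separating speech, music, and environmental sound would need further features and rules on top of this, which are beyond what the abstract specifies.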
A simultaneous learning and indexing technique is proposed for efficient content-based retrieval of images that can be described by feature vectors. This technique builds a compact high-dimensional index while taking into account that the raw feature space needs to be adjusted for each new application. With this technique, much better efficiency can be achieved than with techniques that make no provision for efficient indexing.