The Fractal Transform (FT) was originally introduced as a methodology for compressing digital images and representing them at different scales. The process of calculating an FT generates a great deal of information about the affine similarities and dissimilarities of an image, most of which is discarded in compression applications. In this paper we introduce the concept of Fractal Transform Analysis and use it to derive new image descriptors. We present results of experiments in which description schemes comprised of some of these FT-based descriptors are applied to the problems of finding objects in an image similar to a given object, of indexing images, and of querying an image database consisting of about 17,000 images. Complexity and timing data are also presented.
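The affine similarity information mentioned above comes from matching image blocks against one another. A minimal sketch of that matching step, assuming the common fractal-coding formulation in which a range block is approximated by a contrast-scaled, brightness-shifted domain block; the function names and the flat-list block representation are illustrative assumptions, not the paper's implementation:

```python
def fit_affine(domain_block, range_block):
    """Least-squares contrast s and brightness o so that s*domain + o ~ range.

    Both blocks are flat lists of pixel intensities of equal length.
    Returns (s, o, squared_error).
    """
    n = len(domain_block)
    mean_d = sum(domain_block) / n
    mean_r = sum(range_block) / n
    var_d = sum((d - mean_d) ** 2 for d in domain_block)
    if var_d == 0:
        s = 0.0  # flat domain block: only the brightness offset can help
    else:
        cov = sum((d - mean_d) * (r - mean_r)
                  for d, r in zip(domain_block, range_block))
        s = cov / var_d
    o = mean_r - s * mean_d
    err = sum((s * d + o - r) ** 2 for d, r in zip(domain_block, range_block))
    return s, o, err


def best_domain(range_block, domain_blocks):
    """Return (index, s, o, err) of the domain block with the smallest error."""
    best = None
    for index, domain_block in enumerate(domain_blocks):
        s, o, err = fit_affine(domain_block, range_block)
        if best is None or err < best[3]:
            best = (index, s, o, err)
    return best
```

The per-block parameters and residual errors computed here are the kind of by-product that a compression codec would normally discard and that a descriptor could retain.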
ISBN: 0819427497 (print)
In this paper, a technique is presented to locate and track the facial areas in image and video databases. The extracted facial regions are used to obtain a number of features that are suitable for content-based storage and retrieval. The proposed face localization method consists of essentially two components: i) a color processing unit, and ii) a shape and color analysis module. The color processing component utilizes the distribution of skin-tones in the HSV color space to obtain an initial set of candidate regions or objects. The shape and color analysis module is then used to correctly identify the facial regions when falsely detected objects are extracted. A number of features such as hair color, skin-tone, and face location and size are subsequently determined from the extracted facial areas. The hair and skin colors provide useful descriptions related to the human characteristics, while the face location and size can reveal information about the activity within the scene (e.g. spatial relationships with other objects) and the type of image (e.g. portrait shot, complete body). These features can be effectively combined with others and employed in user queries to retrieve particular facial images.
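The color processing step can be illustrated with a simple HSV threshold that marks candidate skin pixels. This is only a sketch: the threshold values and function names below are hypothetical, chosen for illustration, and are not the skin-tone distribution used in the paper.

```python
import colorsys

# Hypothetical skin-tone bounds in HSV, with every channel scaled to [0, 1].
# These numbers are illustrative assumptions, not values from the paper.
HUE_MAX = 50 / 360            # reddish to yellowish hues
SAT_MIN, SAT_MAX = 0.15, 0.75
VAL_MIN = 0.35


def is_skin_tone(r, g, b):
    """Return True if an 8-bit RGB pixel lies inside the assumed skin range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= HUE_MAX and SAT_MIN <= s <= SAT_MAX and v >= VAL_MIN


def skin_mask(image):
    """Map a 2-D list of (r, g, b) tuples to a 2-D list of booleans."""
    return [[is_skin_tone(*pixel) for pixel in row] for row in image]
```

Connected regions of `True` pixels in the mask would then serve as the initial candidates passed on to a shape and color analysis stage.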
Volume 4315 of the conference proceedings contains 620 papers. Topics discussed include search and retrieval of image databases, indexing, querying and learning, media information systems, multimodel retrieval, feature evaluation, video processing, video sequences, video retrieval systems and MPEG.
In this paper we propose a method for tracking a video object in an ordered sequence of two-dimensional images, where the outcome is the trajectory of the video object throughout the time sequence of images. This method is designed to run in real time in a synchronous video collaboration environment, and is used to produce dynamic object annotations for enhanced video content understanding. A dynamic object is an object whose location or size in the video frame constantly changes due to camera motion, its own motion, or both. We suggest a novel method for finding the trajectory of the object in the intermediate frames given the locations and shapes of the object in the two end frames. In addition to the shape and location information of the object, its texture information in the end frames is used to predict its location and search space in the intermediate frames.
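One simple way to picture the prediction step is to interpolate the object's bounding box between the two end frames and pad it to form a search window. The abstract does not specify how the prediction is made, so linear interpolation, the `Box` type, and the `margin` parameter below are all assumptions used purely for illustration; the texture-based refinement is not shown.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def _lerp(a, b, t):
    """Linear blend: t=0 gives a, t=1 gives b."""
    return a + (b - a) * t


def interpolate_box(start, end, t):
    """Blend two boxes component-wise for a fractional time t in [0, 1]."""
    return Box(_lerp(start.x, end.x, t), _lerp(start.y, end.y, t),
               _lerp(start.w, end.w, t), _lerp(start.h, end.h, t))


def predicted_search_windows(start, end, n_intermediate, margin=0.25):
    """Return one padded search window per intermediate frame.

    margin is a hypothetical padding, given as a fraction of the box size.
    """
    windows = []
    for i in range(1, n_intermediate + 1):
        box = interpolate_box(start, end, i / (n_intermediate + 1))
        pad_w, pad_h = box.w * margin, box.h * margin
        windows.append(Box(box.x - pad_w, box.y - pad_h,
                           box.w + 2 * pad_w, box.h + 2 * pad_h))
    return windows
```

A tracker would then search only inside each window for the patch that best matches the object's appearance in the end frames.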
With the currently existing shot change detection algorithms, abrupt changes are detected fairly well. It is thus more challenging to detect gradual changes, including fades, dissolves, and wipes, as these are often missed or falsely detected. In this paper, we focus on the detection of wipes. The proposed algorithm begins by processing the visual rhythm, a portion of the DC image sequence. The visual rhythm is a single image, a sub-sampled version of a full video in which the sampling is performed in a predetermined, systematic fashion. It contains distinctive patterns or visual features for many different types of video effects, each of which manifests itself differently on the visual rhythm. In particular, wipes appear as curves that run from the top to the bottom of the visual rhythm. Thus, using the visual rhythm, it becomes possible to automatically detect wipes simply by determining various lines and curves on the visual rhythm.
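A small sketch may help show how a visual rhythm is assembled and why a wipe leaves a top-to-bottom trace in it. Diagonal sampling is assumed here as one possible predetermined sampling pattern; the abstract does not state which pattern is used, and the helper names and the synthetic wipe are hypothetical.

```python
def diagonal_samples(frame):
    """Sample pixels along the main diagonal of one frame (a 2-D list)."""
    height, width = len(frame), len(frame[0])
    n = min(height, width)
    return [frame[i * height // n][i * width // n] for i in range(n)]


def visual_rhythm(frames):
    """Stack per-frame diagonal samples as the columns of a single image.

    In the result, the row indexes position along the diagonal and the
    column indexes time (frame number).
    """
    columns = [diagonal_samples(frame) for frame in frames]
    return [list(row) for row in zip(*columns)]


def wipe_frame(t, size=4):
    """Synthetic left-to-right wipe: the new shot (1) covers the first t columns."""
    return [[1 if x < t else 0 for x in range(size)] for _ in range(size)]


rhythm = visual_rhythm([wipe_frame(t) for t in range(5)])
for row in rhythm:
    print(row)
```

In the printed image, the boundary between the 0 and 1 regions moves one column to the right on each successive row, forming the slanted line that a line or curve detector could pick up.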
ISBN: 0819431273 (print)
The Fractal Transform (FT) was originally introduced as a methodology for compressing digital images and representing them at different scales. The process of calculating an FT generates a great deal of information about the affine similarities and dissimilarities of an image, most of which is discarded in compression applications. In this paper we introduce the concept of Fractal Transform Analysis and use it to derive new image descriptors. We present results of experiments in which description schemes comprised of some of these FT-based descriptors are applied to the problems of finding objects in an image similar to a given object, of indexing images, and of querying an image database consisting of about 17,000 images. Complexity and timing data are also presented.
ISBN: 0819414808 (print)
We present three strategies for placement of video data on parallel disk arrays. Using a low-level disk model and video data from a scalable subband coding technique, we derive constraints with which to compare the three strategies. One strategy, constant frame grouping, is shown to be superior. Two methods for interleaving multiple videos under the constant frame grouping strategy are presented: nonperiodic and periodic. Periodic interleaving is shown to have the advantages of a lower access time and limited scan and pause functions. The constant frame grouping strategy is tested on an actual array of 8 disks and shown to have performance that is close to the theoretical prediction. The scalable nature of the compressed data is used to relieve the disk system overload for an overly high request rate.
ISBN: 081941767X (print)
This paper describes the extended model for information retrieval (EMIR), designed for complex information description and retrieval and particularly well suited for image modeling. A main object in the proposed model has a three-part specification: a description that is a list of attributes; a composition that is a list of component objects; and a topology that is a list of semantic relationships between component objects, expressing more semantic aspects of the main object structure. The model is well suited for image modeling for two complementary reasons. On one hand, it can distinguish between an object structure and its contents. This is achieved by relaxing the classical class-object instantiation link, thus allowing objects to have individual, noncategorized contents rather than those predicted in their classes. On the other hand, images typically have very different individual contents and therefore cannot be easily modeled within a structured database model such as the relational model. The query language is organized according to the three-part organization of the model. A simple query has three parts: description, being some constraints on some attribute values; composition, being a set of sub-queries on the composition part of objects; and topology, being the specification of special required links on the results of composition sub-queries.
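The three-part structure of objects and queries can be sketched as a small data model. The class, field, and function names below are hypothetical, and the matching rules are a deliberate simplification (exact attribute equality, and topology links expressed over component names) rather than EMIR's actual query semantics.

```python
from dataclasses import dataclass, field


@dataclass
class ImageObject:
    name: str
    description: dict = field(default_factory=dict)  # attribute -> value
    composition: list = field(default_factory=list)  # component ImageObjects
    topology: list = field(default_factory=list)     # (relation, name_a, name_b)


def matches(obj, description=None, composition=None, topology=None):
    """Evaluate a simplified three-part query against one object."""
    # Description: every constrained attribute must have the required value.
    for key, value in (description or {}).items():
        if obj.description.get(key) != value:
            return False
    # Composition: each sub-query must be satisfied by some component.
    for sub_query in (composition or []):
        if not any(matches(part, **sub_query) for part in obj.composition):
            return False
    # Topology: every required link must be present among the components.
    return all(link in obj.topology for link in (topology or []))


sun = ImageObject("sun", {"color": "yellow"})
sea = ImageObject("sea", {"color": "blue"})
photo = ImageObject("photo", {"type": "landscape"},
                    [sun, sea], [("above", "sun", "sea")])
```

A query such as `matches(photo, description={"type": "landscape"}, composition=[{"description": {"color": "yellow"}}], topology=[("above", "sun", "sea")])` then exercises all three parts at once.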
ISBN: 0819414808 (print)
The design of an electronic archive of digitized images of thousands of x-rays collected as part of nationwide health surveys has raised several issues related to user interface design, image presentation and image compression. The project involves developing an image archive implemented with an optical disk jukebox, and user workstations that allow Internet access to the images. This paper describes: the physical layout design of the workstation screens; desirable image processing functions contributing to better viewing and minimizing artifacts; interface design factors contributing to ease of use and speed of task completion; and work toward the selection of a suitable image compression technique.
ISBN: 0819431273 (print)
This paper proposes an image retrieval system which searches a database for images similar to a target imagined by a user. The system uses image features, rather than keywords, and retrieves images by reducing a multidimensional feature space generated by the image feature vectors. First, the system presents the user with some sample images that have suitable feature vector values and requires the user's interaction to learn which image is most similar to the target the user has in mind. Then, this information is used to appropriately reduce the feature space. This process is continued until the target region is reduced to a suitable volume. Since this method requires neither a real target image nor keywords for retrieval, it is quite simple and practical. Experimental results show the advantage and efficiency of the proposed system.
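One round of the feedback loop described above might look like the following sketch. The abstract does not say how the feature space is reduced, so the rule used here (keep only the candidates whose nearest displayed sample is the one the user chose) is an assumption, as are the function names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_candidates(candidates, samples, chosen):
    """Keep candidates that lie nearest to the sample the user selected.

    candidates: feature vectors still under consideration
    samples:    feature vectors of the images shown to the user
    chosen:     the element of samples the user judged closest to the target
    """
    kept = []
    for candidate in candidates:
        nearest = min(samples, key=lambda s: squared_distance(candidate, s))
        if nearest == chosen:
            kept.append(candidate)
    return kept
```

Repeating this step with fresh samples drawn from the surviving candidates shrinks the region round by round until few enough images remain to show the user directly.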