Volume 4315 of the conference proceedings contains 620 papers. Topics include search and retrieval in image databases, indexing, querying and learning, media information systems, multimodal retrieval, feature evaluation, video processing, video sequences, video retrieval systems, and MPEG.
In this paper we propose a method for tracking a video object in an ordered sequence of two-dimensional images, where the outcome is the trajectory of the video object throughout the time sequence of images. The method is designed to run in real time in a synchronous video collaboration environment and is used to produce dynamic object annotations for enhanced video content understanding. A dynamic object is an object whose location or size in the video frame constantly changes due to camera motion, its own motion, or both. We suggest a novel method for finding the trajectory of the object in the intermediate frames given the locations and shapes of the object in two end frames. In addition to the shape and location of the object, its texture in the end frames is used to predict its location and search space in the intermediate frames.
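As a rough illustration of the end-frame idea above, the sketch below predicts an object's box in each intermediate frame by linear interpolation between the two end frames, then pads the prediction to form a search window. This is an assumption made for illustration only: the abstract does not say how the prediction is computed, and the paper's use of texture is not modeled here. The names `interpolate_boxes` and `search_window`, the `(x, y, width, height)` box layout, and the `margin` parameter are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(start: Box, end: Box, n_frames: int) -> List[Box]:
    """Return one predicted box per frame, both end frames included.

    Linear interpolation is an assumed stand-in for the paper's predictor.
    """
    if n_frames < 2:
        raise ValueError("need at least the two end frames")
    boxes = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first end frame, 1.0 at the last
        boxes.append(tuple(a + t * (b - a) for a, b in zip(start, end)))
    return boxes


def search_window(box: Box, margin: float) -> Box:
    """Grow a predicted box by `margin` pixels on every side."""
    x, y, w, h = box
    return (x - margin, y - margin, w + 2 * margin, h + 2 * margin)


# Usage: five frames between two annotated end frames.
predicted = interpolate_boxes((0, 0, 10, 10), (8, 4, 10, 14), n_frames=5)
windows = [search_window(b, margin=3) for b in predicted]
print(predicted[2], windows[2])
```

A tracker built this way would search for the object only inside each window, which keeps the per-frame cost bounded; a texture-based matcher would be the natural refinement step inside that window.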
Existing shot change detection algorithms detect abrupt changes fairly well. Gradual changes, including fades, dissolves, and wipes, are more challenging because they are often missed or falsely detected. In this paper, we focus on the detection of wipes. The proposed algorithm begins by processing the visual rhythm, a portion of the DC image sequence. The visual rhythm is a single image, a sub-sampled version of a full video in which the sampling is performed in a predetermined, systematic fashion. It contains distinctive patterns, or visual features, for many different types of video effects, and each effect manifests itself differently. In particular, wipes appear as curves that run from the top to the bottom of the visual rhythm. Thus, wipes can be detected automatically by finding lines and curves in the visual rhythm.
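The construction described above can be sketched on synthetic data. The code below builds a visual rhythm by sampling the main diagonal of every frame and stacking the samples as columns, then shows that a left-to-right wipe traces a line running from the top to the bottom of that image. Several things here are assumptions: diagonal sampling is only one possible predetermined sampling scheme, plain grayscale arrays stand in for DC images, and the helper names (`diagonal_samples`, `visual_rhythm`, `make_wipe`, `wipe_edge_per_frame`) are hypothetical.

```python
def diagonal_samples(frame):
    """Sample a 2-D grayscale frame (list of rows) along its main diagonal."""
    h, w = len(frame), len(frame[0])
    n = min(h, w)
    return [frame[i * h // n][i * w // n] for i in range(n)]


def visual_rhythm(frames):
    """Stack per-frame diagonal samples as columns: rhythm[row][time]."""
    columns = [diagonal_samples(f) for f in frames]
    return [[col[r] for col in columns] for r in range(len(columns[0]))]


def make_wipe(size, n_frames, old=0, new=255):
    """Synthetic left-to-right wipe: the new shot covers pixels with x < edge."""
    frames = []
    for t in range(n_frames):
        edge = round(t * size / (n_frames - 1))
        row = [new if x < edge else old for x in range(size)]
        frames.append([list(row) for _ in range(size)])
    return frames


def wipe_edge_per_frame(rhythm, new=255):
    """For each time step, count the rhythm rows already showing the new shot."""
    n_rows, n_cols = len(rhythm), len(rhythm[0])
    return [sum(1 for r in range(n_rows) if rhythm[r][t] == new)
            for t in range(n_cols)]


# Usage: an 8x8 video with a 9-frame wipe.
rhythm = visual_rhythm(make_wipe(size=8, n_frames=9))
print(wipe_edge_per_frame(rhythm))
```

In this toy case the boundary between the two shots advances by one row per frame, so the wipe shows up as a straight diagonal line in the rhythm image. A real detector would need an edge or line finder that tolerates noise and curved boundaries.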
ISBN: 0819431273 (print)
The Fractal Transform (FT) was originally introduced as a methodology for compressing digital images and representing them at different scales. The process of calculating an FT generates a great deal of information about the affine similarities and dissimilarities of an image, most of which is discarded in compression applications. In this paper we introduce the concept of Fractal Transform Analysis and use it to derive new image descriptors. We present results of experiments in which description schemes composed of some of these FT-based descriptors are applied to the problems of finding objects in an image similar to a given object, of indexing images, and of querying an image database consisting of about 17,000 images. Complexity and timing data are also presented.
ISBN: 0819414808 (print)
We present three strategies for placement of video data on parallel disk arrays. Using a low-level disk model and video data from a scalable subband coding technique, we derive constraints with which to compare the three strategies. One strategy, constant frame grouping, is shown to be superior. Two methods for interleaving multiple videos under the constant frame grouping strategy are presented: nonperiodic and periodic. Periodic interleaving is shown to offer a lower access time as well as limited scan and pause functions. The constant frame grouping strategy is tested on an actual array of 8 disks and shown to perform close to the theoretical prediction. The scalable nature of the compressed data is used to relieve disk system overload when the request rate is too high.
ISBN: 0819424331 (print)
In general, video shots need to be clustered to form more semantically significant units, such as scenes, sequences, and programs. This is the so-called story-based video structuring. Automatic video structuring is of great importance for video browsing and retrieval. Shots or scenes are usually described by one or several representative frames, called key frames. Viewed from a higher level, the key frames of some shots may be semantically redundant. In this paper, we propose automatic solutions to the problems of key frame computing and key frame pruning. We develop an original image similarity criterion that considers both the spatial layout and the detail content of an image. Coefficients of a wavelet decomposition are used to derive parameter vectors accounting for these two aspects. The parameters exhibit (quasi-)invariant properties. The novel "Seek and Spread (SS)" strategy used in key frame computing allows us to obtain a large representative range for the key frames. Inter-shot redundancy of the key frames is suppressed using the same image similarity measure. Experimental results demonstrate the effectiveness and efficiency of our techniques.
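The pruning step above, suppressing key frames that are redundant under an image similarity measure, can be sketched as a greedy filter. This is an illustrative assumption and not the paper's procedure: cosine similarity over toy feature vectors stands in for the wavelet-based criterion, and `prune_key_frames`, `cosine_similarity`, and the `threshold` parameter are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (stand-in measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def prune_key_frames(features, threshold, similarity=cosine_similarity):
    """Return indices of key frames kept after greedy redundancy pruning.

    A frame is kept only if its similarity to every frame kept so far
    is below `threshold`.
    """
    kept = []
    for idx, feat in enumerate(features):
        if all(similarity(feat, features[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


# Usage: four key frames, two near-duplicate pairs (toy feature vectors).
features = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0]]
print(prune_key_frames(features, threshold=0.95))
```

The greedy pass depends on the order of the key frames, so the earliest frame of each redundant group survives. Any symmetric similarity function can be passed in through the `similarity` argument.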
ISBN: 0819414808 (print)
The design of an electronic archive of digitized images of thousands of x-rays collected as part of nationwide health surveys has raised several issues related to user interface design, image presentation, and image compression. The project involves developing an image archive implemented with an optical disk jukebox, and user workstations that allow Internet access to the images. This paper describes the physical layout design of the workstation screens; desirable image processing functions that improve viewing and minimize artifacts; interface design factors contributing to ease of use and speed of task completion; and work toward the selection of a suitable image compression technique.
ISBN: 0819424331 (print)
In digital libraries and on the Internet, large amounts of data in various modalities have to be transmitted and delivered across networks, subject to bandwidth constraints and network congestion. Among all multimedia data, video is the most difficult to handle, both in terms of its size and the scarcity of tools and techniques available for efficient delivery, storage, and retrieval. Providing tools to help users search and browse large collections of video documents is important. Equally important are the means to deliver and present the essence of video content to the user without noticeable delay. In this paper, we focus on the characterization of video by means of automatic analysis of its visual content, and on the compact presentation of the underlying story content built upon the derived characteristics. We develop models to capture and characterize video by temporal events, namely dialogues, actions, and story units. We then present these events using succinct visual summaries that depict and differentiate the underlying dramatic elements in an intuitive manner. The combination of video characterization and visual summary compacts video data far beyond what traditional video compression achieves, while retaining the essential meaning and semantics of the content, and is particularly useful for digital library and Internet applications.
ISBN: 0819431273 (print)
This paper proposes an image retrieval system that searches a database for images similar to a target imagined by the user. The system uses image features, rather than keywords, and retrieves images by reducing a multidimensional feature space generated by the image feature vectors. First, the system presents the user with sample images that have suitable feature vector values and asks the user to indicate which image is most similar to the target they have in mind. This information is then used to reduce the feature space appropriately. The process continues until the target region is reduced to a suitable volume. Since this method requires neither a real target image nor keywords for retrieval, it is quite simple and practical. Experimental results show the advantage and efficiency of the proposed system.
We developed a content-based retrieval scheme for texture that uses text-based descriptions. The texture technique is based on our previous work, which uses very simple texture primitives, such as edges and plain regions, to generate features. Other methods that apply complicated statistics can be difficult to transcribe into forms that ordinary users understand. By contrast, our features are simple enough to be expressed in plain language, so we can bridge the gap between semantics and computed features. This brings a number of benefits that open a new horizon for content-based retrieval with texture. For example, the user can request a texture image without necessarily knowing what types of textures are stored. In this paper we describe the method of translating such features, and the partial weighted Euclidean distance matching that allows users to describe only the parts they are interested in. This allows them to refine their texture descriptions gradually.
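One plausible reading of the partial weighted Euclidean distance mentioned above is a weighted distance computed only over the features the user actually described, with all other features ignored. The sketch below follows that reading; it is an assumption, since the abstract gives no formula. The feature names (`edge_density`, `plain_ratio`, `orientation`), the default weight of 1.0, and the function names are hypothetical.

```python
import math


def partial_weighted_distance(query, weights, candidate):
    """Weighted Euclidean distance over the features present in `query` only.

    query:     feature name -> desired value (may be a partial description)
    weights:   feature name -> weight (missing names default to 1.0)
    candidate: feature name -> stored value (must cover the query's names)
    """
    total = 0.0
    for name, wanted in query.items():
        weight = weights.get(name, 1.0)
        total += weight * (wanted - candidate[name]) ** 2
    return math.sqrt(total)


def rank(query, weights, database):
    """Return database keys ordered from best to worst match."""
    return sorted(
        database,
        key=lambda key: partial_weighted_distance(query, weights, database[key]),
    )


# Usage: the user describes only one aspect of the texture they want.
database = {
    "brick": {"edge_density": 0.8, "plain_ratio": 0.1, "orientation": 0.0},
    "sand": {"edge_density": 0.2, "plain_ratio": 0.7, "orientation": 0.5},
    "fabric": {"edge_density": 0.6, "plain_ratio": 0.3, "orientation": 0.9},
}
print(rank({"edge_density": 0.75}, {}, database))
```

Because unspecified features contribute nothing to the distance, a user can start with a one-feature query and add or reweight features in later rounds, which matches the gradual refinement described in the abstract.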