ISBN: (Print) 0819420441
Indexing and editing digital video directly in the compressed domain offer many advantages in terms of storage efficiency and processing speed. We have designed automatic compressed-domain tools for extracting key visual features such as scene cuts, dissolves, camera operations (zoom, pan), and moving objects (detection and tracking). In addition, we have developed algorithms that solve the decoder buffer control problems and allow users to 'cut, copy and paste' arbitrary compressed video segments directly in the compressed domain. Because the compressed-domain approach does not require full decoding, fast software implementations can be achieved. Our compressed video editing techniques enhance the reusability of existing compressed videos.
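A minimal sketch of one common compressed-domain idea touched on above: comparing histograms of DCT DC coefficients between adjacent frames to flag scene cuts. It assumes the DC terms have already been extracted from the bitstream and is not the authors' exact algorithm; the function names, bin counts, and threshold are illustrative only.

import numpy as np

def dc_histogram(dc_values, bins=64, value_range=(0.0, 255.0)):
    """Normalized histogram of a frame's DC (block-average intensity) terms."""
    hist, _ = np.histogram(dc_values, bins=bins, range=value_range)
    hist = hist.astype(float)
    return hist / max(hist.sum(), 1.0)

def detect_scene_cuts(frames_dc, threshold=0.5):
    """Flag a cut wherever adjacent DC histograms differ strongly (L1 distance).
    `frames_dc` is a list of per-frame DC coefficient arrays; threshold is illustrative."""
    cuts = []
    prev = dc_histogram(frames_dc[0])
    for i, dc in enumerate(frames_dc[1:], start=1):
        cur = dc_histogram(dc)
        if np.abs(cur - prev).sum() > threshold:
            cuts.append(i)
        prev = cur
    return cuts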
ISBN: (Print) 0819420441
This paper studies the impact of different disk array configurations on the size of the data buffer required to support video-on-demand. The study is based on a general two-level hierarchical disk array structure that provides both parallelism and concurrency. It reveals that, for disk arrays with the same total number of disks, a higher degree of parallelism requires a larger minimum data buffer. This result provides valuable insight into how to organize the disk array in order to minimize system cost.
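As a rough illustration of the abstract's claim, not the paper's analytical model, consider a toy striping model in which each read is spread over `parallelism` disks, so a stream must buffer at least one stripe while the next is fetched; the block size and the double-buffering assumption are hypothetical.

def min_buffer_per_stream(num_disks, parallelism, block_size=64 * 1024):
    """Toy model (not the paper's analysis): the array forms
    num_disks // parallelism concurrent groups; each read is striped over
    `parallelism` disks, so a stream holds two stripes (one being consumed
    while the next is fetched)."""
    assert num_disks % parallelism == 0
    stripe = parallelism * block_size      # data delivered by one striped read
    return 2 * stripe                      # double buffering

# With 16 disks in total, higher parallelism -> larger per-stream buffer:
for p in (1, 2, 4, 8, 16):
    print(p, min_buffer_per_stream(16, p))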
ISBN: (Print) 0819420441
This paper describes an algorithm for searching image databases for images that match a specified pattern. The intended application is a query system for a large library of digitized satellite images. The algorithm has two thresholds that allow the user to adjust the closeness of a match independently: one threshold controls an intensity match and the other controls a texture match. The thresholds apply to correlations that can be computed efficiently in the Fourier-transform domain of an image, and that are particularly cheap to compute when the Fourier coefficients are mostly zero. The scheme therefore works well with image-compression algorithms that replace small Fourier coefficients with zeros. For compressed images, the majority of the processing cost lies in computing the inverse transforms, plus a few operations per pixel for nonlinear threshold operations. The quality of retrieval for this algorithm has not been evaluated at this writing. We show the use of this technique on a typical satellite image. The technique may be suitable for automatically identifying cloud-free images, making crude classifications of land use, and finding isolated features that have unique intensity and texture characteristics. We discuss how to generalize the algorithm from matching gray-scale intensity to color or multispectral images.
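A hedged numpy sketch of the general approach: pattern matching via cross-correlation computed with FFTs, gated by two independently adjustable thresholds. The gradient-magnitude texture proxy and the peak normalization are assumptions, since the abstract does not define them.

import numpy as np

def xcorr_peak(image, pattern):
    """Peak normalized cross-correlation of `pattern` against `image`,
    computed in the Fourier domain."""
    img = (image - image.mean()) / (image.std() + 1e-8)
    pat = (pattern - pattern.mean()) / (pattern.std() + 1e-8)
    F = np.fft.fft2(img)
    P = np.fft.fft2(pat, s=img.shape)            # zero-pad pattern to image size
    corr = np.fft.ifft2(F * np.conj(P)).real
    return corr.max() / pattern.size             # rough normalization (~1 for a perfect match)

def texture(image):
    """Crude texture proxy (an assumption, not the paper's definition):
    local gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)

def matches(image, pattern, t_intensity=0.5, t_texture=0.5):
    """Accept only if both the intensity and the texture correlations clear
    their (independently adjustable) thresholds."""
    return (xcorr_peak(image, pattern) >= t_intensity and
            xcorr_peak(texture(image), texture(pattern)) >= t_texture)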
ISBN: (Print) 0819420441
Successful retrieval of images by shape features is likely to be achieved only if we can mirror human similarity judgments. Following Biederman's theory of recognition-by-components, we postulate that shape analysis for retrieval should characterize an image by identifying properties such as collinearity, shape similarity and proximity in component boundaries. Such properties can then be used to group image components into families, from which indexing features can be derived. We are currently applying these principles in the development of the ARTISAN shape retrieval system for the UK Patent Office. The trademark images, supplied in compressed bit-map format, are processed using standard edge-extraction techniques to derive a set of region boundaries, which are approximated as sequences of straight-line and circular-arc segments. These are then grouped into families using criteria such as proximity and shape similarity. Shape features for retrieval are then extracted from the image as a whole, from each boundary family, and from each individual boundary. Progress to date with the project is analyzed, evaluation plans are described, and possible future directions for the research are discussed.
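A simplified sketch of one ARTISAN-style grouping criterion, proximity of boundary-segment endpoints, using union-find; the real system also uses shape similarity and collinearity, and the segment representation and gap threshold here are assumptions.

import math

def group_by_proximity(segments, max_gap=5.0):
    """Group boundary segments whose endpoints lie within `max_gap` pixels.
    Each segment is ((x1, y1), (x2, y2)). Returns a family id per segment."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]       # path compression
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    def close(a, b):
        return math.dist(a, b) <= max_gap

    for i, (a1, a2) in enumerate(segments):
        for j, (b1, b2) in enumerate(segments[:i]):
            if any(close(p, q) for p in (a1, a2) for q in (b1, b2)):
                union(i, j)
    return [find(i) for i in range(len(segments))]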
ISBN: (Print) 0819420441
The major problem facing video databases is content characterization of video clips once the cut boundaries have been determined. Current efforts in this direction focus exclusively on pictorial information, thereby neglecting an important supplementary source of content information: the embedded audio or sound track. Current research in audio processing can be readily applied to create many different video indices for use in Video on Demand (VOD), educational video indexing, sports video characterization, etc. MPEG is an emerging video and audio compression standard with rapidly increasing popularity in the multimedia industry, and compressed bit-stream processing has gained recognition among researchers. We have previously demonstrated feature extraction from MPEG-compressed video, implementing a majority of scene-change detection schemes directly on the compressed stream. In this paper, we examine the potential of audio information for content characterization by demonstrating the extraction of widely used audio-processing features directly from the compressed data stream and their application to video clip classification.
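The abstract does not list the exact audio features, so the following is only a plausible sketch: short-time energy and a crude spectral centroid computed per frame from MPEG audio subband samples that are assumed to be already parsed from the compressed stream; the bitstream parsing and the classification thresholds are illustrative.

import numpy as np

def frame_features(subband_samples):
    """Per-frame features from MPEG audio subband samples (assumed already
    parsed from the compressed stream). Input shape: (num_subbands, samples)."""
    s = np.asarray(subband_samples, dtype=float)
    band_energy = (s ** 2).sum(axis=1)
    energy = band_energy.sum()
    centroid = (np.arange(len(band_energy)) * band_energy).sum() / (energy + 1e-12)
    return energy, centroid

def classify_clip(frames, silence_thresh=1e-3):
    """Toy classification: silence vs. low-band-dominated (speech-like)
    vs. wide-band (music-like). Thresholds are illustrative only."""
    feats = [frame_features(f) for f in frames]
    mean_energy = np.mean([e for e, _ in feats])
    mean_centroid = np.mean([c for _, c in feats])
    if mean_energy < silence_thresh:
        return "silence"
    return "speech-like" if mean_centroid < 8 else "music/wide-band"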
ISBN: (Print) 0819420441
In this paper, we propose an algorithm based on vector quantization (VQ) for indexing video sequences in compressed form. In VQ, the image to be compressed is decomposed into L-dimensional vectors, and each vector is mapped onto one of a finite set of codewords (the codebook). Vectors are encoded in the intraframe mode using adaptive VQ, so each frame is represented by a set of labels and a codebook. We note that the codebook reflects the contents of the frame being compressed and that similar frames have similar codebooks. The labels are used for cut detection and to generate indices for storing and retrieving video sequences. The proposed technique provides fast access to the sequences in the database and, in addition, combines video compression and video indexing. Simulation results confirm the substantial gains of the proposed technique in comparison with other techniques reported in the literature.
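A minimal sketch of the "similar frames have similar codebooks" idea: compare consecutive frames' codebooks with a nearest-codeword distance and flag a cut where it jumps. Codebook training, adaptive VQ, and the threshold value are omitted or assumed here.

import numpy as np

def codebook_distance(cb_a, cb_b):
    """Average distance from each codeword in cb_a to its nearest codeword
    in cb_b; codebooks are (K, L) arrays of L-dimensional codewords."""
    d = np.linalg.norm(cb_a[:, None, :] - cb_b[None, :, :], axis=2)
    return d.min(axis=1).mean()

def detect_cuts(codebooks, threshold):
    """Declare a cut where consecutive frames' codebooks differ strongly
    (symmetrized nearest-codeword distance). `codebooks` is a list of
    per-frame (K, L) arrays; the threshold depends on the codeword scale."""
    cuts = []
    for i in range(1, len(codebooks)):
        a, b = codebooks[i - 1], codebooks[i]
        if 0.5 * (codebook_distance(a, b) + codebook_distance(b, a)) > threshold:
            cuts.append(i)
    return cuts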
ISBN: (Print) 0819420441
Multimedia interfaces increase the need for large image databases capable of storing and reading streams of data with strict synchronicity and isochronicity requirements. In order to fulfill these requirements, we consider a parallel image server architecture that relies on arrays of intelligent disk nodes, each disk node being composed of one processor and one or more disks. This contribution analyzes, through bottleneck performance evaluation and simulation, the behavior of two multi-processor multi-disk architectures: a point-to-point architecture and a shared-bus architecture similar to current multiprocessor workstation architectures. We compare the two architectures on the basis of two multimedia algorithms: compute-bound frame resizing by resampling and data-bound disk-to-client stream transfer. The results suggest that the shared bus is a potential bottleneck despite its very high hardware throughput (400 Mbytes/s) and that an architecture with addressable local memories located close to their respective processors could partially remove this bottleneck. The point-to-point architecture is scalable and able to sustain high throughputs for simultaneous compute-bound and data-bound operations.
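A toy bottleneck estimate, not the paper's evaluation: the number of sustainable streams is limited by aggregate disk bandwidth, aggregate node processing, and, for the shared-bus variant, the single bus. Only the 400 Mbytes/s bus figure comes from the abstract; the other rates are hypothetical.

def max_streams(num_nodes, disk_mb_s=5.0, node_cpu_mb_s=8.0,
                bus_mb_s=400.0, stream_mb_s=1.5, shared_bus=True):
    """Toy bottleneck model: sustainable streams = min of the capacity
    limits divided by the per-stream rate. All rates except the bus
    figure are hypothetical."""
    limits = [num_nodes * disk_mb_s, num_nodes * node_cpu_mb_s]
    if shared_bus:
        limits.append(bus_mb_s)        # 400 Mbytes/s figure from the abstract
    return int(min(limits) / stream_mb_s)

# Point-to-point scales with the number of nodes; the shared bus saturates:
for n in (8, 64, 256, 512):
    print(n, max_streams(n, shared_bus=False), max_streams(n, shared_bus=True))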
ISBN: (Print) 0819420441
Dissimilarity measures, the basis of similarity-based retrieval, can be viewed as distances, and a similarity-based search as a nearest-neighbor search. Though there has been extensive research on data structures and search methods to support nearest-neighbor searching, these indexing and dimension-reduction methods are generally not applicable to non-coordinate data and non-Euclidean distance measures. In this paper we reexamine and extend previous work by other researchers on best-match searching based on the triangle inequality. These methods can be used to organize both non-coordinate data and non-Euclidean metric similarity measures. The effectiveness of the indexes depends on the actual dimensionality of the feature set, the data, and the similarity metric used. We show that these methods provide significant performance improvements and may be of practical value in real-world databases.
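A minimal pivot-based sketch of best-match searching with triangle-inequality pruning, applicable to any metric (coordinate-free data, non-Euclidean distances). The pivot selection and data layout are illustrative, not necessarily the organization studied in the paper.

import random

def build_index(items, distance, num_pivots=8, seed=0):
    """Precompute the distance from every item to a few pivot items."""
    random.seed(seed)
    pivot_ids = random.sample(range(len(items)), min(num_pivots, len(items)))
    pivots = [items[i] for i in pivot_ids]
    table = [[distance(x, p) for p in pivots] for x in items]
    return pivots, table

def nearest(query, items, distance, index):
    """Best-match search: the triangle inequality gives the lower bound
    |d(q, pivot) - d(x, pivot)| <= d(q, x), so items whose bound already
    exceeds the best distance found so far are skipped without calling
    the (possibly expensive, non-Euclidean) distance function."""
    pivots, table = index
    q_to_p = [distance(query, p) for p in pivots]
    best_i, best_d = None, float("inf")
    for i, x in enumerate(items):
        lower = max(abs(qp - xp) for qp, xp in zip(q_to_p, table[i]))
        if lower >= best_d:
            continue                   # pruned by the triangle inequality
        d = distance(query, x)
        if d < best_d:
            best_i, best_d = i, d
    return items[best_i], best_d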
ISBN: (Print) 0819420441
Many algorithms have been proposed for detecting video shot boundaries and classifying shot and shot-transition types. Few published studies compare the available algorithms, and those that do have looked at a limited range of test material. This paper presents a comparison of several shot boundary detection and classification techniques and their variations, including histogram, discrete cosine transform, motion vector, and block matching methods. The performance and the ease of selecting good thresholds for these algorithms are evaluated on a wide variety of video sequences with a good mix of transition types. Threshold selection requires a trade-off between recall and precision that must be guided by the target application.
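A small illustration of the recall/precision trade-off mentioned above: a helper that scores detected boundaries against ground truth within a frame tolerance. The tolerance and the example boundary lists are hypothetical; a low detector threshold tends to raise recall at the cost of precision, and a high one does the reverse.

def recall_precision(detected, ground_truth, tolerance=1):
    """Recall and precision of detected shot boundaries against ground truth,
    counting a detection as correct if it lands within `tolerance` frames."""
    hits = sum(any(abs(d - g) <= tolerance for d in detected) for g in ground_truth)
    recall = hits / len(ground_truth) if ground_truth else 1.0
    correct = sum(any(abs(d - g) <= tolerance for g in ground_truth) for d in detected)
    precision = correct / len(detected) if detected else 1.0
    return recall, precision

# Illustrative frame indices only:
detected = [10, 45, 46, 120]
truth = [10, 47, 118, 200]
print(recall_precision(detected, truth))   # -> (0.5, 0.5)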
ISBN: (Print) 0819420441
Multimedia information is now routinely available in the form of text, pictures, animation and sound. Although text objects are relatively easy to deal with (in terms of information search and retrieval), other information-bearing objects (such as sound, images and animation) are more difficult to index. Our research is aimed at developing better ways of representing multimedia objects by using a conceptual representation based on Schank's conceptual dependencies. Moreover, the representation allows users' individual interpretations to be embedded in the system. This alleviates the problems associated with traditional semantic networks by allowing for the coexistence of multiple views of the same information. The viability of the approach is tested, and preliminary results are reported.