The color hologram of an image has been widely used as a feature descriptor for the image in content-based retrieval applications. In this paper, some results from our investigation efforts into to usage are reported....
详细信息
The color hologram of an image has been widely used as a feature descriptor for the image in content-based retrieval applications. In this paper, some results from our investigation efforts into to usage are reported. We outline three typical color space quantization schemes used in our experiments and introduce the soft-decision histogramming method to eliminate the discontinuity problem in traditional color histogram population process. Then, to improve the effectiveness of color histogram based retrieval algorithms, several similarity metrics are proposed for comparing color histograms, including three special forms of the Kantorovich metric.
With the rapid growth of multimedia information in forms of digital image and video libraries, there is an increasing need for intelligent database management tools with an efficient information retrieval system. For ...
详细信息
With the rapid growth of multimedia information in forms of digital image and video libraries, there is an increasing need for intelligent database management tools with an efficient information retrieval system. For this purpose, we propose a hierarchical retrieval system where shape, color and motion characteristics of human body are captured in compressed and uncompressed domains. The proposed retrieval method provides human detection and activity recognition at different resolution levels from low complexity to low false rates and connects low level features to high level semantics by developing relational object and activity presentations. The available information of standard video compression algorithms are used in order to reduce the amount of time and storage needed for the information retrieval. The principal component analysis is used for activity recognition using MPEG motion vectors and results are presented for walking, kicking, and running to demonstrate that the classification among activities is clearly visible. For low resolution and monochrome images it is demonstrated that the structural information of human silhouettes can be captured from AC-DCT coefficients. The system performance is tested on 40 images that contain a total of 126 nonoccluded frontal poses and the algorithm can detect 101 of them correctly. The finest details in the images and video sequences are obtained from the uncompressed domain via model based segmentation and graph matching for an in depth analysis of human bodies. The detection rate for human body parts is 70.27% for images and sequences including human body regions at different resolutions and with different postures.
The amount of digitized video in video archives is becoming so huge that easier access and content browsing tools are desperately needed. Also, video is no longer one big piece of data but a collection of useful small...
详细信息
The amount of digitized video in video archives is becoming so huge that easier access and content browsing tools are desperately needed. Also, video is no longer one big piece of data but a collection of useful smaller building blocks that can be accessed and used independently from the original context of presentation. In this paper, we demonstrate a content model for audio-video sequences with the purpose of enabling the automatic generation of video summaries. The model is based on descriptors which indicate various properties and relations of audio and video segments. In practice, these descriptors could either be generated automatically by analysis methods, or produced manually (or computer-assisted) by the content provider himself. We analyze the requirements and characteristics of the different data segments with respect to the problem of summarization, and we define our model as a set of constraints which allow to produce good quality summaries.
We present a fast algorithm for computing the singular value decomposition (SVD) of a matrix consisting of the frames from a video sequence. The computational efficiency of this algorithm derives from the observation ...
详细信息
We present a fast algorithm for computing the singular value decomposition (SVD) of a matrix consisting of the frames from a video sequence. The computational efficiency of this algorithm derives from the observation that portions of a video sequence will consist of sets of correlated frames. We then show that the information obtained from the SVD can be used to analyze video sequences to obtain information such as scene breaks, scene query, reduced-order shot representation and key frame determination. We illustrate this approach on several video sequences.
video Segmentation plays an integral role in many multimedia applications such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for s...
详细信息
video Segmentation plays an integral role in many multimedia applications such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for segmentation of video have appeared in the past few years. Most of these algorithms perform well on cuts, but yield poor performance on gradual transitions or special effect edits. A complete video segmentation system must achieve good performance on special effect edit detection also. In this paper, we discuss the performance of our videoTrails based algorithms with other existing special effect edit detection algorithms in literature. Results from experiments testing for the ability to detect edits from TV programs ranging from commercials to news magazine programs, and also diverse special effect edits introduced by us have been shown.
Advanced visual information retrieval systems supporting both video and images need to have flexible system design so that their system configurations can easily be enhanced. It is therefore desirable to separate the ...
详细信息
ISBN:
(纸本)0819411418
Advanced visual information retrieval systems supporting both video and images need to have flexible system design so that their system configurations can easily be enhanced. It is therefore desirable to separate the features of a central system into three parts: storage servers, communication servers, and a back-end network that combines these. In this architecture, unscheduled arrivals of data blocks at a back-end network cause two problems: unacceptable fluctuation of video frames and overly long delays of image transfer. To solve these problems, we have designed a new multimedia integrated switching system (MISS) that uses a fully connected crossbar switch to combine servers. MISS treats a time interval of a few hundred microseconds (called a `time-slot') as the basic unit of data block transfer, and allocates appropriate time-slots to all transfer requests in order to simultaneously meet the requirements for each kind of visual information transfer. According to simulation results and estimates based on queuing theory, MISS greatly reduces video frame fluctuation and halves the average image transfer delay. These effects have been confirmed in an experimental visual communication system built around MISS. This system supports JPEG compressed video and images, and six terminals can simultaneously retrieve visual information through an FDDI network.
With the advent of pervasive computing, a growing diversity of client devices is gaining access to audio-visual content. The increased variability in client device processing power, storage, bandwidth, and server load...
详细信息
ISBN:
(纸本)0780365364
With the advent of pervasive computing, a growing diversity of client devices is gaining access to audio-visual content. The increased variability in client device processing power, storage, bandwidth, and server loading require adaptive solutions for image, video and audio retrieval. Progressive retrieval is one prominent mode of access in which views at different resolutions are incrementally retrieved and refined over time. In this paper, we present a new framework for adaptively partitioning the synthesis operations in progressive retrieval of audio-visual signals. The framework considers that the server and client cooperate in synthesizing the views in order to best utilize the available processing power and bandwidth. We provide experimental results that demonstrate a significant reduction in latency in the progressive retrieval of images under different conditions of the client, server and network.
This paper discusses a novel data placement scheme which optimizes the storage utilization of a NVOD system. The scheme is most distinctive in the following two aspects: 1. It considers Me file blocks placement of pro...
详细信息
ISBN:
(纸本)0819424331
This paper discusses a novel data placement scheme which optimizes the storage utilization of a NVOD system. The scheme is most distinctive in the following two aspects: 1. It considers Me file blocks placement of programs featured different number NVOD channels. 2. The file blocks grouping scheme optimizes the storage utilization of a NVOD system.
In this paper videos are analyzed to get a content-based decription of the video. The structure of a given video is useful to index long videos efficiently and automatically. A comparison between shots gives an overvi...
详细信息
ISBN:
(纸本)0819427527
In this paper videos are analyzed to get a content-based decription of the video. The structure of a given video is useful to index long videos efficiently and automatically. A comparison between shots gives an overview about cut frequency, cut pattern, and scene bounds. After a shot detection the shots are grouped into clusters based on their visual similarity. A time-constraint clustering procedure is used to compare only those shots that are positioned inside a time range. Shots from different areas of the video (e.g., begin/end) are not compared. With this cluster information that contains a list about shots and their clusters it is possible to calculate scene bounds. A labeling of all clusters gives a declaration about the cut pattern. It is easy now to distinguish a dialogue from an action scene. The final content analysis is done by the imageMiner* system. The imageMiner system developed at the University of Bremen of the image Processing Department of the Center for Computing Technology realizes content-based imageretrieval for still images through a novel combination of methods and techniques of computer vision and artifical intelligence. The imageMiner system consists of three analysis modules for computer vision, namely for color, texture, and contour analysis. Additionally exists a module for object recognition. The output of the object recognition module can be indexed by a text retrieval system. Thus, concepts like forestscene may;be searched for. We combine the still image analysis with the results of the video analysis in order to retrieve shots or scenes.
In this paper we introduce our approach to multimedia database interfaces. Although we deal mainly with imagedatabases, most of the ideas we present can be generalized to other types of data. We argue that, when deal...
详细信息
In this paper we introduce our approach to multimedia database interfaces. Although we deal mainly with imagedatabases, most of the ideas we present can be generalized to other types of data. We argue that, when dealing with complex data, such as images, the problem of access must be redefined along different lines than text databases. In multimedia databases, the semantics of the data is imprecise, and depends in part on user's interpretation. This observation made us consider the development of interfaces in which the user explores the database rather than querying it. In this paper we give a brief justification of our position and present the exploratory interface that we have developed for our image database El nino.
暂无评论