ISBN: 0780362985 (print)
Video databases are very demanding systems in terms of the mass storage and computational resources required to perform common database operations such as browsing and retrieval. These operations can be simplified, in terms of both computational complexity and processing time, by performing them on a set of frames, called key frames, that represent the content units (shots) into which a video can be segmented. This contribution presents an adaptive key-frame extraction method based on wavelet-based multiresolution analysis in a perceptually uniform color space. Experimental results showing the effectiveness of the proposed technique in selecting key frames that summarize the video's content are provided.
In this work, we present a system for the automatic segmentation, indexing and retrieval of audiovisual data based on the combination of audio, visual and textual content analysis. The video stream is demultiplexed into audio, image and caption components. Then, a semantic segmentation of the audio signal based on audio content analysis is conducted, and each segment is indexed as one of the basic audio types. The image sequence is segmented into shots based on visual information analysis, and keyframes are extracted from each shot. Meanwhile, keywords are detected from the closed caption. Index tables are designed for both linear and non-linear access to the video. It is shown by experiments that the proposed methods for multimodal media content analysis are effective, and that the integrated framework achieves satisfactory results for video information filtering and retrieval.
Author: Power, GJ (USAF Res Lab, Wright Patterson AFB, OH 45433 USA)
ISBN: 0819437670 (print)
The transmission and storage of digital video currently require more bandwidth than is typically available. Emerging applications such as video-on-demand, web cameras, and collaborative tools with video conferencing are pushing the limits of the transmission media to provide video to the desktop computer. Lossy compression has succeeded in meeting some of the video demand, but it suffers from artifacts and low resolution. This paper introduces a content-dependent, frame-selective compression technique developed wholly as a preconditioner that can be used with existing digital video compression techniques. The technique depends heavily on a priori knowledge of the general content of the video, which it uses to make informed decisions about the frames selected for storage or transmission. The velocital information feature of each frame is calculated to determine the frames with the most active changes. This feature, together with a priori knowledge of the application, allows the frames to be prioritized. Frames are assigned priority values, and the higher-priority frames are selected for transmission based on the available bandwidth. The technique is demonstrated for two applications: an airborne surveillance application and a World Wide Web camera application. The airborne surveillance application acquires digital infrared video of targets at a standard frame rate of 30 frames per second, but the imagery suffers from infrared sensor artifacts and spurious noise. The web camera application selects frames at a slow rate but suffers from artifacts due to lighting and reflections. The results of using content-dependent, frame-selective video compression show improved image quality along with reduced transmission bandwidth requirements.
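The prioritization step described in this abstract can be illustrated with a minimal sketch. The abstract does not define the velocital information feature, so the code below substitutes the mean absolute pixel difference between consecutive frames as a stand-in activity score. The names `activity_scores` and `select_frames`, the flat-list frame representation, and the fixed per-frame cost are all hypothetical choices made for illustration, not the author's method.

```python
def activity_scores(frames):
    """Score each frame by mean absolute pixel change from the previous frame.

    Stand-in for the paper's velocital information feature (an assumption).
    `frames` is a list of equal-length lists of pixel intensities.
    """
    scores = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        scores.append(diff / len(cur))
    return scores


def select_frames(frames, budget, cost_per_frame=1):
    """Return indices of the highest-priority frames that fit the budget."""
    scores = activity_scores(frames)
    # Rank frame indices by descending score; the stable sort keeps
    # temporal order among frames with equal scores.
    ranked = sorted(range(len(frames)), key=lambda i: -scores[i])
    keep = ranked[: budget // cost_per_frame]
    return sorted(keep)  # restore temporal order for transmission


# Toy 4-pixel frames: changes occur at frames 2 and 4.
frames = [[0, 0, 0, 0], [0, 0, 0, 0], [9, 9, 0, 0], [9, 9, 0, 0], [9, 9, 9, 9]]
selected = select_frames(frames, budget=2)
```

With a budget of two frames, the sketch keeps the two frames where the scene changes. A real system would presumably also weight the scores by the a priori application knowledge the abstract mentions, which is not modeled here.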
Due to the huge number of potentially interesting documents available over the Internet, searching for relevant information has become very difficult. Since images and video are a major source of these data, grouping images into (semantically) meaningful categories using low-level visual features is an important (and challenging) problem in content-based image retrieval. Using Bayesian classifiers, we attempt to capture high-level concepts from low-level image features. Specifically, we have developed Bayesian classifiers for semantic image classification (indoor vs. outdoor, city vs. landscape, and sunset vs. forest vs. mountain), image orientation detection, and object detection (detecting regions of sky and vegetation in outdoor images). We demonstrate that a small codebook (the optimal codebook size is selected using a modified MDL criterion) extracted from a learning vector quantizer can be used to estimate the class-conditional densities of the observed features needed for image classification. We have developed an incremental learning paradigm, a feature selection scheme, a rejection scheme, and a classifier combination strategy using bagging to improve classifier performance. Empirical results on a large database (∼24,000 images) show that semantic categorization and organization of the database using the proposed classification schemes improve both retrieval accuracy and efficiency.
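The decision rule behind a Bayesian classifier with a reject option can be sketched as follows. This is a generic illustration of Bayes' rule, not the authors' implementation: the class-conditional likelihoods are passed in as plain numbers (the paper estimates them from a vector-quantizer codebook, which is not reproduced here), and the function names and the 0.7 rejection threshold are assumptions.

```python
def posteriors(likelihoods, priors):
    """Apply Bayes' rule: P(class | x) is proportional to p(x | class) * P(class)."""
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # p(x), assumed non-zero here
    return {c: joint[c] / evidence for c in joint}


def classify(likelihoods, priors, reject_below=0.7):
    """Return the maximum a posteriori class, or None when confidence is low.

    The threshold value is an assumption; the abstract does not state
    how its rejection scheme is defined.
    """
    post = posteriors(likelihoods, priors)
    best = max(post, key=post.get)
    return best if post[best] >= reject_below else None


priors = {"indoor": 0.5, "outdoor": 0.5}
confident = classify({"indoor": 0.08, "outdoor": 0.02}, priors)
ambiguous = classify({"indoor": 0.05, "outdoor": 0.04}, priors)
```

In the first call the indoor posterior is about 0.8, so the image is labeled; in the second it is about 0.56, so the sketch declines to classify rather than risk an error.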
Rapid advances in storage and networking technology have enabled the support of video-on-demand applications over computer networks. As time goes on, the modes by which users can select the movie they wish to watch are likely to increase dramatically. A multimedia system supporting user interaction must be able to efficiently retrieve the movies that satisfy the user's requirements. Furthermore, the user should have the option of watching a particular segment of a movie and of seeing critical reviews of it; such reviews may help the user select a movie to watch. A successful multimedia database management system (MMDBMS) must be able to retrieve media objects (i.e., video) from a local storage device in a smooth, jitter-free manner. Because video data often occupy a large amount of space, new techniques are required for fast retrieval and efficient use of storage. In the present work, a new approach for fast retrieval of any particular segment is developed. R-trees were originally used for image retrieval. We extend this approach to video data representation under the name R-segment tree representation. These databases are used for efficient retrieval of a particular activity in any video. The user can interact with the database through a graphical user interface. Results show that the proposed technique can randomly retrieve the desired activity of any video with higher retrieval efficiency. The program implementing this technique was developed in Visual Basic.
ISBN: 0780365364 (print)
Besides traditional applications (e.g., CAD/CAM and trademark registry), new multimedia applications such as structured video, animation, and the MPEG-7 standard require the storage and management of well-defined objects. We focus on shape-based object retrieval and conduct a comparison study of four such techniques: FD, GB, DT, and MBC-TPVAS. Our results show that the similarity retrieval accuracy of our method (TPVAS) is as good as that of the other methods, while it has the lowest computation cost for generating the shape signatures of the objects. Moreover, it has a low storage requirement and a comparable computation cost for computing the similarity between two shape signatures. In addition, TPVAS requires no normalization of the objects and is the only method that directly supports RST query types. In this paper, we also introduce a new shape description taxonomy.
This paper presents a method for recognizing human posture from a single image. We first segment an image into homogeneous regions and extract curve segments corresponding to human body parts. Each body part is treated as a 2D ribbon. From the smooth curve segments in skin regions, 2D ribbons are extracted and a human body model is constructed. We assign a predefined posture type to the image according to the constructed body model. When the user submits a query to retrieve images containing a human in a specific posture, the system converts the query to a body model. This body model is compared with the body models of the target images saved in local storage, and images that match well are retrieved. When a face detection result is available for the given image, it is also used to increase the reliability of the body model. For a given query posture, our system retrieves images of the corresponding posture. As another application, the proposed method provides an initial location of a human body to track in a video sequence.
In this paper we propose an interactive tool for generating dynamic markers for video objects in a distributed video content discussion environment. We address interactive video object selection and real-time video object marker generation, which is supported by an automatic object tracking method. The proposed system satisfies the following criteria: (i) automatic object tracking has to run in real time; (ii) video object selection has to be carried out with minimal effort and knowledge; (iii) the user has to be notified by the system when the automatic object tracking method encounters problems; and (iv) interactive rectification of the object marker has to be instantaneous and direct. Our experimental results indicate that the proposed tool is very effective and intuitive in creating dynamic object markers for video content on the fly. The automatic object tracking method yields reliable results on a desktop PC in real time, even with a busy background and/or partial occlusion.
Video segmentation is an important step in many video applications. We observe that the video shot boundary is a multi-resolution edge phenomenon in the feature space. Based on this observation, we have developed a novel temporal multi-resolution analysis (TMRA) based algorithm that uses Canny wavelets to perform temporal video segmentation. Information across multiple resolutions is used to help detect as well as locate abrupt and gradual transitions. We present the theoretical basis of the algorithm, followed by the implementation and the results. In this paper the TMRA technique has been implemented using color histograms in the raw domain and DCT coefficients in compressed video streams as the feature space. Experimental results show that this method can detect as well as characterize both abrupt and gradual shot boundaries. The technique also shows good noise tolerance.
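The idea that a shot boundary looks like an edge at several temporal resolutions can be shown with a small sketch. It is not the TMRA algorithm: a Haar-like difference of means replaces the Canny wavelet, the feature is a single scalar per frame rather than a color histogram or DCT coefficients, and the function names, the two scales, and the 0.4 threshold are assumptions chosen for illustration.

```python
def edge_response(signal, t, scale):
    """Absolute difference between the mean after and the mean before time t.

    A Haar-like step detector at the given temporal scale, used here as a
    simple stand-in for the Canny wavelet named in the abstract.
    The caller must ensure scale <= t <= len(signal) - scale.
    """
    before = signal[t - scale:t]
    after = signal[t:t + scale]
    return abs(sum(after) / scale - sum(before) / scale)


def classify_transition(signal, t, fine=1, coarse=4, threshold=0.4):
    """Label position t as 'abrupt', 'gradual', or None (no boundary)."""
    if edge_response(signal, t, fine) >= threshold:
        return "abrupt"   # strong even at the finest resolution
    if edge_response(signal, t, coarse) >= threshold:
        return "gradual"  # visible only at the coarser resolution
    return None


# One scalar feature per frame (for example, a histogram statistic).
cut = [0.0] * 6 + [1.0] * 6                                   # hard cut at t = 6
fade = [0.0] * 2 + [i / 8 for i in range(1, 9)] + [1.0] * 2   # 8-frame fade
flat = [0.2] * 12                                             # no transition

labels = [classify_transition(s, 6) for s in (cut, fade, flat)]
```

A hard cut produces a large response at every scale, whereas a fade changes too little between adjacent frames to register at the fine scale and only emerges once several frames are averaged. Comparing responses across scales is what lets a detector both find a transition and tell which kind it is.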
This paper addresses key-frame selection for content-based video indexing and access. The proposed key-frame selection method is designed to operate in real time irrespective of the available computational resources and memory. Hence, we provide three solutions to content-based key-frame selection with different costs, and suggest three operation levels. The suggested key-frame selection method has two major parts: (i) segmentation of the video into shots; and (ii) analysis of the motion and color activity within each video shot to select additional frames. We also provide a new color-based approach to key-frame selection and discuss how to fuse color-based and motion-based key-frame selection results.
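A color-driven selection of additional key frames within one shot might look like the sketch below. It covers only a simple color criterion and is not the paper's approach: the grayscale histogram, the L1 distance, the 0.5 threshold, and the names `histogram`, `l1_distance`, and `key_frames` are all assumptions, and motion analysis and the fusion step are left out.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat pixel list."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // max_value] += 1
    return [c / len(frame) for c in counts]


def l1_distance(h1, h2):
    """Sum of absolute bin differences; ranges from 0 to 2 for normalized inputs."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def key_frames(shot, threshold=0.5):
    """Keep the first frame, then any frame that drifts far from the last key frame."""
    keys = [0]
    ref = histogram(shot[0])
    for i in range(1, len(shot)):
        h = histogram(shot[i])
        if l1_distance(ref, h) > threshold:
            keys.append(i)
            ref = h  # later frames are compared against the newest key frame
    return keys


# Toy shot of four 4-pixel frames that brightens over time.
shot = [[10, 20, 30, 40], [12, 22, 28, 41], [200, 210, 30, 40], [220, 230, 240, 250]]
keys = key_frames(shot)
```

Comparing each frame with the most recent key frame, rather than with its immediate predecessor, lets slow cumulative changes eventually trigger a new key frame. The cost is one histogram per frame, which suits the low-cost operation levels the abstract mentions.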