Rapid advances in storage and networking technology have made it practical to support video-on-demand applications over computer networks. Over time, the ways in which users can select the movie they wish to watch are likely to increase dramatically. A multimedia system that supports user interaction must be able to efficiently retrieve the movies that satisfy the user's requirements. Furthermore, the user should have the option of watching a particular segment of a movie and of seeing critical reviews of it; such reviews may help the user select a movie to watch. A successful multimedia database management system (MMDBMS) must be able to retrieve a media object (i.e., video) from a local storage device in a smooth, jitter-free manner. Because video data often occupy a large amount of space, new techniques are required for fast retrieval and efficient use of storage. In the present work, a new approach for fast retrieval of any particular segment is developed. R-trees have mainly been used for image retrieval. We extend this approach to video data representation under the name R-segment tree representation. These databases are used for efficient retrieval of a particular activity in any video. The user interacts with the database through a graphical user interface. Results show that the proposed technique can retrieve the desired activity of any video by random access with higher retrieval efficiency. The program implementing this technique was developed in Visual Basic.
This paper presents a method for recognizing human posture from a single image. We first segment an image into homogeneous regions and extract curve segments corresponding to human body parts. Each body part is considered as a 2D ribbon. From the smooth curve segments in skin regions, 2D ribbons are extracted and a human body model is constructed. We assign a predefined posture type to the image according to the constructed body model. When a user submits a query to retrieve images containing a human in a specific posture, the system converts the query to a body model. This body model is compared with the body models stored locally for the target images, and the images that match well are retrieved. When a face detection result is available for the given image, it is also used to increase the reliability of the body model. Given a query posture, our system retrieves images of the corresponding posture. As another application, the proposed method provides the initial location of a human body to track in a video sequence.
Video segmentation is an important step in many video applications. We observe that a video shot boundary is a multi-resolution edge phenomenon in the feature space. Based on this observation, we have developed a novel temporal multi-resolution analysis (TMRA) algorithm that uses Canny wavelets to perform temporal video segmentation. Information across multiple resolutions is used to help detect and locate both abrupt and gradual transitions. We present the theoretical basis of the algorithm, followed by the implementation and the results. In this paper, the TMRA technique is implemented using color histograms in the raw domain and DCT coefficients in compressed video streams as the feature space. Experimental results show that this method can detect and characterize both abrupt and gradual shot boundaries. The technique also shows good noise tolerance.
This paper addresses key-frame selection for content-based video indexing and access. The proposed key-frame selection method is designed to operate in real time irrespective of the available computational resources and memory. Hence, we provide three solutions to content-based key-frame selection with different costs, and suggest three operation levels. The suggested key-frame selection method has two major parts: (i) segmentation of the video into shots; (ii) analysis of the motion and color activity within each video shot to select additional frames. We also provide a new color-based approach to key-frame selection and discuss how to fuse color-based and motion-based key-frame selection results.
Tools for efficient and intelligent management of digital content are essential for digital video data management. An extremely challenging research area in this context is that of multimedia analysis and understanding. The capabilities of audio analysis in particular for video data management are yet to be fully exploited. We present a novel scheme for indexing and segmentation of video by analyzing the audio track. This analysis is then applied to the segmentation and indexing of movies. We build models for some interesting events in the motion picture soundtrack. The models built include music, human speech and silence. We propose the use of hidden Markov models to model the dynamics of the soundtrack and detect audio-events. Using these models we segment and index the soundtrack. A practical problem in motion picture soundtracks is that the audio in the track is of a composite nature. This corresponds to the mixing of sounds from different sources. Speech in foreground and music in background are common examples. The coexistence of multiple individual audio sources forces us to model such events explicitly. Experiments reveal that explicit modeling gives better results than modeling individual audio events separately.
This paper proposes a real-time storage and simultaneous retrieval tool that can be used to access a video database to capture and index all the images that the video camera records. Using this tool, we can register video streams on the spot. Surveillance and patrol work in the electric power industry can be aided effectively in this way. Our tool has two main features: (1) automatic segmentation of video streams containing irregular transitions, and (2) rapid indexing of segmented video scenes using local color patterns. Our prototype tool can segment, index and retrieve video streams at 5 frames/s.
In this paper we present a new descriptor for the spatial distribution of motion activity in video sequences. We use the magnitude of the motion vectors as a measure of the intensity of motion activity in a macro-block. We construct a matrix Cmv consisting of the magnitudes of the motion vector for each macro-block of a given P frame. We compute the average magnitude of the motion vector per macro-block, Cavg, and then use Cavg as a threshold on the matrix Cmv by setting the elements of Cmv that are less than Cavg to zero. We classify the runs of zeroes into three categories based on length, and count the number of runs of each category in the matrix Cmv. Our activity descriptor for a frame thus consists of four parameters: the average magnitude of the motion vectors and the numbers of runs of short, medium and long length. Since the feature extraction is simple and takes place in the compressed domain, it is extremely fast. We have tested it on the MPEG-7 test content set, which consists of approximately 14 hours of MPEG-1 encoded video content of different kinds. We find that our descriptor enables fast and accurate indexing of video. It is robust to noise and to changes in encoding parameters such as frame size, frame rate, encoding bit rate and encoding format. It is a low-level non-semantic descriptor that gives semantic matches within the same program, and is thus very suitable for applications such as video program browsing. We also find that indirect and computationally simpler measures of the magnitude of the motion vectors, such as the bits taken to encode the motion vectors, can also be used in our run-length framework, though they are less effective.
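The four-parameter descriptor described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the scan order or the run-length boundaries, so the raster scan and the `short_max` and `medium_max` thresholds are assumptions, and the function name is hypothetical.

```python
from typing import List, Tuple


def motion_activity_descriptor(
    cmv: List[List[float]],
    short_max: int = 2,
    medium_max: int = 5,
) -> Tuple[float, int, int, int]:
    """Return (c_avg, n_short, n_medium, n_long) for one P frame.

    cmv holds one motion-vector magnitude per macro-block.
    short_max and medium_max are hypothetical run-length boundaries.
    """
    # Flatten in raster-scan order (an assumption, not from the abstract).
    flat = [m for row in cmv for m in row]
    if not flat:
        raise ValueError("cmv must contain at least one macro-block")

    # Average magnitude per macro-block, used as the threshold.
    c_avg = sum(flat) / len(flat)

    # Elements below the average are set to zero.
    thresholded = [m if m >= c_avg else 0.0 for m in flat]

    counts = [0, 0, 0]  # short, medium, long
    run = 0
    # A non-zero sentinel closes a run that reaches the end of the frame.
    for value in thresholded + [1.0]:
        if value == 0.0:
            run += 1
        elif run > 0:
            if run <= short_max:
                counts[0] += 1
            elif run <= medium_max:
                counts[1] += 1
            else:
                counts[2] += 1
            run = 0
    return c_avg, counts[0], counts[1], counts[2]


# Toy 4x4 grid of macro-block motion-vector magnitudes.
example = [
    [0, 0, 5, 0],
    [0, 0, 0, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 10],
]
descriptor = motion_activity_descriptor(example)
```

For the toy grid the average is 1.9375, so only the magnitudes 5, 9 and 10 survive thresholding. The zero runs then have lengths 2, 4 and 7, which with the assumed boundaries count as one short, one medium and one long run.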
ISBN: 0819435902 (print)
Histograms are the most widely used representation for the color content of images and video. An elaborate representation of a histogram requires specifying the color centers of the histogram bins and the count of image pixels with each color. Such an elaborate representation, though expressive, may not be necessary for some tasks in image search, filtering and retrieval. A qualitative representation of the histogram is sufficient for many applications. Such a representation is compact and greatly simplifies the storage and transmission of the image representation. It also reduces the computational complexity of search and filtering algorithms without adversely affecting their quality. We present such a compact binary descriptor for color representation. The descriptor consists of the quantized Haar transform coefficients of the color histograms. We show the use of this descriptor for fast retrieval of similar images and for searching for similar video segments in a large database. We also show the use of this descriptor for browsing large image databases without the need for computationally expensive clustering algorithms. The compact nature of the descriptor and the associated simple similarity measure allow searching a database of about four hours of video in under 5 to 6 seconds without any sophisticated indexing scheme.
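A rough sketch of a binary descriptor built from Haar coefficients of a histogram is shown below. The abstract does not specify the quantization or the similarity measure, so two assumptions are made here: each difference coefficient is quantized to a single sign bit, and descriptors are compared by Hamming distance. The function names are hypothetical.

```python
from typing import List


def haar_transform(hist: List[float]) -> List[float]:
    """Unnormalized 1D Haar transform; the length must be a power of two."""
    n = len(hist)
    if n == 0 or n & (n - 1):
        raise ValueError("histogram length must be a power of two")
    coeffs = list(hist)
    length = n
    while length > 1:
        half = length // 2
        # Pairwise sums (coarser histogram) and differences (detail).
        sums = [coeffs[2 * i] + coeffs[2 * i + 1] for i in range(half)]
        diffs = [coeffs[2 * i] - coeffs[2 * i + 1] for i in range(half)]
        coeffs[:length] = sums + diffs
        length = half
    return coeffs


def binary_descriptor(hist: List[float]) -> List[int]:
    """One bit per difference coefficient: 1 if positive, else 0.

    Sign quantization is an assumption made for this sketch. The first
    coefficient (the total pixel count) is dropped.
    """
    return [1 if c > 0 else 0 for c in haar_transform(hist)[1:]]


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Number of differing bits; assumed here as the similarity measure."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Two toy 4-bin color histograms with opposite color balance.
warm = binary_descriptor([4, 2, 1, 1])
cool = binary_descriptor([1, 1, 2, 4])
```

A query then reduces to computing the Hamming distance between the query descriptor and each stored descriptor, which is cheap enough for a linear scan and is consistent with the abstract's point that no sophisticated index is needed.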
ISBN: 0780362977 (print)
We present an approach to clustering images for efficient retrieval using relative entropy. We start with the assumption that visual features are represented by probability densities and develop clustering algorithms for probability densities (for example, normalized histograms are crude approximations of probability densities). These clustering algorithms are then used for efficient retrieval of images and video.
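As an illustration of clustering with relative entropy, the sketch below treats normalized histograms as densities and assigns each one to the cluster centroid with the smallest Kullback-Leibler divergence. The abstract gives no algorithmic details, so this assignment step, the additive smoothing constant `eps` and the function names are all assumptions rather than the authors' method.

```python
import math
from typing import List, Sequence


def normalize(hist: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Turn a histogram into a density; eps avoids zero probabilities."""
    smoothed = [h + eps for h in hist]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def relative_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence D(p || q) for strictly positive inputs."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def assign_to_clusters(
    densities: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> List[int]:
    """Label each density with the index of its nearest centroid."""
    return [
        min(range(len(centroids)),
            key=lambda k: relative_entropy(p, centroids[k]))
        for p in densities
    ]


# Two hypothetical cluster centroids and two query histograms.
centroids = [normalize([8, 1, 1]), normalize([1, 1, 8])]
queries = [normalize([7, 2, 1]), normalize([0, 1, 9])]
labels = assign_to_clusters(queries, centroids)
```

At retrieval time, a query would only need to be compared against the members of its nearest cluster instead of the whole collection, which is where the efficiency gain of clustering comes from.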
A multimedia database is a controlled collection of multimedia data items such as text, images, graphic objects, video and audio. A multimedia database management system (DBMS) provides support for the creation, storage, access, querying and control of a multimedia database. The requirements of a multimedia DBMS are: multimedia data modeling; multimedia object storage; multimedia indexing, retrieval and browsing; and multimedia query support. This paper discusses a general framework for multimedia database systems and describes the requirements and architecture for these systems.