Existing shot change detection algorithms detect abrupt changes fairly well. Gradual changes, including fades, dissolves, and wipes, are more challenging because they are often missed or falsely detected. In this paper, we focus on the detection of wipes. The proposed algorithm begins by processing the visual rhythm, a portion of the DC image sequence. The visual rhythm is a single image, a sub-sampled version of a full video in which the sampling is performed in a predetermined and systematic fashion. It contains distinctive patterns, or visual features, for many different types of video effects, and each effect manifests itself differently. In particular, wipes appear as curves that run from the top to the bottom of the visual rhythm. Using the visual rhythm, it therefore becomes possible to detect wipes automatically by finding lines and curves on it.
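As a rough illustration of how a wipe leaves a trace on the visual rhythm, the sketch below samples each frame along its main diagonal and stacks the samples as one column per frame. The diagonal sampling path, the function names, and the synthetic left-to-right wipe are all assumptions made for this example; the abstract only states that the sampling is predetermined and systematic.

```python
def visual_rhythm(frames):
    """Build a visual rhythm: one column per frame, sampled along the diagonal.

    frames: list of 2-D lists (rows of pixel values), all the same size.
    Diagonal sampling is an assumption of this sketch.
    """
    height, width = len(frames[0]), len(frames[0][0])
    n = min(height, width)
    # rhythm[z][t] holds the z-th diagonal sample of frame t
    rhythm = [[0] * len(frames) for _ in range(n)]
    for t, frame in enumerate(frames):
        for z in range(n):
            y = z * (height - 1) // max(n - 1, 1)
            x = z * (width - 1) // max(n - 1, 1)
            rhythm[z][t] = frame[y][x]
    return rhythm


def make_wipe_frames(size=8):
    """Synthetic wipe (hypothetical test data): at frame t, the first t
    columns already show the new shot (255), the rest the old shot (0)."""
    return [
        [[255 if x < t else 0 for x in range(size)] for _ in range(size)]
        for t in range(size + 1)
    ]


def wipe_boundary(rhythm, new_value=255):
    """For each row of the rhythm, the frame index where the new shot first appears."""
    return [row.index(new_value) for row in rhythm]


if __name__ == "__main__":
    rhythm = visual_rhythm(make_wipe_frames())
    for row in rhythm:
        print("".join("#" if v else "." for v in row))
```

In the printed rhythm the boundary between the two shots forms a slanted line running from the top row to the bottom row, which is the kind of pattern a line or curve detector would look for.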
We developed a content-based retrieval scheme for texture that uses text-based descriptions. The texture technique builds on our previous work, which uses very simple texture primitives such as edges and plain regions to generate features. Other methods that apply complicated statistics can be difficult to transcribe into forms that ordinary users understand. Because our features are simple, we can express them in simple language and thereby bridge the gap between semantics and computed features. This brings a number of benefits and opens a new horizon for content-based retrieval with texture. For example, the user can request a texture image without necessarily knowing what types of textures are stored. In this paper we describe the method of translating such features, and the partial weighted Euclidean distance matching that allows users to describe only the parts they are interested in and so gradually refine their texture descriptions.
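One plausible reading of partial weighted Euclidean distance matching is that the distance is summed only over the features the user actually described, so unspecified features cannot penalize a candidate. The sketch below follows that reading; the feature names and weights are hypothetical and are not taken from the paper.

```python
import math


def partial_weighted_distance(query, candidate, weights):
    """Weighted Euclidean distance over the features present in `query` only.

    query:     dict of feature name -> value the user described
    candidate: dict of feature name -> stored value (may hold more features)
    weights:   dict of feature name -> weight (defaults to 1.0 if missing)
    """
    total = 0.0
    for name, q_value in query.items():
        diff = q_value - candidate[name]
        total += weights.get(name, 1.0) * diff * diff
    return math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical features: the user only said "lots of edges".
    stored = {"edge_density": 0.5, "plain_region_ratio": 0.9}
    print(partial_weighted_distance({"edge_density": 0.8}, stored, {"edge_density": 4.0}))
```

Refining a description then amounts to adding another key to `query`, which brings one more term into the sum.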
A key aspect of image retrieval using color is the creation of robust and efficient indices. The color histogram remains the most popular index, due primarily to its simplicity, but it has a number of drawbacks. Specifically, histograms capture only global activity, require quantization to reduce dimensionality, are highly dependent on the chosen color space, have no means to exclude a certain color from a query, and can provide erroneous results due to gamma nonlinearity. In this paper we present a vector angular distance measure, which is implemented as part of our database system. Our system does away with histogram techniques for color indexing and retrieval and instead implements color vector techniques. We use color segmentation to extract regions of prominent color and use representative vectors from these extracted regions in the image indices. The result is a much smaller index that does not have the granularity of a histogram. Similarity is instead based on our vector angular distance measure between a query color vector and the indexed representative vectors.
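The abstract does not give the exact form of the vector angular distance measure. As a minimal sketch, the code below uses the plain angle between two RGB vectors, which ignores intensity and compares only the direction of the color, and ranks images by the smallest angle between the query and any of their representative vectors. The index layout and the sample colors are assumptions for illustration.

```python
import math


def angular_distance(u, v):
    """Angle in radians between two color vectors (plain angle, an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("angle is undefined for a zero (black) vector")
    # Clamp to guard against rounding slightly outside [-1, 1]
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def rank_images(query, index):
    """Order image names by the best angular match among their representative vectors.

    index: dict of image name -> list of representative color vectors.
    """
    scored = [
        (min(angular_distance(query, vec) for vec in vectors), name)
        for name, vectors in index.items()
    ]
    return [name for _, name in sorted(scored)]


if __name__ == "__main__":
    # Hypothetical index with a few representative RGB vectors per image.
    index = {
        "sunset": [(250, 120, 30), (60, 20, 10)],
        "forest": [(30, 160, 40)],
        "sea": [(20, 60, 200)],
    }
    print(rank_images((200, 90, 20), index))
```

Because only a handful of vectors are stored per image, the index stays small compared with a full histogram.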
This article describes the use of gesture recognition techniques from computer vision as a natural interface for video content navigation, and the design of a navigation and browsing system that caters to these natural means of computer-human interaction. For consumer applications, video content navigation presents two challenges: (1) how to parse and summarize multiple video streams in an intuitive and efficient manner, and (2) what type of interface will make video browsing and navigation easier to use in a living room setting or an interactive environment. In this paper, we address these issues and propose techniques that combine video content navigation with gestures, seamlessly and intuitively, in an integrated system. The current framework can also incorporate speech recognition technology. We present a new type of browser for browsing and navigating video content, as well as a gesture recognition interface for this browser.
In this paper we propose a method for tracking a video object in an ordered sequence of two-dimensional images, where the outcome is the trajectory of the video object throughout the time sequence of images. The method is designed to run in real time in a synchronous video collaboration environment and is used to produce dynamic object annotations for enhanced video content understanding. A dynamic object is an object whose location or size in the video frame changes constantly due to camera motion, its own motion, or both. We suggest a novel method for finding the trajectory of the object in the intermediate frames, given the locations and shapes of the object in the two end frames. In addition to the shape and location of the object, its texture in the end frames is used to predict its location and search space in the intermediate frames.
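The abstract does not describe how the prediction is computed. As a simple baseline that is not the authors' method, the sketch below linearly interpolates the object's bounding box between the two end frames and pads it by a margin to get a search window for an intermediate frame. The `(x, y, width, height)` box format, the function names, and the margin are assumptions of this example; a texture-based matcher would then refine the position inside that window.

```python
def interpolate_box(start_box, end_box, start_frame, end_frame, frame):
    """Linearly interpolate an (x, y, width, height) box at an intermediate frame."""
    if not start_frame <= frame <= end_frame or start_frame == end_frame:
        raise ValueError("frame must lie between two distinct end frames")
    alpha = (frame - start_frame) / (end_frame - start_frame)
    return tuple(s + alpha * (e - s) for s, e in zip(start_box, end_box))


def search_window(box, margin):
    """Pad a predicted box on every side to obtain the region to search."""
    x, y, w, h = box
    return (x - margin, y - margin, w + 2 * margin, h + 2 * margin)


if __name__ == "__main__":
    predicted = interpolate_box((0, 0, 10, 10), (100, 50, 20, 20), 0, 10, 5)
    print(predicted, search_window(predicted, 8))
```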
In this paper we propose a novel system for semantic feature extraction and retrieval for interior design and decoration applications. The system, V2ID (Virtual Interior Design), uses colored texture and spatial edge layout to obtain simple information about the global room environment. We address the domain-specific segmentation problem in our application and present techniques for obtaining semantic features from a room environment. We also discuss heuristics for using these features (color, texture, edge layout, and shape) to retrieve objects from an existing database. The final resynthesized room environment, which combines the original scene with objects from the database, is created for animation and virtual walk-through.
The Fractal Transform (FT) was originally introduced as a methodology for compressing digital images and representing them at different scales. The process of calculating an FT generates a great deal of information about the affine similarities and dissimilarities of an image, most of which is discarded in compression applications. In this paper we introduce the concept of Fractal Transform Analysis and use it to derive new image descriptors. We present results of experiments in which description schemes comprised of some of these FT-based descriptors are applied to the problems of finding objects in an image similar to a given object, of indexing images, and of querying an image database consisting of about 17,000 images. Complexity and timing data are also presented.
In this paper, we propose a new image feature extraction method for MPEG compressed video. To minimize MPEG decoding, we use only the DC values of the Y, Cr, and Cb components of each macroblock. We then obtain a feature vector from the decoded DC values of the Y, Cr, and Cb components for all macroblocks in an I frame. The feature vector consists of histograms for various colors, luminance, and edge types. In obtaining the color and luminance histograms, we consider the ratio in which pure colors and luminance contribute to the chroma DC values of each macroblock, and update all contributing color and/or luminance histograms accordingly. Otherwise, if the macroblock is classified as an edge block, we update the corresponding edge type histogram. To demonstrate the performance of the proposed feature extraction method, we apply it to a scene change detection problem.
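To make the scene change application concrete, here is a heavily simplified sketch: it builds a normalized histogram from the luminance DC value of each macroblock in an I frame and flags a scene change when the L1 distance between consecutive histograms exceeds a threshold. It omits the color and edge type histograms of the proposed method, and the bin count, threshold, and sample values are assumptions chosen for this example.

```python
def dc_histogram(dc_values, bins=8, max_value=256):
    """Normalized histogram of per-macroblock luminance DC values (0..max_value-1)."""
    if not dc_values:
        raise ValueError("an I frame needs at least one macroblock")
    hist = [0] * bins
    for value in dc_values:
        hist[min(value * bins // max_value, bins - 1)] += 1
    return [count / len(dc_values) for count in hist]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (ranges from 0 to 2)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_scene_changes(frames_dc, threshold=0.5):
    """Indices of I frames whose histogram differs strongly from the previous one."""
    hists = [dc_histogram(frame) for frame in frames_dc]
    return [
        i for i in range(1, len(hists))
        if histogram_distance(hists[i - 1], hists[i]) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical DC values: two dark I frames followed by two bright ones.
    frames = [[20] * 16, [22] * 16, [200] * 16, [205] * 16]
    print(detect_scene_changes(frames))
```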
Image search has been actively studied in recent years. Image browsing, on the other hand, has received little attention. Image browsing refers to the process of presenting some form of overview or summary of the image relationships, thus helping a user navigate across the data set and find images of interest. In this paper, we present a new data structure built on the multi-linearization of image attributes for efficient organization of the data set and fast visual browsing of the images. We describe new techniques for multi-linearization based on multiple space-filling curves and hierarchical clustering. In addition to providing fast navigation, our proposed data structure allows computationally efficient insertion and deletion of images from the data set. We then present a novel image navigator and browser built on a dual-linearization data structure and an intuitive presentation of image relevance and relationships, demonstrate the image navigation process, and report results on 1000- and 22,000-image databases. We also discuss how our data structure can be extended to support fast image search.
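The abstract does not name the space-filling curves it uses. As one illustrative possibility, the sketch below linearizes two quantized image attributes with a Z-order (Morton) curve and keeps the images in a list sorted by curve position, so that insertion and deletion reduce to a binary search. The choice of curve, the two-attribute layout, and the function names are assumptions of this example.

```python
import bisect


def morton_key(x, y, bits=8):
    """Interleave the bits of two non-negative integers into a Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def insert_image(index, name, x, y):
    """Insert an image into the sorted linearization (list of (key, name))."""
    bisect.insort(index, (morton_key(x, y), name))


def delete_image(index, name, x, y):
    """Remove an image from the linearization; raises ValueError if absent."""
    entry = (morton_key(x, y), name)
    pos = bisect.bisect_left(index, entry)
    if pos == len(index) or index[pos] != entry:
        raise ValueError(f"{name} is not in the index")
    del index[pos]


if __name__ == "__main__":
    index = []
    # Hypothetical quantized attributes, e.g. (hue bucket, texture bucket).
    for name, x, y in [("a", 0, 0), ("b", 3, 3), ("c", 1, 0)]:
        insert_image(index, name, x, y)
    print([name for _, name in index])
```

Neighbors in the sorted list tend to be close in attribute space, which is what makes a one-dimensional browsing order useful; using more than one curve, as the multi-linearization idea suggests, would compensate for the places where a single curve separates nearby points.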
We have developed a wide-area distributed storage system for multimedia databases that minimizes the possibility of simultaneous failure of multiple disks in the event of a major disaster. It features a RAID system whose member disks are spatially distributed over a wide area. Each node has a device that includes the controller of the RAID and the controller of the member disks controlled by other nodes. The devices in a node are connected to a computer by fiber optic cables and communicate using Fibre Channel technology. Any computer at a node can use multiple devices connected by optical fibers as a single "virtual disk". The advantage of this system structure is that the devices and fiber optic cables are shared by the computers. In this report, we first describe our proposed system and a prototype we used for testing. We then discuss its performance, that is, how the read and write throughputs are affected by data-access delay, the RAID level, and queuing.
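To illustrate why spreading member disks across sites helps, the sketch below shows the XOR parity idea used by parity-based RAID levels: if the blocks of a stripe live at different sites and one site is lost, its block can be rebuilt from the surviving blocks and the parity. The abstract does not say which RAID levels the prototype uses, so the parity scheme and the sample data here are assumptions for illustration.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    if not blocks or any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("need at least one block, all of equal size")
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block of a stripe from the rest plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    # Hypothetical stripe: one data block per site, parity stored at a fourth site.
    stripe = [b"site", b"node", b"disk"]
    parity = xor_blocks(stripe)
    print(rebuild_missing([stripe[0], stripe[2]], parity))
```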