We present a method for motion estimation using ordinal measures. Ordinal measures are based on the relative ordering of intensity values in an image region, called the rank permutation. While popular measures like the sum of squared differences (SSD) and normalized correlation (NCC) rely on linearity between corresponding intensity values, ordinal measures only require them to be monotonically related, so that rank permutations between corresponding regions are preserved. This property turns out to be useful for motion estimation in tagged magnetic resonance images. We study the imaging equation involved in two methods of tagging and observe temporal monotonicity in intensity under certain conditions, even though the tags themselves fade. We compare our method to SSD and NCC on a rotating ring phantom image sequence. We also present an experiment on a real heart image sequence, which suggests the suitability of our method.
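The invariance claimed above can be illustrated with a minimal sketch. It is not the measure used in the paper: `rank_permutation`, `rank_distance`, and the sample window are hypothetical names and data chosen for illustration, and the rank distance is a plain sum of absolute rank differences that assumes no tied intensities.

```python
def rank_permutation(window):
    """Return the rank (0 = darkest) of each pixel in a flattened window.

    Assumes all intensities are distinct; ties are broken by position.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def ssd(a, b):
    """Sum of squared differences between two windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_distance(a, b):
    """Illustrative ordinal distance: sum of absolute rank differences."""
    ra, rb = rank_permutation(a), rank_permutation(b)
    return sum(abs(x - y) for x, y in zip(ra, rb))


# A flattened 3x3 window and a copy passed through a monotonic but
# nonlinear intensity change (a stand-in for fading, not a tagging model).
window = [12, 40, 7, 90, 55, 23, 61, 33, 5]
faded = [round(v ** 0.5 * 10) for v in window]

print(ssd(window, faded))            # nonzero: SSD sees a mismatch
print(rank_distance(window, faded))  # zero: the rank permutation is preserved
```

Because the intensity change is monotonic, the ordering of pixels is unchanged and the ordinal distance stays at zero, while SSD reports a difference for the same pair of windows.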
We describe a new multi-phase, color-based image retrieval system, FOCUS (Fast Object Color-based qUery System), with an online user interface, which is capable of identifying multi-colored query objects in an image in the presence of significant, interfering backgrounds. The query object may occur in arbitrary sizes, orientations and locations in the database images. The color features used to describe an image have been developed based on the need for speed in matching and ease of computation on complex images, while maintaining scale and rotation invariance. The first phase matches the color content of an image, computed as the peaks in the color histogram of the image, with the query object colors using an efficient indexing mechanism. The second phase matches the spatial relationships between color regions in the image with the query, using a spatial proximity graph (SPG) structure designed for the purpose. The method is fast and has low storage overhead. Test results with multi-colored query objects from artificial and natural domains show that FOCUS is quite effective in handling interfering backgrounds and large variations in scale. The experimental results on a database of diverse images highlight the capabilities of the system.
Digital video is rapidly becoming important for education, entertainment, and a host of multimedia applications. With the size of video collections growing to thousands of hours, technology is needed to effectively browse segments in a short time without losing the content of the video. We propose a method to extract the significant audio and video information and create a "skim" video which represents a very short synopsis of the original. The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extraction of significant information, such as specific objects, audio keywords and relevant video structure. The resulting skim video is much shorter, with compaction as high as 20:1, and yet retains the essential content of the original segment.
For many Fortran90 and HPF programs performing dense matrix computations, the main computational portion of the program belongs to a class of kernels known as stencils. Stencil computations are commonly used in solvin...
This paper proposes a novel approach to extract meaningful content information from video by collaborative integration of image understanding and natural language processing. As an actual example, we developed a system that associates faces and names in videos, called Name-It, which is given news videos as a knowledge source, then automatically extracts face and name association as content information. The system can infer the name of a given unknown face image, or guess faces which are likely to have the name given to the system. This paper explains the method with several successful matching results which reveal effectiveness in integrating heterogeneous techniques as well as the importance of real content information extraction from video, especially face-name association.
Snakes, or active contours, are used extensively in computer vision and image processing applications, particularly to locate object boundaries. Problems associated with initialization and poor convergence to concave boundaries, however, have limited their utility. This paper develops a new external force for active contours, largely solving both problems. This external force, which we call gradient vector flow (GVF), is computed as a diffusion of the gradient vectors of a gray-level or binary edge map derived from the image. The resultant field has a large capture range and forces active contours into concave regions. Examples on simulated images and one real image are presented.
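The abstract does not state the formulation, so the following is only a sketch of the form in which gradient vector flow is commonly written. The symbols are assumptions introduced here: f(x, y) is the edge map, (u, v) are the components of the field, and μ is a regularization weight.

```latex
% GVF field \mathbf{v}(x,y) = (u(x,y), v(x,y)) as the minimizer of
E = \iint \mu \left( u_x^2 + u_y^2 + v_x^2 + v_y^2 \right)
      + |\nabla f|^2 \, |\mathbf{v} - \nabla f|^2 \; dx \, dy

% Associated Euler equations (the "diffusion" of the gradient vectors):
\mu \nabla^2 u - (u - f_x)\left( f_x^2 + f_y^2 \right) = 0
\mu \nabla^2 v - (v - f_y)\left( f_x^2 + f_y^2 \right) = 0
```

Where the edge-map gradient is large, the second term keeps the field close to ∇f. Where it is small, the smoothness term dominates and spreads the nearby gradient vectors into homogeneous regions, which is consistent with the large capture range described above.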
A general geometrical framework for image processing is presented. We consider intensity images as surfaces in the (x, I) space. The image is thereby a two-dimensional surface in three-dimensional space for gray-level images. The new formulation unifies many classical schemes, algorithms, and measures via choices of parameters in a "master" geometrical measure. More importantly, it is a simple and efficient tool for the design of natural schemes for image enhancement, segmentation, and scale space. Here we give the basic motivation and apply the scheme to enhance images. We present the concept of an image as a surface in dimensions higher than the intuitive three-dimensional space. This will help us handle movies, color, and volumetric medical images.
We propose a new machine learning paradigm called Graph Transformer Networks that extends the applicability of gradient-based learning algorithms to systems composed of modules that take graphs as inputs and produce graphs as output. Training is performed by computing gradients of a global objective function with respect to all the parameters in the system using a kind of back-propagation procedure. A complete check reading system based on these concepts is described. The system uses convolutional neural network character recognizers, combined with global training techniques, to provide record accuracy on business and personal checks. It is presently deployed commercially and reads millions of checks per month.
We present in this paper a novel calibration method that uses the cross ratio to compute world points falling onto any given light stripe plane of a structured light system. We show that, by using 4 known non-coplanar sets of 3 collinear world points, the direct 4×3 image-to-world transformation matrix for each light stripe plane can also be recovered from a plane-to-plane homography. Preliminary experiments conducted with a calibration target and a mannequin suggest that this novel calibration method is robust and is applicable to many shape measurement tasks.
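The role of the cross ratio can be sketched in one dimension, under stated assumptions. The cross ratio of four collinear points is preserved by a projective map, so three known world points plus the image positions of all four determine the fourth world point. The function names, the sample coordinates, and the 1D projective map below are hypothetical; this is not the paper's calibration procedure.

```python
from fractions import Fraction as F


def cross_ratio(a, b, c, d):
    """Cross ratio of four collinear points given by 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def project(x):
    """Hypothetical 1D projective map standing in for the camera."""
    return (2 * x + 1) / (x + 3)


def recover_fourth(a, b, c, k):
    """Solve cross_ratio(a, b, c, d) == k for the unknown d."""
    return (k * (c - b) * a - (c - a) * b) / (k * (c - b) - (c - a))


# Three known collinear world points and one unknown point on the same line.
a, b, c, d_true = F(0), F(1), F(2), F(5)

# Image positions of all four points along the projected line.
ia, ib, ic, id_ = (project(p) for p in (a, b, c, d_true))

k = cross_ratio(ia, ib, ic, id_)   # measured in the image
d_est = recover_fourth(a, b, c, k) # transferred back to the world line
print(k, d_est)
```

Exact rational arithmetic is used only to make the invariance visible without rounding; real image measurements would be noisy floating-point values.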
Stereo reconstruction algorithms often fail to properly deal with complex surfaces, because there is not enough image information. To overcome this problem, we propose to guide the reconstruction process using a priori information about the differential geometry of the object surfaces. We use linear structures such as crest lines, or scalar fields such as curvature values, to generate a reconstruction of the surface which is consistent with the differential properties. This method improves the accuracy of the reconstruction around the discontinuities and increases the compactness of the surface representation.