Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be used to analyze high dimensional data that lies on or near a low dimensional manifold. It overcomes certain limitations of previous work in manifold learning, such as Isomap and locally linear embedding. We illustrate the algorithm on easily visualized examples of curves and surfaces, as well as on actual images of faces, handwritten digits, and solid objects.
We present a novel approach that uses boundary interpolation to correct (1) geometric distortion and (2) shading artifacts present in images of printed materials. Unlike existing approaches, our algorithm can simultaneously correct a variety of geometric distortions, including skew, fold distortion, binder curl, and combinations of these. In addition, the same interpolation framework can estimate the intrinsic illumination component of the distorted image to correct shading artifacts.
The emerging cognitive vision paradigm is concerned with vision systems that evaluate, gather and integrate contextual knowledge for visual analysis. In reasoning about events and structures, cognitive vision systems should rely on multiple computations in order to perform robustly even in noisy domains. Action recognition in an unconstrained office environment thus provides an excellent testbed for research on cognitive computer vision. In this contribution, we present a system that consists of several computational modules for object and action recognition. It applies attention mechanisms, visual learning, and contextual as well as probabilistic reasoning to fuse individual results and verify their consistency. Database technologies are used for information storage, and an XML-based communication framework integrates all modules into a consistent architecture.
To facilitate activity recognition, analysis of the scene at multiple levels of detail is necessary. Prerequisites for our activity recognition are tracking objects across frames and establishing a consistent labeling of objects across cameras. This paper makes several innovative uses of the epipolar constraint in the context of activity recognition. We first demonstrate how we track heads and hands using the epipolar geometry. Next, we show how the detected objects are labeled consistently across cameras and zooms by employing epipolar, spatial, trajectory, and appearance properties. Finally, we show how our method, utilizing the multiple levels of detail, is able to answer activity recognition problems that are difficult to answer with a single level of detail.
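The epipolar constraint mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's tracking or labeling method: it only checks the standard relation x2^T F x1 = 0 for a candidate pair of points seen in two cameras. The matrix `F_SIDEWAYS` and the helper names are assumptions made for illustration (two identical cameras displaced along the x-axis, with points given in normalized homogeneous coordinates).

```python
def epipolar_residual(F, x1, x2):
    """Return x2^T F x1 for homogeneous image points x1 and x2 (3-vectors)."""
    Fx1 = [sum(F[r][c] * x1[c] for c in range(3)) for r in range(3)]
    return sum(x2[r] * Fx1[r] for r in range(3))


# Hypothetical fundamental matrix: identical cameras offset purely along x,
# so corresponding points must lie on the same image row.
F_SIDEWAYS = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]


def is_consistent(F, x1, x2, tol=1e-6):
    """True if the pair (x1, x2) satisfies the epipolar constraint within tol."""
    return abs(epipolar_residual(F, x1, x2)) <= tol
```

A pair such as `[10, 5, 1]` and `[3, 5, 1]` (same row) passes the check, while `[10, 5, 1]` and `[3, 7, 1]` does not, which is the kind of test that can prune candidate matches between cameras.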
The problem of deciding whether two pixels in an image have the same real world color is a fundamental problem in computer vision. Many color spaces are used in different applications for discriminating color from intensity to create an informative representation of color. The major drawback of all of these representations is that they assume no color distortion. In practice, the colors of real world images are distorted both in the scene itself and in the image capturing process. In this work we introduce color lines, an image-specific color representation that is robust to color distortion and provides a compact and useful representation of the colors in a scene.
This paper develops an analytic model for self shadowing and local illumination of rough surfaces. The surface is assumed homogeneous, isotropic, and smooth microscopically, with a Gaussian height field. The total reflection is decomposed into single and multiple scattering. A shadowing factor is derived as a function of surface roughness, and color variation due to multiple scattering is described. This model is compared to previous models and the effects of surface roughness are studied via rendered images.
Radial symmetry is an important perceptual cue for the feature-based representation, fixation, and description of large-scale data sets. A new approach based on iterative voting along the gradient direction is introduced for inferring the center of mass for objects demonstrating radial symmetries that are not limited to convex geometries. The kernel topography is unique in that it votes for the most likely set of grid points where the center of mass may be located. Initially, it is applied in the direction of the gradient and then reoriented iteratively in the most probable direction. This technique can detect perceptual symmetries, has excellent noise immunity, and is shown to be tolerant to moderate perturbation in scale. Applications of this approach to blobs with incomplete and noisy boundaries, multimedia scenes, and scientific images are demonstrated.
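The idea of voting along the gradient direction can be sketched as follows. This is a deliberately simplified, single-pass version and not the paper's algorithm: it has no shaped kernel and no iterative reorientation, and the function names and the radius range `r_min`..`r_max` are assumptions introduced for illustration. Each pixel casts votes at grid points along its gradient direction, and the accumulator peak is taken as the estimated center.

```python
import math


def vote_along_gradient(image, r_min, r_max):
    """Accumulate one vote per radius along each pixel's gradient direction."""
    h, w = len(image), len(image[0])
    acc = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradient; border pixels are skipped.
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue  # flat region: no direction to vote along
            ux, uy = gx / mag, gy / mag
            for r in range(r_min, r_max + 1):
                vx = int(round(x + r * ux))
                vy = int(round(y + r * uy))
                if 0 <= vx < w and 0 <= vy < h:
                    acc[vy][vx] += 1
    return acc


def peak(acc):
    """Return (x, y) of the accumulator cell with the most votes."""
    _, x, y = max((v, x, y) for y, row in enumerate(acc) for x, v in enumerate(row))
    return x, y
```

On a smooth, radially symmetric bright blob the gradients point toward the blob center, so the votes pile up there. Real images with noisy or incomplete boundaries are the case the abstract's iterative refinement is meant to handle, which this sketch does not attempt.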
This paper describes two distortion estimation techniques for object recognition that solve EZ-Gimpy and Gimpy-r, two of the visual CAPTCHAs ("Completely Automated Public Turing test to tell Computers and Humans Apart"), with high degrees of success. A CAPTCHA is a program that generates and grades tests that most humans can pass but current computer programs cannot. We have developed a correlation algorithm that correctly identifies the word in an EZ-Gimpy challenge image 99% of the time and a direct distortion estimation algorithm that correctly identifies the four letters in a Gimpy-r challenge image 78% of the time.
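As a rough illustration of correlation-based word identification, the sketch below scores a challenge image against one template per dictionary word using normalized cross-correlation and returns the best-scoring word. It is an assumption-laden toy, not the paper's correlation algorithm: images are flattened lists of pixel intensities of equal length, no distortion is estimated, and `normalized_correlation` and `best_match` are hypothetical helper names.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    num = sum(p * q for p, q in zip(da, db))
    den = math.sqrt(sum(p * p for p in da) * sum(q * q for q in db))
    return num / den if den else 0.0  # constant images carry no signal


def best_match(image, templates):
    """Return the word whose template correlates best with the image."""
    return max(templates, key=lambda word: normalized_correlation(image, templates[word]))
```

Because the score is normalized, a uniformly brighter or lower-contrast copy of a template still correlates strongly with it; geometric distortion is a separate problem that this sketch leaves untouched.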
Shape-from-shading (SfS) is a fundamental problem in computer vision. At its basis lies the image irradiance equation. Recently, the authors proposed to base the image irradiance equation on the assumption of perspective projection rather than the common orthographic one. The current paper presents a greatly improved reconstruction method based on the perspective formulation. The proposed model is solved efficiently via a modification of the fast marching method of Kimmel and Sethian. We compare the two versions of the fast marching method (orthographic vs. perspective) on medical images. The perspective algorithm outperformed the orthographic one. This shows that the more realistic hypothesis of perspective projection improves reconstruction significantly. The comparison also demonstrates the usability of perspective SfS for real-life applications such as medical endoscopy.
The goal of this work is to detect a human figure in an image and localize its joints and limbs along with their associated pixel masks. We attempt to tackle this problem in a general setting. The dataset we use is a collection of sports news photographs of baseball players, varying dramatically in pose and clothing. Our approach is to use segmentation to guide the recognition algorithm to salient parts of the image. We use this segmentation approach to build limb and torso detectors, the outputs of which are assembled into human figures. We present quantitative results on torso localization, in addition to shortlisted full body configurations.