Tracking uncertain mobile objects such as humans and vehicles is an important problem in computer vision, robotics, and geo-spatial visualization. As the name suggests, predictor-corrector tracking is performed in two...
In this paper, we propose a novel method for solving single-image super-resolution problems. Given a low-resolution image as input, we recover its high-resolution counterpart using a set of training examples. While this formulation resembles other learning-based methods for super-resolution, our method has been inspired by recent manifold learning methods, particularly locally linear embedding (LLE). Specifically, small image patches in the low- and high-resolution images form manifolds with similar local geometry in two distinct feature spaces. As in LLE, local geometry is characterized by how a feature vector corresponding to a patch can be reconstructed by its neighbors in the feature space. Besides using the training image pairs to estimate the high-resolution embedding, we also enforce local compatibility and smoothness constraints between patches in the target high-resolution image through overlapping. Experiments show that our method is very flexible and gives good empirical results.
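The neighbor-reconstruction idea borrowed from LLE can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `lle_weights` and `embed_patch`, the regularization constant `reg`, and the neighbor count `k` are all assumptions made for the example. The weights that best reconstruct a low-resolution feature vector from its nearest training neighbors are computed first, then reused on the paired high-resolution training patches.

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-6):
    """Weights w (summing to 1) minimizing ||x - sum_j w[j] * neighbors[j]||^2."""
    diffs = neighbors - x                      # (K, d) offsets from the query
    gram = diffs @ diffs.T                     # (K, K) local Gram matrix
    # A small ridge term (an assumption) keeps the solve stable when the
    # Gram matrix is singular, e.g. when K exceeds the feature dimension.
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()

def embed_patch(x_low, train_low, train_high, k=5):
    """Estimate a high-resolution feature vector for one low-resolution patch."""
    dists = np.linalg.norm(train_low - x_low, axis=1)
    idx = np.argsort(dists)[:k]                # k nearest low-res training patches
    w = lle_weights(x_low, train_low[idx])
    return w @ train_high[idx]                 # same weights, high-res counterparts
```

In a full pipeline this step would run per patch, with overlapping target patches averaged to enforce the compatibility and smoothness constraints the abstract mentions; that part is omitted here.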
Techniques that treat the face holistically as a vector of pixel values, which we refer to as a monolithic representation, are still widely considered state of the art for the task of face verification in literature. ...
In this paper, we propose a generative model for representing complex motion, such as wavy rivers, dancing fire, and dangling cloth. Our generative method consists of four components: (1) A photometric model using the primal sketch [8], which converts an image into an attribute graph representation. Each vertex of the graph is a scaled and oriented image patch selected from a dictionary. The graph connects and aligns these patches. (2) A geometric model, which characterizes the deformation of the attribute graph. (3) A dynamic model, which specifies the motion dynamics of these vertices (patches) and their interactions in the form of coupled Markov chains. (4) A topological model, which interprets the topological changes of the graph over time. We learn this generative model by a stochastic gradient algorithm implemented by Markov Chain Monte Carlo (MCMC) sampling. This method is shown to be effective in handling the topological changes of graphs. The correctness of the learned model is verified by the low-dimensional reconstruction of the original image as well as by the realistic motion sequences it synthesizes.
Multimedia retrieval is going to play an increasingly important role in the future. Since the availability of digital media seems ever rising, not only the retrieval performance in terms of what the user is presented ...
To obtain high dynamic range or hyperspectral images, multiple frames of the same field of view are acquired while the imaging settings are modulated: images are taken at different exposures or through different wavelength bands. A major problem associated with such modulations has been the need for perfect synchronization between image acquisition and modulation control. In the past, this problem has been addressed by using sophisticated servo-control mechanisms. In this work, we show that the process of modulation imaging can be made much simpler by using vision algorithms to automatically relate each acquired frame to its corresponding modulation level. This correspondence is determined solely from the acquired image sequence and does not require measurement or control of the modulation. The image acquisition and the modulation work continuously, in parallel, and independently. We refer to this approach as computational synchronization. It makes the imaging process simple and easy to implement. We have developed a prototype modulation imaging system that uses computational synchronization and used it to acquire high dynamic range and multispectral images.
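To make the idea of recovering a modulation level from the images alone concrete, here is a deliberately simple sketch for the exposure case. It is not the method of the paper: it assumes a static scene, a linear sensor response, and intensities normalized to [0, 1], and the function name `relative_exposure` and the thresholds `low` and `high` are hypothetical.

```python
import numpy as np

def relative_exposure(frame, reference, low=0.05, high=0.95):
    """Estimate the exposure of `frame` relative to `reference` from pixel ratios.

    Assumes a static scene, a linear sensor, and intensities in [0, 1].
    """
    # Keep only pixels that are neither under- nor over-exposed in both frames.
    valid = (frame > low) & (frame < high) & (reference > low) & (reference < high)
    if not valid.any():
        raise ValueError("no well-exposed pixels shared by both frames")
    # The median ratio is robust to a moderate number of outlier pixels.
    return float(np.median(frame[valid] / reference[valid]))
```

Applied to every frame of a sequence against a fixed reference, this would yield one estimated exposure level per frame without any signal from the modulation hardware.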
Many vision tasks can be formulated as partitioning an adjacency graph through optimizing a Bayesian posterior probability p defined on the partition space. In this paper, two approaches are proposed to generalize the Swendsen-Wang cut algorithm for sampling p. The first method, called multigrid SW-cut, runs SW-cut within a sequence of local "attentional" windows and thus simulates conditional probabilities of p in the partition space. The second method, called multi-level SW-cut, projects the adjacency graph into a hierarchical representation, with each vertex in the high-level graph corresponding to a subgraph at the low level, and runs SW-cut at each level. Thus it simulates conditional probabilities of p at the higher level. Both methods are shown to observe the detailed balance equation and thus provide flexibility in sampling the posterior probability p. We demonstrate the algorithms in image and motion segmentation with three levels (see Fig. 1), and compare the speed improvements of the proposed methods.
Graphical models are powerful tools for processing images. However, the large dimensionality of even local image data poses a difficulty: representing the range of possible graphical model node variables with discrete states leads to an overwhelmingly large number of states for the model, often making both exact and approximate inference computationally intractable. We propose a representation that allows a small number of discrete states to represent the large number of possible image values at each pixel or local image patch. Each node in the graph represents the best regression function, chosen from a set of candidate functions, for estimating the unobserved image pixels from the observed samples. This permits a small number of discrete states to summarize the range of possible image values at each point in the image. Belief propagation is then used to find the best regressor to use at each point. To demonstrate the usefulness of this technique, we apply it to two problems: super-resolution and color demosaicing. In both cases, we find our method compares well against other techniques for these problems.
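The step of choosing one candidate regressor per node can be illustrated on a 1-D chain, where min-sum belief propagation is exact. This is only a sketch under assumptions: image graphs are 2-D grids on which belief propagation is typically run in its approximate, loopy form, and the cost arrays here (`unary`, plus a Potts penalty `switch_cost` for neighboring nodes that pick different regressors) are hypothetical stand-ins for whatever compatibility terms a real system would use.

```python
import numpy as np

def min_sum_chain(unary, switch_cost):
    """Pick one candidate regressor index per node on a chain.

    unary[n, s]  : cost of using candidate regressor s at node n (hypothetical).
    switch_cost  : penalty when adjacent nodes choose different regressors.
    Returns the label sequence with minimum total cost.
    """
    n_nodes, n_states = unary.shape
    pairwise = switch_cost * (1.0 - np.eye(n_states))
    cost = unary[0].astype(float)
    back = np.zeros((n_nodes, n_states), dtype=int)
    for n in range(1, n_nodes):
        # total[i, j]: best cost so far ending in state i, then moving to state j
        total = cost[:, None] + pairwise
        back[n] = np.argmin(total, axis=0)
        cost = total.min(axis=0) + unary[n]
    # Trace the best path backwards from the cheapest final state.
    labels = np.empty(n_nodes, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for n in range(n_nodes - 1, 0, -1):
        labels[n - 1] = back[n, labels[n]]
    return labels
```

The point of the representation is visible in the array shapes: each node carries only `n_states` values, one per candidate function, rather than one value per possible pixel intensity or patch.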
The visual effects of rain are complex. Rain consists of spatially distributed drops falling at high velocities. Each drop refracts and reflects the environment, producing sharp intensity changes in an image. A group of such falling drops creates a complex time-varying signal in images and videos. In addition, due to the finite exposure time of the camera, intensities due to rain are motion blurred and hence depend on the background intensities. Thus, the visual manifestations of rain are a combination of both the dynamics of rain and the photometry of the environment. In this paper, we present the first comprehensive analysis of the visual effects of rain on an imaging system. We develop a correlation model that captures the dynamics of rain and a physics-based motion blur model that explains the photometry of rain. Based on these models, we develop efficient algorithms for detecting and removing rain from videos. The effectiveness of our algorithms is demonstrated using experiments on videos of complex scenes with moving objects and time-varying textures. The techniques described in this paper can be used in a wide range of applications including video surveillance, vision-based navigation, video/movie editing, and video indexing/retrieval.
This paper presents a detailed description of an advanced real-time correlation-based stereo algorithm running completely on the graphics processing unit (GPU). This is important since it allows to free up the main pr...