This paper presents a new model to overcome the occlusion problems coming from wide baseline multiple camera stereo. Rather than explicitly modeling occlusions in the matching cost function, it detects occlusions in the depth map obtained from regular efficient stereo matching algorithms. Occlusions are detected as inconsistencies of the depth map by computing the visibility of the map as it is reprojected into each camera. Our approach has the particularity of not discriminating between occluders and occludees. The matching cost function is modified according to the detected occlusions by removing the offending cameras from the computation of the matching cost. The algorithm gradually modifies the matching cost function according to the history of inconsistencies in the depth map, until convergence. While two graph-theoretic stereo algorithms are used in our experiments, our framework is general enough to be applied to many others. The validity of our framework is demonstrated using real imagery with different baselines.
A complete scheme for totally unconstrained handwritten word recognition based on a single contextual hidden Markov model (HMM) is proposed. The scheme includes a morphology- and heuristics-based segmentation algorithm and a modified Viterbi algorithm that searches the (l+1)st globally best path based on the previous l best paths. Detailed experiments are reported, with an overall recognition rate of up to 89.4%.
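To make the idea of searching beyond the single best path concrete, here is a minimal sketch of a k-best Viterbi decoder for a toy HMM. It is not the paper's modified algorithm: the abstract describes a serial search that derives the (l+1)st path from the previous l, whereas this sketch uses the simpler parallel variant that keeps the k best partial paths per state at every step. The function name and the dictionary-based model layout are assumptions made for illustration.

```python
import heapq


def k_best_paths(obs, states, start_p, trans_p, emit_p, k):
    """Return the k most probable state paths as (probability, path) pairs.

    Parallel list-Viterbi sketch (an assumed stand-in, not the paper's
    serial search): each state keeps its k best partial paths per step.
    """
    # beams[s] holds up to k (probability, path) pairs for paths ending in s.
    beams = {s: [(start_p[s] * emit_p[s][obs[0]], [s])] for s in states}
    for o in obs[1:]:
        new_beams = {}
        for s in states:
            # Extend every surviving partial path by a transition into s.
            candidates = [
                (p * trans_p[prev][s] * emit_p[s][o], path + [s])
                for prev in states
                for p, path in beams[prev]
            ]
            new_beams[s] = heapq.nlargest(k, candidates, key=lambda c: c[0])
        beams = new_beams
    finals = [c for s in states for c in beams[s]]
    return heapq.nlargest(k, finals, key=lambda c: c[0])
```

With k = 1 this reduces to ordinary Viterbi decoding. In a word recognizer, the extra candidates would typically be checked against a lexicon.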
In this paper, we address the problem of discovering the 3D shape of a book surface from the shading information in a scanned document image. This shape-from-shading problem is characterized in real-world environments by a proximal and moving light source, Lambertian reflection and a non-uniform albedo distribution. By considering all these factors, we first build a practical model (consisting of a geometric model and an optical model) to reconstruct the 3D shape of the book surface. We next restore the scanned image using this shape based on two models, namely de-shading and dewarping models. Finally, we compare the OCR results on the original and restored document images. The experiments show that the geometric and photometric distortions are mostly removed and the OCR results are improved markedly.
An efficient Nonparametric Belief Propagation (NBP) algorithm is developed in this paper. While the recently proposed nonparametric belief propagation algorithm has wide applications such as articulated tracking [22, 19], superresolution [6], stereo vision and sensor calibration [10], the core of the algorithm requires repeatedly sampling from products of mixtures of Gaussians, which makes the algorithm computationally very expensive. To avoid the slow sampling process, we apply mixture-of-Gaussians density approximation by mode propagation and kernel fitting [2, 7]. The products of mixtures of Gaussians are approximated accurately by just a few mode propagation and kernel fitting steps, while a sampling method (e.g. a Gibbs sampler) needs many samples to achieve similar approximation results. The proposed algorithm is then applied to articulated body tracking for several scenarios. The experimental results show the robustness and the efficiency of the proposed algorithm. The proposed efficient NBP algorithm also has potential in the other applications mentioned above.
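The cost of the product step can be illustrated with a small sketch. It computes the exact product of two one-dimensional Gaussian mixtures in closed form and shows that the number of components multiplies. This explosion is what pushes NBP toward sampling or, as in this abstract, toward approximation. The sketch does not implement mode propagation or kernel fitting. The function names and the (weight, mean, variance) tuple layout are assumptions made for illustration.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def mixture_pdf(mix, x):
    """Density of a mixture given as a list of (weight, mean, var) tuples."""
    return sum(w * gauss_pdf(x, m, v) for w, m, v in mix)


def product_of_mixtures(mix_a, mix_b):
    """Exact normalized product of two 1D Gaussian mixtures.

    Each pair of components yields one Gaussian, so the result has
    len(mix_a) * len(mix_b) components.
    """
    out = []
    for wa, ma, va in mix_a:
        for wb, mb, vb in mix_b:
            var = 1.0 / (1.0 / va + 1.0 / vb)   # precisions add
            mean = var * (ma / va + mb / vb)    # precision-weighted mean
            scale = gauss_pdf(ma, mb, va + vb)  # overlap of the two components
            out.append((wa * wb * scale, mean, var))
    total = sum(w for w, _, _ in out)
    return [(w / total, m, v) for w, m, v in out]
```

Multiplying d incoming messages of N components each gives N^d components in the exact product, so even modest mixtures become impractical to represent exactly.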
Varrier is a head-tracked, 35-panel tiled autostereoscopic display system produced by the Electronic Visualization Laboratory (EVL) at the University of Illinois at Chicago (UIC). Varrier produces autostereoscopic imagery through a combination of a physical parallax barrier and a virtual barrier, so that the stereoscopic images are directed correctly into the viewer’s eyes. Since a small amount of rotation and translation between the physical and virtual barriers can cause large-scale effects, registration is critical for correct stereo viewing. The process is automated by examining image frames from two video cameras separated by the interocular distance as a simulation of human eyes. Three registration parameters for each panel are calibrated in the process. An arbitrary start condition is allowed and a robust stopping criterion is used to end the process and report results. Instead of an exhaustive three-dimensional search, an efficient two-phase calibration method is introduced. The combination of a heuristic rough calibration and an adaptive fine calibration guarantees a fast search process with the best solution.
The problem of low-rank matrix factorization in the presence of missing data has seen significant attention in recent computer vision research. The approach that dominates the literature is EM-like alternation of closed-form solutions for the two factors of the matrix. An obvious alternative is nonlinear optimization of both factors simultaneously, a strategy which has seen little published research. This paper provides a comprehensive comparison of the two strategies by evaluating previously published factorization algorithms as well as some second-order methods not previously presented for this problem. We conclude that, although alternation approaches can be very quick, their propensity to glacial convergence in narrow valleys of the cost function means that average-case performance is worse than second-order strategies. Further, we demonstrate the importance of two main observations: one, that schemes based on closed-form solutions alone are not suitable and that nonlinear optimization strategies are faster, more accurate and provide more flexible frameworks for continued progress; and two, that basic objective functions are not adequate and that regularization priors must be incorporated, a process that is easier with nonlinear methods.
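The alternation strategy can be sketched for the simplest case, a rank-1 factorization M ≈ a bᵀ in which missing entries are marked with `None`. Each half-step fixes one factor and solves for the other in closed form over the observed entries only. The function name, the `None` convention, the all-ones initialization and the fixed iteration count are assumptions made for illustration. The sketch has none of the regularization priors the abstract argues for, and it assumes every row and column has at least one observed entry.

```python
def alternate_rank1(M, iters=200):
    """Rank-1 alternation on a matrix with missing entries (None).

    Returns factors (a, b) such that M[i][j] is approximated by a[i] * b[j].
    """
    rows, cols = len(M), len(M[0])
    a = [1.0] * rows
    b = [1.0] * cols
    for _ in range(iters):
        # Fix b, solve each a[i] by least squares over observed columns.
        for i in range(rows):
            obs = [j for j in range(cols) if M[i][j] is not None]
            a[i] = sum(M[i][j] * b[j] for j in obs) / sum(b[j] ** 2 for j in obs)
        # Fix a, solve each b[j] by least squares over observed rows.
        for j in range(cols):
            obs = [i for i in range(rows) if M[i][j] is not None]
            b[j] = sum(M[i][j] * a[i] for i in obs) / sum(a[i] ** 2 for i in obs)
    return a, b
```

Each update is cheap, which is why alternation is popular. The number of iterations needed can grow sharply on harder problems, which is the slow convergence the abstract refers to.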
We integrate the cascade-of-rejectors approach with the Histograms of Oriented Gradients (HoG) features to achieve a fast and accurate human detection system. The features used in our system are HoGs of variable-size blocks that capture salient features of humans automatically. Using AdaBoost for feature selection, we identify the appropriate set of blocks, from a large set of possible blocks. In our system, we use the integral image representation and a rejection cascade which significantly speed up the computation. For a 320 × 280 image, the system can process 5 to 30 frames per second depending on the density in which we scan the image, while maintaining an accuracy level similar to existing methods.
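The integral image representation mentioned above can be sketched as follows. After one pass over the image, the sum over any rectangular block costs four lookups regardless of block size, which is what makes variable-size blocks affordable. The sketch works on a single channel of values. Applying it to HoG by keeping one such table per orientation bin is an assumption here, not something the abstract spells out. The function names are likewise illustrative.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column.

    ii[y][x] holds the sum of img over rows < y and columns < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def block_sum(ii, top, left, height, width):
    """Sum of the block with the given top-left corner and size, in O(1)."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

For example, with `ii = integral_image([[1, 2, 3], [4, 5, 6], [7, 8, 9]])`, the call `block_sum(ii, 1, 1, 2, 2)` adds 5, 6, 8 and 9.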
We describe how certain tasks in the audio domain can be effectively addressed using computer vision approaches. This paper focuses on the problem of music identification, where the goal is to reliably identify a song given a few seconds of noisy audio. Our approach treats the spectrogram of each music clip as a 2D image and transforms music identification into a corrupted sub-image retrieval problem. By employing pairwise boosting on a large set of Viola-Jones features, our system learns compact, discriminative, local descriptors that are amenable to efficient indexing. During the query phase, we retrieve the set of song snippets that locally match the noisy sample and employ geometric verification in conjunction with an EM-based "occlusion" model to identify the song that is most consistent with the observed signal. We have implemented our algorithm in a practical system that can quickly and accurately recognize music from short audio samples in the presence of distortions such as poor recording quality and significant ambient noise. Our experiments demonstrate that this approach significantly outperforms the current state-of-the-art in content-based music identification.
Recently, "epitomes" were introduced as patch-based probability models that are learned by compiling together a large number of examples of patches from input images. In this paper, we describe how epitomes can be used to model video data and we describe significant computational speedups that can be incorporated into the epitome inference and learning algorithm. In the case of videos, epitomes are estimated so as to model most of the small space-time cubes from the input data. Then, the epitome can be used for various modeling and reconstruction tasks, of which we show results for video super-resolution, video interpolation, and object removal. Besides computational efficiency, an interesting advantage of the epitome as a representation is that it can be reliably estimated even from videos with large amounts of missing data. We illustrate this ability on the task of reconstructing the dropped frames in video broadcast using only the degraded video.
Lambert’s model for diffuse reflection is a main assumption in most of the shape-from-shading (SFS) literature. Even with this simplified model, SFS is still a difficult problem. Moreover, Lambert’s model has been proven to be an inaccurate approximation of the diffuse component of the surface reflectance. In this paper, we propose a new solution of the SFS problem based on a more comprehensive diffuse reflectance model: the Oren and Nayar model. In this work, we slightly modify this more realistic model in order to take into account the attenuation of the illumination due to distance. Using the modified non-Lambertian reflectance, we design a new explicit Partial Differential Equation (PDE) and then solve it using the Lax-Friedrichs sweeping method. Our experiments on synthetic data show that the proposed modeling gives a unique solution without any information about the height at the singular points of the surface. Additional results for real data are presented to show the efficiency of the proposed method. To the best of our knowledge, this is the first non-Lambertian SFS formulation that eliminates the concave/convex ambiguity, a well-known problem in SFS.
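For reference, the widely used qualitative form of the Oren and Nayar reflectance model can be sketched as below. This is the standard model, not the modified version proposed in the paper: the abstract does not give the form of the distance attenuation term, so it is omitted. The function name, parameter names and default values are assumptions made for illustration. Angles are in radians and `sigma` is the surface roughness.

```python
import math


def oren_nayar_radiance(theta_i, theta_r, phi_diff, sigma,
                        albedo=1.0, irradiance=1.0):
    """Reflected radiance under the qualitative Oren-Nayar model.

    theta_i, theta_r: incidence and viewing angles from the surface normal.
    phi_diff: difference between the viewing and incidence azimuths.
    sigma: roughness; sigma = 0 recovers Lambert's model.
    """
    s2 = sigma ** 2
    coef_a = 1.0 - 0.5 * s2 / (s2 + 0.33)
    coef_b = 0.45 * s2 / (s2 + 0.09)
    alpha = max(theta_i, theta_r)
    beta = min(theta_i, theta_r)
    roughness_term = (coef_a + coef_b * max(0.0, math.cos(phi_diff))
                      * math.sin(alpha) * math.tan(beta))
    return (albedo / math.pi) * irradiance * math.cos(theta_i) * roughness_term
```

Setting `sigma` to zero makes the roughness term equal to one, leaving the Lambertian expression, which gives a quick sanity check for any implementation.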