We propose a new machine learning paradigm called Graph Transformer Networks that extends the applicability of gradient-based learning algorithms to systems composed of modules that take graphs as inputs and produce graphs as output. Training is performed by computing gradients of a global objective function with respect to all the parameters in the system using a kind of back-propagation procedure. A complete check reading system based on these concepts is described. The system uses convolutional neural network character recognizers, combined with global training techniques, to provide record accuracy on business and personal checks. It is presently deployed commercially and reads millions of checks per month.
A method for deformable shape detection and recognition is described. Deformable shape templates are used to partition the image into a globally consistent interpretation, determined in part by the minimum description length principle. Statistical shape models enforce the prior probabilities on global, parametric deformations for each object class. Once trained, the system autonomously segments deformed shapes from the background, while not merging them with adjacent objects or shadows. The formulation can be used to group image regions based on any image homogeneity predicate; e.g., texture, color or motion. The recovered shape models can be used directly in object recognition. Experiments with color imagery are reported.
Photometric methods in computer vision require calibration of the camera's radiometric response, and previous works have addressed this problem using multiple registered images captured under different camera exposure settings. In many instances, such an image set is not available, so we propose a method that performs radiometric calibration from only a single image, based on measured RGB distributions at color edges. This technique automatically selects appropriate edge information for processing and employs a Bayesian approach to compute the calibration. Extensive experimentation has shown that accurate calibration results can be obtained using only a single input image.
A novel depth-from-focus technique is introduced that needs only a single image. It is based on a precise knowledge of the 3-D point spread function and requires objects of uniform brightness and simple shapes. Using adequate low-level image processing techniques, the true area of the object and its distance from the focal plane are obtained from parameters such as the apparent (blurred) area of the object and the mean brightness in this area. The technique has been applied to measure the size distribution of bubbles submerged by breaking waves. A depth criterion is used to define a virtual measuring volume that is roughly proportional to the size of the bubbles.
A methodology for optical flow analysis based on cepstral filtering is introduced. The power cepstrum is extended to multiframe analysis. A correlative cepstral technique, cepsCorr, is developed. It significantly increases the signal-to-noise ratio, reduces ambiguities, and provides a predictive or multievidence approach to visual motion analysis.
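As a rough illustration of why a power cepstrum can reveal displacement, the sketch below works on a 1-D signal rather than image frames. It assumes the textbook definition (squared magnitude of the inverse transform of the log power spectrum) and a hypothetical test signal made of noise plus a circularly shifted copy. It is not the cepsCorr technique or the multiframe extension from the abstract, and all function names are invented for the example.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for a short demo signal."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def power_cepstrum(signal, eps=1e-12):
    """Squared magnitude of the inverse DFT of the log power spectrum."""
    log_power = [math.log(abs(c) ** 2 + eps) for c in dft(signal)]
    return [abs(c) ** 2 for c in dft(log_power, inverse=True)]


def estimate_shift(signal):
    """Return the nonzero lag (quefrency) with the largest cepstral value."""
    ceps = power_cepstrum(signal)
    half = len(signal) // 2
    return max(range(1, half), key=lambda q: ceps[q])


# Hypothetical test signal: noise plus an attenuated, circularly shifted copy.
rng = random.Random(0)
n, true_shift = 256, 17
base = [rng.gauss(0.0, 1.0) for _ in range(n)]
combined = [base[t] + 0.8 * base[(t - true_shift) % n] for t in range(n)]
print(estimate_shift(combined))
```

Adding a shifted copy multiplies the spectrum by a periodic ripple; the logarithm turns that product into a sum, so the ripple appears as an isolated peak at the shift.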
One of the main tasks in content-based image retrieval (CBIR) is to reduce the gap between low-level visual features and high-level human concepts. This paper presents a new semi-supervised EM algorithm (NSS-EM), where the image distribution in feature space is modeled as a mixture of Gaussian densities. Due to the statistical mechanism of accumulating and processing meta knowledge, the NSS-EM algorithm with long-term learning of mixture model parameters can deal with the cases where users may mislabel images during relevance feedback. Our approach, which integrates a mixture model of the data, relevance feedback, and long-term learning, helps to improve retrieval performance. The concept learning is incrementally refined with increased retrieval experiences. Experimental results on the Corel database show the efficacy of our proposed concept learning approach.
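For readers unfamiliar with semi-supervised EM on a Gaussian mixture, a minimal 1-D sketch follows. It assumes one common variant in which labeled points keep fixed component responsibilities while unlabeled points get soft assignments. This is not the NSS-EM algorithm itself: the handling of mislabeled feedback and the long-term learning are not modeled, and the function names and toy data are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def semi_supervised_em(points, labels, init_means, iterations=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture.

    points: list of floats; labels: dict mapping point index -> component index
    for the (few) labeled points; init_means: starting mean per component.
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: labeled points are clamped, unlabeled points get posteriors.
        resp = []
        for i, x in enumerate(points):
            if i in labels:
                row = [1.0 if j == labels[i] else 0.0 for j in range(k)]
            else:
                dens = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                        for j in range(k)]
                total = sum(dens)
                row = [d / total for d in dens]
            resp.append(row)
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, points)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, points)) / nj
            variances[j] = max(min_var, spread)  # floor avoids a collapsed component
            weights[j] = nj / len(points)
    return means, variances, weights


# Toy data: two well-separated groups, one labeled point in each.
data = [-2.2, -2.0, -1.8, -2.1, -1.9, 1.8, 2.0, 2.2, 1.9, 2.1]
means, variances, weights = semi_supervised_em(data, {0: 0, 5: 1}, [-1.0, 1.0])
print(means)
```

Clamping the labeled points pins each component to a user-supplied concept, so a wrong label would bias the fit directly, which is the failure mode the abstract says NSS-EM is designed to tolerate.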
We present in this paper a novel calibration method that uses cross ratio to compute world points falling onto any given light stripe plane of a structured light system. We show that, by using 4 known non-coplanar sets of 3 collinear world points, the direct 4×3 image-to-world transformation matrix for each light stripe plane can also be recovered from plane-to-plane homography. Preliminary experiments conducted with a calibration target and a mannequin suggest that this novel calibration method is robust and is applicable to many shape measurement tasks.
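The cross ratio of four collinear points is preserved by perspective projection, which is what lets three known collinear world points fix the world position of a fourth from image measurements alone. The sketch below shows that step in 1-D coordinates along a line. The function names and the projective map `project` are hypothetical stand-ins for a real camera, and the sketch does not cover the 4×3 matrix recovery described above.

```python
def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four collinear points given as 1-D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def solve_fourth(a, b, c, ratio):
    """Return the coordinate d for which cross_ratio(a, b, c, d) equals ratio."""
    k = ratio * (c - b)
    m = c - a
    return (k * a - m * b) / (k - m)


def project(x):
    """Hypothetical 1-D projective map standing in for the camera."""
    return (2.0 * x + 1.0) / (0.1 * x + 1.0)


# Three known world points on a line and one unknown point (3.5) on the same line.
world_known = (0.0, 1.0, 2.0)
image_known = tuple(project(x) for x in world_known)
image_unknown = project(3.5)

# The cross ratio measured in the image equals the cross ratio in the world,
# so the unknown world coordinate follows from the three known ones.
ratio = cross_ratio(*image_known, image_unknown)
recovered = solve_fourth(*world_known, ratio)
print(recovered)
```

No camera parameters enter the computation; only the three reference positions and four image measurements are needed.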
Stereo reconstruction algorithms often fail to properly deal with complex surfaces, because there is not enough image information. To overcome this problem, we propose to guide the reconstruction process using a priori information about the differential geometry of the object surfaces. We use both linear structures such as crest lines and scalar fields such as curvature values to generate a reconstruction of the surface which is consistent with the differential properties. This method improves the accuracy of the reconstruction around the discontinuities and increases the compactness of the surface representation.
Edge detection is analyzed as a problem in cost minimization. A cost function is formulated that evaluates the quality of edge configurations. A mathematical description of edges is given, and the cost function is analyzed in terms of the characteristics of the edges in minimum-cost configurations. The cost function is minimized by the simulated annealing method. A novel set of strategies for generating candidate states and a suitable temperature schedule are presented. Sequential and parallel versions of the annealing algorithm are implemented and compared. Experimental results are presented.
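The general shape of edge detection by simulated annealing can be sketched on a 1-D signal. Everything specific here is an assumption made for illustration: the toy cost (a fixed penalty per marked edge, otherwise the unexplained intensity jump), the single-site flip used to generate candidates, and the geometric cooling schedule. The abstract's actual cost function, candidate strategies, and schedule are not given in this text.

```python
import math
import random


def edge_cost(signal, edges, edge_penalty=1.0):
    """Toy cost: a fixed penalty per marked edge, or the intensity jump if unmarked."""
    cost = 0.0
    for i, is_edge in enumerate(edges):
        cost += edge_penalty if is_edge else abs(signal[i + 1] - signal[i])
    return cost


def anneal_edges(signal, steps=3000, t_start=2.0, cooling=0.995, seed=0):
    """Minimize edge_cost over binary edge maps; returns (best_edges, best_cost)."""
    rng = random.Random(seed)
    edges = [0] * (len(signal) - 1)          # edges[i] sits between samples i and i+1
    cost = edge_cost(signal, edges)
    best, best_cost = list(edges), cost
    temperature = t_start
    for _ in range(steps):
        i = rng.randrange(len(edges))
        edges[i] ^= 1                        # candidate state: flip one edge site
        new_cost = edge_cost(signal, edges)
        delta = new_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse states.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(edges), cost
        else:
            edges[i] ^= 1                    # rejected: undo the flip
        temperature *= cooling               # geometric cooling (assumed schedule)
    return best, best_cost


signal = [0, 0, 0, 5, 5, 5, 1, 1]
best_edges, best_cost = anneal_edges(signal)
print(best_edges, best_cost)
```

With this toy cost, marking an edge pays off only where the intensity jump exceeds the penalty, so the minimum-cost configuration marks the two large steps in the example signal.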
A novel scene reconstruction technique is presented, different from previous approaches in its ability to cope with large changes in visibility and its modeling of intrinsic scene color and texture information. The method avoids image correspondence problems by working in a discretized scene space whose voxels are traversed in a fixed visibility ordering. This strategy takes full account of occlusions and allows the input cameras to be far apart and widely distributed about the environment. The algorithm identifies a special set of invariant voxels which together form a spatial and photometric reconstruction of the scene, fully consistent with the input images. The approach is evaluated with images from both inward- and outward-facing cameras.