ISBN: 9781457703935 (print)
Reliable estimation of visual saliency allows appropriate processing of images without prior knowledge of their contents, and thus remains an important step in many computer vision tasks, including image segmentation, object recognition, and adaptive compression. We propose a regional-contrast-based saliency extraction algorithm that simultaneously evaluates global contrast differences and spatial coherence. The proposed algorithm is simple, efficient, and yields full-resolution saliency maps. Our algorithm consistently outperformed existing saliency detection methods, yielding higher precision and better recall rates, when evaluated using one of the largest publicly available data sets. We also demonstrate how the extracted saliency map can be used to create high-quality segmentation masks for subsequent image processing.
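The idea of scoring pixels by global contrast can be illustrated with a minimal sketch. This is not the authors' algorithm: it works on single grayscale intensities rather than regions, ignores spatial coherence entirely, and the function name `global_contrast_saliency` is hypothetical. It only shows how a histogram lets each distinct intensity be compared against the whole image at once.

```python
from collections import Counter


def global_contrast_saliency(gray):
    """Score each pixel by how much its intensity differs from all others.

    `gray` is a list of rows of integer intensities. The contrast of an
    intensity v is the sum over all pixels of |v - u|, computed once per
    distinct intensity using a histogram. Scores are scaled to [0, 1].
    This is a simplified, pixel-level illustration only (an assumption,
    not the regional method described in the abstract).
    """
    hist = Counter(v for row in gray for v in row)
    contrast = {v: sum(n * abs(v - u) for u, n in hist.items()) for v in hist}
    peak = max(contrast.values()) or 1  # flat image: avoid dividing by zero
    return [[contrast[v] / peak for v in row] for row in gray]


if __name__ == "__main__":
    image = [
        [10, 10, 10],
        [10, 200, 10],
        [10, 10, 10],
    ]
    for row in global_contrast_saliency(image):
        print(row)
```

In this toy image the single bright pixel differs from every other pixel and so receives the highest score, while the uniform background receives a low one.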
Predicate-logic-based reasoning approaches provide a means of formally specifying domain knowledge and manipulating symbolic information to reason explicitly about different concepts of interest. Extending traditional binary predicate logics with the bilattice formalism permits the handling of uncertainty in reasoning, thereby facilitating their application to computer vision problems. In this paper, we propose using first-order predicate logics, extended with a bilattice-based uncertainty handling formalism, as a means of formally encoding pattern grammars to parse a set of image features and detect the presence of different patterns of interest. Detections from low-level feature detectors are treated as logical facts and, in conjunction with logical rules, used to drive the reasoning. Positive and negative information from different sources, as well as uncertainties from detections, are integrated within the bilattice framework. We show that this approach can also generate proofs or justifications (in the form of parse trees) for each hypothesis it proposes, thus permitting direct analysis of the final solution in linguistic form. Automated learning of logical rule weights is an important aspect of applying such systems in the computer vision domain. We propose a rule weight optimization method that casts the instantiated inference tree as a knowledge-based neural network, interprets rule uncertainties as link weights in the network, and applies a constrained back-propagation algorithm to converge upon a set of rule weights that give optimal performance within the bilattice framework. Finally, we evaluate the proposed predicate-logic-based pattern grammar formulation by applying it to the problems of (a) detecting the presence of humans under partial occlusions and (b) detecting large, complex man-made structures as viewed in satellite imagery. We also evaluate the optimization approach on real as well as simulated data and show favorable results.
ISBN: 9781457703935 (print)
A novel local image descriptor is proposed in this paper, which combines intensity orders and gradient distributions in multiple support regions. The novelty lies in three aspects: 1) the gradient is calculated in a rotation-invariant way in a given support region; 2) the rotation-invariant gradients are adaptively pooled spatially based on intensity orders in order to encode spatial information; 3) multiple support regions are used for constructing the descriptor, which further improves its discriminative ability. Therefore, the proposed descriptor encodes not only gradient information but also information about the relative relationship of intensities as well as spatial information. In addition, it is truly rotation invariant in theory without the need to compute a dominant orientation, which is a major error source of most existing methods, such as SIFT. Results on the standard Oxford dataset and 3D objects have shown a significant improvement over the state-of-the-art methods under various image transformations.
This paper suggests a method of selection of threshold value for segmenting objects with a priori known shapes. The problem is formulated in the form of an optimization task. This approach is implemented by the exampl...
Human motion can be seen as a type of texture pattern. In this paper, we adopt the ideas of spatiotemporal analysis and the use of local features for motion description. Two methods are proposed. The first one uses temporal templates to capture movement dynamics and then uses texture features to characterize the observed movements. We then extend this idea into a spatiotemporal space and describe human movements with dynamic texture features. Following recent trends in computer vision, the method is designed to work with image data rather than silhouettes. The proposed methods are computationally simple and suitable for various applications. We verify the performance of our methods on the popular Weizmann and KTH datasets, achieving high accuracy.
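One common form of temporal template is the motion history image, in which recently moving pixels hold high values that decay over time. The sketch below is an illustration under that assumption; the abstract does not say which template the authors use, and the function name, `threshold`, and `tau` parameters are hypothetical. The texture-feature step that follows in the abstract is not shown.

```python
def motion_history_image(frames, threshold=10, tau=None):
    """Build a motion history image from a list of grayscale frames.

    Each frame is a list of rows of integer intensities. A pixel whose
    intensity changes by more than `threshold` between consecutive frames
    is set to `tau`; otherwise its stored value decays by 1, never below 0.
    `tau` defaults to the number of frame transitions. This is a generic
    textbook-style sketch, not the method evaluated in the abstract.
    """
    if tau is None:
        tau = len(frames) - 1
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    mhi[y][x] = tau  # motion just happened here
                else:
                    mhi[y][x] = max(0, mhi[y][x] - 1)  # older motion fades
    return mhi


if __name__ == "__main__":
    # A bright dot moving left to right along a single row.
    frames = [[[255, 0, 0]], [[0, 255, 0]], [[0, 0, 255]]]
    print(motion_history_image(frames))
```

The resulting image encodes both where motion occurred and how recently, which is what makes it usable as an input for a texture descriptor.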
Defocus blur correction for projectors using a camera is useful when the projector is used in ad hoc environments. However, past literature has not explicitly considered the common situation when the projection surfac...
Recent research in the area of automatic machine recognition of human faces has shown that there may be an advantage in utilizing face symmetry to improve recognition accuracy. While promising, this work has led to se...
ISBN: 9781457703935 (print)
Many computer vision tasks can be formulated as labeling problems. The desired solution is often a spatially smooth labeling where label transitions are aligned with color edges of the input image. We show that such solutions can be efficiently achieved by smoothing the label costs with a very fast edge-preserving filter. In this paper we propose a generic and simple framework comprising three steps: (i) constructing a cost volume, (ii) fast cost volume filtering, and (iii) winner-take-all label selection. Our main contribution is to show that with such a simple framework, state-of-the-art results can be achieved for several computer vision applications. In particular, we achieve (i) disparity maps in real time, whose quality exceeds that of all other fast (local) approaches on the Middlebury stereo benchmark, and (ii) optical flow fields with very fine structures as well as large displacements. To demonstrate robustness, the few parameters of our framework are set to nearly identical values for both applications. Also, competitive results for interactive image segmentation are presented. With this work, we hope to inspire other researchers to apply this framework to other application areas.
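The three steps named in the abstract can be sketched on a toy labeling problem. This is a minimal illustration, not the authors' implementation: the cost function (absolute intensity difference to a label value) is an assumption, all function names are hypothetical, and a plain box filter stands in for the edge-preserving filter, so label boundaries here are not aligned with image edges the way the paper intends.

```python
def build_cost_volume(image, labels):
    """Step (i): one cost slice per label, here |pixel - label value|."""
    return [[[abs(px - lab) for px in row] for row in image] for lab in labels]


def box_filter(cost_slice, radius=1):
    """Step (ii): average costs over a square window clipped at the borders.

    Assumption: a box filter replaces the edge-preserving filter from the
    abstract to keep the sketch short.
    """
    h, w = len(cost_slice), len(cost_slice[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += cost_slice[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def winner_take_all(volume):
    """Step (iii): per pixel, pick the index of the lowest-cost label."""
    h, w = len(volume[0]), len(volume[0][0])
    return [
        [min(range(len(volume)), key=lambda lab: volume[lab][y][x]) for x in range(w)]
        for y in range(h)
    ]


def label_image(image, labels, radius=1):
    """Run the three steps in order and return a label index per pixel."""
    volume = build_cost_volume(image, labels)
    filtered = [box_filter(cost_slice, radius) for cost_slice in volume]
    return winner_take_all(filtered)


if __name__ == "__main__":
    # Dark left half, bright right half, one noisy pixel in the dark half.
    image = [
        [0, 0, 10, 10],
        [0, 9, 10, 10],
        [0, 0, 10, 10],
        [0, 0, 10, 10],
    ]
    for row in label_image(image, labels=[0, 10]):
        print(row)
```

With `radius=0` the filter does nothing and the noisy pixel takes the wrong label; with `radius=1` its cost is averaged with its neighbors and the labeling becomes spatially smooth, which is the effect the filtering step is meant to provide.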
We are developing an embedded vision system for the humanoid robot iCub, inspired by the biology of the mammalian visual system, including concepts such as stimulus-driven, asynchronous signal sensing and processing. ...
Occlusion is troublesome for almost all computer vision algorithms. To a certain extent, the difficulty is alleviated when multiple frames are given. On the other hand, when we consider the recovery of shapes of movin...