ISBN: (digital) 9781728193601
ISBN: (print) 9781728193618
Visual pattern recognition over agricultural areas is an important application of aerial image processing. In this paper, we consider the multi-modality nature of agricultural aerial images and show that naively combining different modalities without taking the feature divergence into account can lead to sub-optimal results. Thus, we apply a Switchable Normalization block to our DeepLabV3+ segmentation model to alleviate the feature divergence. Using the popular symmetric Kullback-Leibler divergence measure, we show that our model can greatly reduce the divergence between RGB and near-infrared channels. Together with a hybrid loss function, our model achieves nearly 10% improvement in mean IoU over the previously published baseline.
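As a rough illustration of the divergence measure mentioned in the abstract above, the following sketch computes a symmetric Kullback-Leibler divergence, KL(p||q) + KL(q||p), between two histograms. The abstract does not say how the feature distributions are estimated, so the histogram binning, the `eps` smoothing, the helper names, and the synthetic "RGB" and "NIR" samples are all assumptions made for this example, not the authors' procedure.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) for two discrete distributions.

    `eps` is an assumed smoothing constant that avoids log(0) on empty bins.
    """
    p = np.asarray(p, dtype=np.float64) + eps
    q = np.asarray(q, dtype=np.float64) + eps
    p = p / p.sum()  # normalise to probability vectors
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def feature_histogram(values, bins=32, value_range=(0.0, 1.0)):
    """Hypothetical helper: histogram counts of feature activations."""
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    return counts

# Synthetic stand-ins for per-modality feature activations (not real data).
rng = np.random.default_rng(0)
rgb_feat = rng.normal(0.4, 0.1, 10_000).clip(0.0, 1.0)
nir_feat = rng.normal(0.6, 0.1, 10_000).clip(0.0, 1.0)

d_diff = symmetric_kl(feature_histogram(rgb_feat), feature_histogram(nir_feat))
d_same = symmetric_kl(feature_histogram(rgb_feat), feature_histogram(rgb_feat))
print(f"RGB vs NIR: {d_diff:.3f}, RGB vs RGB: {d_same:.3f}")
```

A lower value after normalization would indicate that the two modalities produce more similar feature statistics, which is the effect the abstract attributes to the Switchable Normalization block.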
With the development of digital forestry, image processing and pattern recognition technology have been extensively used in forestry research. It is currently a hot research that automatic plant recogniti...
The computational cost of conventional filter methods for junction characterization is very high. This burden can be attenuated by using steerable filters. However, to achieve the high orientational selectivity needed to characterize complex junctions, a large number of basis filters is necessary. As a result, the computational effort for steerable filters is still too high. In this paper we present a new method for characterizing junctions that keeps the high orientational resolution and is computationally efficient. It is based on applying rotated copies of a wedge averaging filter and estimating the derivative with respect to the polar angle. The new method is compared with the steerable wedge filter method in experiments with real images. We show the superiority of our method as well as its adaptability to scale changes and robustness against noise.
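The idea of rotating a wedge averaging filter and differentiating with respect to the polar angle can be sketched as follows. The abstract does not specify the wedge shape, the angular sampling, or the derivative estimator, so the hard-edged wedge mask, the 5 degree orientation step, the 20 degree wedge width, and the circular central difference used here are assumptions for illustration only, and the function names are hypothetical.

```python
import numpy as np

def wedge_responses(image, center, radius=10, n_orient=72,
                    wedge_width=np.deg2rad(20)):
    """Mean intensity inside a wedge rotated to n_orient polar angles.

    Assumes the (2*radius+1)-pixel patch around `center` lies inside the image.
    """
    cy, cx = center
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    r = np.hypot(ys, xs)
    phi = np.arctan2(ys, xs)  # polar angle of each pixel around the centre
    patch = image[cy - radius:cy + radius + 1,
                  cx - radius:cx + radius + 1].astype(np.float64)
    in_disk = (r > 0) & (r <= radius)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_orient, endpoint=False)
    responses = np.empty(n_orient)
    for k, theta in enumerate(thetas):
        # Wrapped angular distance between each pixel and the wedge axis.
        diff = np.angle(np.exp(1j * (phi - theta)))
        mask = in_disk & (np.abs(diff) <= wedge_width / 2)
        responses[k] = patch[mask].mean()
    return thetas, responses

def angular_derivative(responses):
    """Circular central difference of the responses w.r.t. the polar angle."""
    step = 2.0 * np.pi / len(responses)
    return (np.roll(responses, -1) - np.roll(responses, 1)) / (2.0 * step)

# Synthetic corner: a bright quadrant whose two edges meet at the centre pixel.
img = np.zeros((41, 41))
img[20:, 20:] = 1.0
thetas, resp = wedge_responses(img, center=(20, 20))
deriv = angular_derivative(resp)
print(np.rad2deg(thetas[np.argmax(deriv)]), np.rad2deg(thetas[np.argmin(deriv)]))
```

On this synthetic corner the extrema of the angular derivative fall near the two edge directions of the bright quadrant, which is the kind of signature a junction characterization would look for.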
ISBN: (print) 0769519008
This paper proposes an algorithm to clean up a large collection of historical handwritten documents kept in the National Archives of Singapore. Due to the seepage of ink over a long period of storage, the front page of each document has been severely marred by the reverse-side writing. Earlier attempts have been made to match both sides of the page to identify the offending strokes originating from the back so as to eliminate them with the aid of a wavelet transform. Perfect matching, however, is difficult due to document skew, differing resolutions, reverse sides inadvertently missed during image capture, and warped pages. An approach is now proposed that does away with double-sided mapping by using a directional wavelet transform that distinguishes the foreground and reverse-side strokes much better than the conventional wavelet transform. Experiments have shown that the directional wavelet operation significantly enhances the readability of each document without the need for mapping with its reverse side.
This paper presents a novel data-adaptive anisotropic filtering technique built on top of an iterative scheme. This new technique can preserve the original significant structures while suppressing noises to the larges...
This paper presents a hierarchical framework to perform deformable matching of three-dimensional (3D) images. 3D shape deformations are parameterized at different scales, using a decomposition of the continuous deformation vector field over a sequence of nested subspaces, generated from a single scaling function. This parameterization of the field makes it possible to enforce smoothness and differentiability constraints without performing explicit regularization. A global energy function, depending on the reference image and the transformed one, is minimized via a coarse-to-fine algorithm over this multiscale decomposition. Contrary to standard multigrid approaches, no reduction of image data is applied. The continuous field of deformation is always sampled at the same resolution, ensuring that the same energy function is handled at each scale and that the energy decreases at each step of the minimization.
We previously presented (Sarkar and Boyer, 1993) the Perceptual Inference Network (PIN), a formalism based on Bayesian Networks, to reason among a set of object or feature hypotheses and to integrate multiple sources of information in the context of perceptual organization. The design of a PIN requires knowledge of the dependency structure among the organizations of interest and the specification of the conditional probabilities. This design was done manually with large doses of tedium and guesswork. In this paper we present an algorithm based on structural entropic measures and random parametric structural descriptions (RPSDs) to design a PIN automatically and in a (more) theoretically sound fashion. Experimental results present evidence of the robustness of the algorithm and make performance comparisons on real image data with a manually structured PIN. Since PINs are a form of Bayesian Network, we hope that this work will also prove useful towards structuring Bayesian Networks in other computer vision contexts.
We present an integrated approach to the derivation of scene description from binocular stereo images. By inferring the scene description directly from local measurements of both point and line correspondences, we address both the stereo correspondence problem and the surface reconstruction problem simultaneously. We introduce a robust computational technique called tensor voting for the inference of scene description in terms of surfaces, junctions, and region boundaries. The methodology is grounded in two elements: tensor calculus for representation, and non-linear voting for data communication. By efficiently and effectively collecting and analyzing neighborhood information, we are able to handle the tasks of interpolation, discontinuity detection, and outlier identification simultaneously. The proposed method is non-iterative and robust to initialization and thresholding in the preprocessing stage, and the only critical free parameter is the size of the neighborhood. We illustrate the approach with results on a variety of images.
In its full generality, motion analysis of crowded objects necessitates recognition and segmentation of each moving entity. The difficulty of these tasks increases considerably with occlusions and therefore with crowding. When the objects are constrained to be of the same kind, however, partitioning of densely crowded semi-rigid objects can be accomplished by means of clustering tracked feature points. We base our approach on a highly parallelized version of the KLT tracker in order to process the video into a set of feature trajectories. While such a set of trajectories provides a substrate for motion analysis, their unequal lengths and fragmented nature present difficulties for subsequent processing. To address this, we propose a simple means of spatially and temporally conditioning the trajectories. Given this representation, we integrate it with a learned object descriptor to achieve a segmentation of the constituent motions. We present experimental results for the problem of estimating the number of moving objects in a dense crowd as a function of time.