ISBN (digital): 9781538661000
ISBN (print): 9781538661000
Object recognition in satellite images is one of the most relevant and popular topics in pattern recognition. Several factors have contributed to this: the large number of satellites providing high-resolution imagery, significant progress in computer vision (especially the breakthrough in convolutional neural networks), a wide range of industry verticals for its use, and a market that is still largely empty. Roads are among the most popular objects for recognition. In this article, we present a combination of a neural network and a postprocessing algorithm that yields not only the coverage mask but also vectors for all of the individual roads present in the image, which can be used to address higher-level tasks in the future. This approach was used to solve the DeepGlobe Road Extraction Challenge.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
With the recent advances of Convolutional Neural Networks (CNNs) in computer vision, there has been rapid progress in extracting roads and other features from satellite imagery for mapping and other purposes. In this paper, we propose a new method for road extraction using stacked U-Nets with multiple outputs. A hybrid loss function is used to address the problem of unbalanced classes in the training data. Post-processing methods, including road map vectorization and shortest path search with hierarchical thresholds, help improve recall. The overall improvement in mean IoU compared to the vanilla VGG network is more than 20%.
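The mean IoU metric cited above can be illustrated with a minimal sketch. It assumes binary road masks stored as nested lists of 0/1 and a score averaged per image; the function names `iou` and `mean_iou` are hypothetical, and the abstract does not give the exact evaluation protocol.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds, truths):
    """Average the per-image IoU over a set of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, truths)]
    return sum(scores) / len(scores)
```

Because road pixels are a small fraction of each image, IoU penalizes both missed and spurious road pixels without being dominated by the background class.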
ISBN (print): 0818684976
An algorithm for tracking a person's head is presented. The head's projection onto the image plane is modeled as an ellipse whose position and size are continually updated by a local search combining the output of a module concentrating on the intensity gradient around the ellipse's perimeter with that of another module focusing on the color histogram of the ellipse's interior. Since these two modules have roughly orthogonal failure modes, they serve to complement one another. The result is a robust, real-time system that is able to track a person's head with enough accuracy to automatically control the camera's pan, tilt, and zoom in order to keep the person centered in the field of view at a desired size. Extensive experimentation shows the algorithm's robustness with respect to full 360-degree out-of-plane rotation, up to 90-degree tilting, severe but brief occlusion, arbitrary camera movement, and multiple moving people in the background.
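The idea of a local search that combines two complementary modules can be sketched as follows. This is an illustration only: the abstract does not say how the two outputs are combined or how histograms are compared, so the additive combination, the histogram-intersection measure, and the names `histogram_intersection` and `best_state` are all assumptions.

```python
def histogram_intersection(candidate, model):
    """Similarity in [0, 1] between two color histograms with equal bin counts.

    Assumption: the overlap is normalized by the candidate histogram's mass.
    """
    overlap = sum(min(c, m) for c, m in zip(candidate, model))
    total = sum(candidate)
    return overlap / total if total else 0.0


def best_state(states, gradient_score, color_score):
    """Local search: return the candidate ellipse state with the highest combined score.

    Assumption: both score functions return comparable values and are
    combined by simple addition.
    """
    return max(states, key=lambda s: gradient_score(s) + color_score(s))
```

With this structure, a state that scores poorly in one module (for example, a weak gradient against a cluttered background) can still be selected when the other module scores it highly.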
ISBN (digital): 9781728125060
ISBN (print): 9781728125060
Detecting spoofing attacks plays a vital role in deploying automatic face recognition for biometric authentication in applications such as access control, face payment, device unlock, etc. In this paper, we propose a new anti-spoofing network architecture that takes advantage of multi-modal image data and aggregates intra-channel features at multiple network layers. We also transfer strong facial features learned for face recognition and show their benefits for detecting spoofing attacks. Finally, to increase the generalization ability of our method to unseen attacks, we use an ensemble of models trained separately for distinct types of spoofing attacks. The proposed method achieves state-of-the-art results on the largest multi-modal anti-spoofing dataset, CASIA-SURF [26].
ISBN (print): 9781728125060
Skeletonization is a process that aims to extract a line-like representation of an object's shape, the skeleton, which is of great interest for optical character recognition, shape-based object matching, recognition, biomedical image analysis, etc. Existing methods for skeleton extraction are typically based on topological, morphological, or distance transform methods; they are known to be sensitive to noise on the boundary and require a post-processing procedure to prune redundant branches. In this work, we introduce a U-Net-based approach for direct skeleton extraction of the object within the Pixel SkelNetOn CVPR 2019 challenge, inspired by the success of CNNs in skeleton extraction from real images. The main idea of our approach is to consistently edit a skeleton mask by feature propagation through layers of different scales. This contrasts with generating the final skeleton from object shape representations at different scales, as occurs in approaches with deep supervision for skeleton extraction from real images. Our U-Net-based model achieved a 0.75 F1-score on the validation set, and an ensemble of eight identical models, trained on different data subsets, achieved a 0.7846 F1-score on the test data.
ISBN (print): 9781728125060
Face anti-spoofing detection is a crucial procedure in biometric face recognition systems. State-of-the-art approaches based on Convolutional Neural Networks (CNNs) present good results in this field. However, previous works focus on single-modal data with a limited number of subjects. The recently published CASIA-SURF dataset is the largest such dataset, consisting of 1000 subjects and 21000 video clips with 3 modalities (RGB, Depth and IR). In this paper, we propose a multi-stream CNN architecture called FaceBagNet to make full use of this data. The input to FaceBagNet is patch-level images, which helps extract spoof-specific discriminative information. In addition, to prevent overfitting and to better learn the fusion features, we design a Modal Feature Erasing (MFE) operation on the multi-modal features, which erases features from one randomly selected modality during training. As a result, our approach won second place in the CVPR 2019 ChaLearn Face Anti-spoofing attack detection challenge. Our final submission achieved a score of 99.8052% (TPR@FPR = 10e-4) on the test set.
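The Modal Feature Erasing operation described above can be sketched in a few lines. This is a simplified illustration, not the FaceBagNet implementation: it assumes each modality's features are a flat list of floats, that erasing means setting them to zero, and that one modality is erased on every training step; the function name `modal_feature_erasing` and its parameters are hypothetical.

```python
import random


def modal_feature_erasing(features, training=True, rng=random):
    """Zero out the feature vector of one randomly chosen modality.

    `features` maps a modality name (e.g. "rgb", "depth", "ir") to a list of floats.
    At inference time (training=False) the features are returned unchanged.
    """
    if not training:
        return {name: list(vec) for name, vec in features.items()}
    # Assumption: exactly one modality is erased on every training step.
    erased = rng.choice(sorted(features))
    return {
        name: [0.0] * len(vec) if name == erased else list(vec)
        for name, vec in features.items()
    }
```

Dropping a whole modality at random forces the fusion layers to avoid relying on any single stream, which is the overfitting safeguard the abstract describes.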
ISBN (print): 9781424469840
Many computer vision and pattern recognition problems may be posed by defining a way of measuring dissimilarities between patterns. For many types of data, these dissimilarities are not Euclidean, and may not be metric. In this paper, we provide a means of embedding such data. We aim to embed the data on a hypersphere whose radius of curvature is determined by the dissimilarity data. The hypersphere can be either of positive curvature (elliptic) or of negative curvature (hyperbolic). We give an efficient method for solving the elliptic and hyperbolic embedding problems on symmetric dissimilarity data. This method gives the radius of curvature and a method for approximating the objects as points on a hyperspherical manifold. We apply our method to a variety of data including shape-similarities, graph-similarity and gesture-similarity data. In each case the embedding maintains the local structure of the data while placing the points in a metric space.
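The link between dissimilarities and points on a curved manifold can be illustrated with standard geometry. The sketch below assumes the dissimilarities are treated as geodesic distances on a manifold of known radius, which the abstract does not state; it shows only the resulting inner-product matrices, not the paper's procedure for estimating the radius or recovering coordinates. The function names are hypothetical.

```python
import math


def elliptic_gram(dissimilarities, radius):
    """Inner products r^2 * cos(d_ij / r) for points on a sphere of radius r."""
    return [
        [radius ** 2 * math.cos(d / radius) for d in row]
        for row in dissimilarities
    ]


def hyperbolic_gram(dissimilarities, radius):
    """Pseudo-Euclidean inner products -r^2 * cosh(d_ij / r) for the hyperbolic case."""
    return [
        [-radius ** 2 * math.cosh(d / radius) for d in row]
        for row in dissimilarities
    ]
```

In the elliptic case a zero dissimilarity yields the squared radius, and a dissimilarity of a quarter of the circumference yields orthogonal position vectors.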
ISBN (print): 9781424439942
We propose an adaptive and effective multimodal peripheral-fovea sensor design for real-time target tracking. This design is inspired by biological vision systems and achieves real-time target detection and recognition with a hyperspectral/range fovea and a panoramic peripheral view. A realistic scene simulation approach is used to evaluate our sensor design and the related data exploitation algorithms before a real sensor is made. The goal is to reduce development time and system cost while achieving optimal results through an iterative process that incorporates simulation, sensing, processing and evaluation. Important issues such as multimodal sensory component integration, region of interest extraction, target tracking, hyperspectral image analysis and target signature identification are discussed.
ISBN (print): 0818684976
The organization of image databases can rely upon different aspects of image similarity. Here we extract silhouettes from images of three-dimensional objects, and rely upon curve similarity for image classification. Our scheme avoids the embedding of images in a vector space. Instead, we propose a curve dissimilarity measure which relies upon a novel syntactic curve matching algorithm, and use it to represent the database as a complete graph, with nodes representing the images and dissimilarity values assigning weights to the edges. A robust clustering algorithm, which is based on a physical ferromagnet model, is used to find the hierarchical structure underlying the collection of images. We tested our scheme with a database of 90 real images of 6 objects, some of them very different, others rather similar. We get a perfect hierarchical classification of these images into 6 classes of objects belonging to 3 different families.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
Automatic target recognition, which involves detecting and recognizing potential targets automatically, is widely used in civilian and military applications today. Quadratic correlation filters were introduced as two-class recognition classifiers for quickly detecting targets in cluttered scene environments. In this paper, we introduce two methods that integrate the discrimination capability of quadratic correlation filters with the multi-class recognition ability of multilayer neural networks. For mid-wave infrared imagery, the proposed methods are demonstrated to be multi-class target recognition classifiers with very high accuracy.