ISBN (digital): 9781538661000
ISBN (print): 9781538661000
With recent advances in Convolutional Neural Networks (CNNs) for computer vision, there has been rapid progress in extracting roads and other features from satellite imagery for mapping and other purposes. In this paper, we propose a new method for road extraction using stacked U-Nets with multiple outputs. A hybrid loss function is used to address the problem of unbalanced classes in the training data. Post-processing methods, including road map vectorization and shortest path search with hierarchical thresholds, help improve recall. The overall improvement in mean IoU compared to the vanilla VGG network is more than 20%.
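The abstract does not say which terms make up the hybrid loss. As a minimal sketch, assuming a weighted sum of binary cross-entropy and a soft Jaccard (IoU) term, which is one common way to counter road/background imbalance, the idea could look like this. The function name `hybrid_loss` and the mixing weight `alpha` are hypothetical and not taken from the paper.

```python
import math


def hybrid_loss(probs, targets, alpha=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and soft Jaccard (IoU) loss.

    probs   -- predicted road probabilities in [0, 1], one per pixel
    targets -- ground-truth labels, 0 (background) or 1 (road)
    alpha   -- hypothetical mixing weight between the two terms (assumption)
    """
    n = len(probs)
    # Pixel-wise cross-entropy: every pixel counts equally, so the
    # majority (background) class dominates this term.
    bce = -sum(
        t * math.log(max(p, eps)) + (1 - t) * math.log(max(1 - p, eps))
        for p, t in zip(probs, targets)
    ) / n
    # Soft Jaccard: based on overlap with the road class, so it is less
    # sensitive to how many background pixels there are.
    intersection = sum(p * t for p, t in zip(probs, targets))
    union = sum(probs) + sum(targets) - intersection
    jaccard = (intersection + eps) / (union + eps)
    return alpha * bce + (1 - alpha) * (1 - jaccard)
```

A prediction that matches the labels exactly gives a loss close to zero, and the loss grows as the predicted road pixels drift away from the labeled ones.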
Object recognition in satellite images is one of the most relevant and popular topics in pattern recognition. This has been facilitated by many factors, such as the large number of satellites providing high-resolution imagery, significant developments in computer vision, especially the major breakthrough in convolutional neural networks, a wide range of industry verticals for its use, and a market that is still largely empty. Roads are among the most popular objects for recognition. In this article, we present a combination of a neural network and a post-processing algorithm that yields not only the coverage mask but also vectors for all of the individual roads present in the image, which can be used to address higher-level tasks in the future. This approach was used to solve the DeepGlobe Road Extraction Challenge.
In this paper, we propose a compact frame-based framework for facial expression recognition that achieves very competitive performance with respect to state-of-the-art methods while using far fewer parameters. The proposed framework is extended to a frame-to-sequence approach by exploiting temporal information with gated recurrent units. In addition, we develop an illumination augmentation scheme to alleviate the over-fitting problem when training deep networks with hybrid data sources. Finally, we demonstrate the performance improvement obtained with the proposed technique on several public datasets.
In this paper, we study deep transfer learning as a way of overcoming object recognition challenges encountered in the field of digital pathology. Through several experiments, we investigate various uses of pre-trained neural network architectures and different combination schemes with random forests for feature selection. Our experiments on eight classification datasets show that densely connected and residual networks consistently yield the best performance across strategies. It also appears that network fine-tuning and the use of inner-layer features are the best-performing strategies, with the former yielding slightly superior results.
In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). The CNN architecture outputs rotated rectangles, providing a symbolized approximation that works well for small buildings. Experiments are conducted on the four cities in the DeepGlobe Challenge dataset (Las Vegas, Paris, Shanghai, Khartoum). Our method performs best on suburbs consisting of individual houses. These experiments show that either large buildings or buildings without clear delineation produce weaker results in terms of precision and recall.
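To illustrate how a rotated-rectangle output can be turned into a drawable footprint, here is a minimal sketch. It assumes the rectangle is parameterized by center, width, height, and rotation angle in degrees. This parameterization and the function name `rotated_rect_corners` are assumptions for illustration; the abstract does not state the encoding the network actually uses.

```python
import math


def rotated_rect_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rectangle rotated about its center.

    (cx, cy)  -- rectangle center
    w, h      -- side lengths along the rectangle's own axes
    angle_deg -- counter-clockwise rotation in degrees (assumed convention)
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = []
    # Walk the corners of the axis-aligned rectangle, then rotate each
    # offset by the angle and shift it to the center.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners
```

The resulting four points can be passed to any polygon-drawing or vector-export routine to symbolize the building on a map.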
The land cover classification task of the DeepGlobe Challenge presents significant obstacles even to state-of-the-art segmentation models due to a small amount of data, incomplete and sometimes incorrect labeling, and highly imbalanced classes. In this work, we show an approach based on the U-Net architecture with the Lovász-Softmax loss that successfully alleviates these problems, and we compare several different convolutional architectures for U-Net encoders.
We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance with relatively promising results.
We present a semantic segmentation algorithm for RGB remote sensing images. Our method is based on the Dilated Stacked U-Nets architecture. This state-of-the-art method has been shown to have good performance in other applications. We perform additional post-processing by blending image tiles and degridding the result. Our method gives competitive results on the DeepGlobe dataset.
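The abstract does not describe how the tiles are blended. As a rough sketch, assuming the simplest scheme of uniformly averaging the predictions wherever tiles overlap, the step could be written as follows. The function name `blend_tiles` and the `(top, left, patch)` tile layout are hypothetical.

```python
def blend_tiles(tiles, height, width):
    """Average overlapping tile predictions into one full-size score map.

    tiles  -- list of (top, left, patch); patch is a 2D list of scores
              placed with its upper-left corner at (top, left)
    height -- number of rows in the full output map
    width  -- number of columns in the full output map
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    # Accumulate scores and per-pixel coverage counts for every tile.
    for top, left, patch in tiles:
        for i, row in enumerate(patch):
            for j, value in enumerate(row):
                total[top + i][left + j] += value
                count[top + i][left + j] += 1
    # Divide by coverage; pixels no tile covers are left at 0.0.
    return [
        [total[y][x] / count[y][x] if count[y][x] else 0.0 for x in range(width)]
        for y in range(height)
    ]
```

For example, two 1x2 tiles with scores 1.0 and 3.0 that overlap by one pixel blend to 2.0 in the shared pixel. Weighted schemes that down-weight tile borders are a common refinement, but they are not shown here.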
Automatic target recognition, which involves detecting and recognizing potential targets automatically, is widely used in civilian and military applications today. Quadratic correlation filters were introduced as two-class recognition classifiers for quickly detecting targets in cluttered scene environments. In this paper, we introduce two methods that integrate the discrimination capability of quadratic correlation filters with the multi-class recognition ability of multilayer neural networks. For mid-wave infrared imagery, the proposed methods are demonstrated to be multi-class target recognition classifiers with very high accuracy.