Semantic segmentation of land in remote sensing images plays an important role in urban management and rural planning, and can provide intelligent analysis for urban development. Convolutional neural network (CNN) bas...
Nowadays, the digital earth not only relates to the technologies of surveying and mapping geography, but also includes the analysis and cross-application of various scientific data related to geographic information. I...
Objects in remote sensing images are difficult to detect because of their arbitrary rotation angles and wide variation in scale. Because the bounding box plays an important role in object detection, we propose a new bounding box type named rotated bounding box (rBox). Using rBox, we have proposed a series of detection techniques (DrBox, DrBoxLight, DrBoxSemi, DrBoxPro) that effectively handle objects with arbitrary orientation angles. This article is a brief overview of these techniques. The original DrBox detector uses VGG-net as its main network framework, with image pyramid input to address the multi-scale problem. DrBoxLight is a lightweight version of DrBox that applies MobileNet and knowledge distillation so that it can be deployed on embedded devices. DrBoxSemi is a semi-supervised version of DrBox, so annotating all training samples is no longer necessary. DrBoxPro is the most important update to DrBox, with a large set of carefully designed prior rBoxes on feature pyramid networks. In our experiments, we demonstrate how rBox improves object detection performance compared with traditional bounding boxes. We also evaluate the performance of the DrBox family on a series of object detection tasks.
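To make the rotated bounding box idea concrete, here is a minimal sketch, assuming an rBox is parameterized by a center, a width, a height, and an angle in degrees. This parameterization and the helper names `rbox_corners` and `enclosing_axis_aligned_box` are illustrative assumptions, not taken from the DrBox papers. The sketch only shows why a rotated box can fit an obliquely oriented object more tightly than an axis-aligned one.

```python
import math


def rbox_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated box as (x, y) tuples.

    Hypothetical parameterization: center (cx, cy), size (w, h),
    counter-clockwise rotation angle in degrees.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    # Offsets of the corners from the center before rotation.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


def enclosing_axis_aligned_box(corners):
    """Return (xmin, ymin, xmax, ymax) of the tightest axis-aligned box."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    corners = rbox_corners(0.0, 0.0, 4.0, 2.0, 45.0)
    xmin, ymin, xmax, ymax = enclosing_axis_aligned_box(corners)
    print("rotated box area:", 4.0 * 2.0)
    print("axis-aligned box area:", (xmax - xmin) * (ymax - ymin))
```

For a 4 x 2 object rotated by 45 degrees, the enclosing axis-aligned box covers a noticeably larger area than the rotated box, so it includes more background.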
Owing to its invariance to image contrast and its ability to identify various types of features, the phase congruency (PC) model has been widely used in remote sensing applications. However, when PC is directly applied to optical and synthetic aperture radar (SAR) image registration, it fails to handle large radiometric and geometric differences. In this paper, we propose an automatic algorithm to solve this problem. First, evenly distributed keypoints are extracted from the optical images via the block Harris method. Complementary grid points are selected in image regions with poor structure and texture information. Then a robust similarity metric based on the improved PC model is proposed. Since the two images show diverse properties, we utilize two different PC models, the traditional PC and the SAR-PC. The PC values of several directions are aggregated to construct feature descriptors, from which a similarity metric using the normalized correlation coefficient (NCC) is obtained. We compare the proposed metric with two baselines (mutual information and NCC) and a state-of-the-art method (histogram of oriented phase congruency, HOPC) in various scenarios. The results show that our method outperforms the baselines, performs comparably to HOPC in regions with abundant structure information, and performs better in untextured regions.
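As an illustration of the similarity measure named above, the following is a minimal sketch of the plain normalized correlation coefficient between two equal-length descriptor vectors. It does not implement the PC or SAR-PC descriptors themselves. The function name `ncc` and the choice to return 0.0 for a constant vector are assumptions made for this example.

```python
import math


def ncc(a, b):
    """Normalized correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Subtract the means so that a constant offset does not affect the score.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # Assumption for this sketch: a constant vector carries no structure.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


if __name__ == "__main__":
    descriptor = [0.1, 0.4, 0.9, 0.3]
    # A linear gain and offset leaves the NCC score essentially unchanged.
    rescaled = [10.0 * v + 5.0 for v in descriptor]
    print(ncc(descriptor, rescaled))
```

Because the score is unaffected by a linear gain and offset, NCC tolerates some radiometric difference between two descriptors, which is one reason it is a common choice for template matching.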
Feature Normalization (FN) is an important technique to help neural network training, which typically normalizes features across spatial dimensions. Most previous image inpainting methods apply FN in their networks wi...
The dominant scattering mechanism is of great significance for ground object classification and target detection. It can also be used to verify the quality of polarimetric data by checking the dominant scattering mechanism of known ground objects. To improve application performance, this paper studies the dominant scattering mechanism of typical GF-3 ground objects based on a large number of data slices. The GF-3 fully polarimetric data slices are classified based on the MODIS global classification map, and a GF-3 slice library of typical ground objects is constructed. Based on a large number of GF-3 samples, we carry out a statistical analysis of the dominant scattering mechanism separation results for typical GF-3 ground objects (building, woodland, cultivated land, grassland and waters) by means of H/alpha/A decomposition. The quantitative results reveal the polarimetric scattering features of different ground objects and provide a reference for fully polarimetric SAR applications.
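For readers unfamiliar with the decomposition, the following sketch computes two of its three parameters, the entropy H and the anisotropy A, from the three eigenvalues of a coherency matrix using the standard Cloude-Pottier definitions. The mean alpha angle also needs the eigenvectors and is omitted here. The function name and the example eigenvalues are hypothetical and are not taken from the GF-3 study.

```python
import math


def entropy_and_anisotropy(eigenvalues):
    """Return (H, A) from three non-negative coherency-matrix eigenvalues."""
    lams = sorted(eigenvalues, reverse=True)
    if len(lams) != 3 or any(lam < 0 for lam in lams):
        raise ValueError("expected three non-negative eigenvalues")
    total = sum(lams)
    if total == 0:
        raise ValueError("eigenvalues must not all be zero")
    # Pseudo-probabilities of the three scattering mechanisms.
    probs = [lam / total for lam in lams]
    # Entropy uses a base-3 logarithm so that H lies in [0, 1].
    h = -sum(p * math.log(p, 3) for p in probs if p > 0)
    # Anisotropy compares the second and third eigenvalues.
    denom = lams[1] + lams[2]
    a = (lams[1] - lams[2]) / denom if denom > 0 else 0.0
    return h, a


if __name__ == "__main__":
    # Hypothetical eigenvalues with one clearly dominant mechanism.
    print(entropy_and_anisotropy([0.8, 0.15, 0.05]))
```

Low entropy indicates a single dominant scattering mechanism, while entropy near 1 indicates that the three mechanisms contribute almost equally.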
This study introduces an automatic method for change detection of multi-sensor remote-sensing images (e.g. optical and synthetic aperture radar (SAR) images). As object-based image analysis can effectively reduce the ...
A comprehensive comparison of the trends and drivers of global surface and canopy urban heat islands (termed Is and Ic trends, respectively) is critical for better designing urban heat mitigation strategies. However, ...
In this paper, we propose a novel deep architecture with multiple classifiers for continuous sign language recognition. Representing the sign video with a 3D convolutional residual network and a bidirectional LSTM, we formulate continuous sign language recognition as a grammatical-rule-based classification problem. We first split a sign language sentence into isolated words and n-grams, where an n-gram is a sequence of n consecutive words in a sentence. Then, we propose a word-independent classifiers (WIC) module and an n-gram classifier (NGC) module to identify the words and n-grams in a sentence, respectively. A greedy decoding algorithm is employed to integrate words and n-grams into the sentence based on the confidence scores provided by both modules. Our method is evaluated on a Chinese continuous sign language recognition benchmark, and the experimental results demonstrate its effectiveness.
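The splitting step described above can be sketched as follows. This is only an illustration of extracting words and n-grams from a tokenized sentence, not the WIC or NGC modules or the greedy decoder. The function name `ngrams` and the example gloss sequence are hypothetical.

```python
def ngrams(words, n):
    """Return all sequences of n consecutive words as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sentence shorter than n words yields an empty list.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


if __name__ == "__main__":
    # Hypothetical gloss sequence for one sign language sentence.
    sentence = ["I", "GO", "SCHOOL", "TOMORROW"]
    print(ngrams(sentence, 1))  # isolated words
    print(ngrams(sentence, 2))  # bigrams
```

A sentence of k words yields k - n + 1 n-grams, so the number of n-gram targets shrinks as n grows.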
Objective quality assessment of stereoscopic panoramic images has become a challenging problem owing to the rapid growth of 360-degree content. Unlike traditional 2D image quality assessment (IQA), 3D omnidirectional IQA involves more complex aspects, especially an unlimited field of view (FoV) and extra depth perception, which make it difficult to evaluate the quality of experience (QoE) of 3D omnidirectional images. In this paper, we propose a multi-viewport-based full-reference stereo 360 IQA model. Because viewports change freely when browsing in a head-mounted display, our proposed approach processes the image inside the FoV rather than a projected one such as the equirectangular projection (ERP). In addition, since overall QoE depends on both image quality and depth perception, we utilize features estimated from the difference map between the left and right views, which can reflect disparity. The depth perception features, along with binocular image qualities, are employed to further predict the overall QoE of 3D 360 images. The experimental results on our public Stereoscopic OmnidirectionaL Image quality assessment Database (SOLID) show that the proposed method achieves a significant improvement over some well-known IQA metrics and can accurately reflect the overall QoE of perceived images.