This paper addresses how to coordinate depth with RGB more effectively in order to boost the performance of RGB-D object detection. In particular, we investigate two primary ideas under the CNN model: property derivation and property fusion. First, we propose that depth can be used not only as extra information alongside RGB but also to derive more visual properties that describe the objects of interest comprehensively. We therefore construct a two-stage learning framework consisting of property derivation and fusion. The properties can be derived from the provided color or depth alone, or from color/depth pairs (e.g., the geometry contour adopted in this paper). Second, we explore how to fuse different properties in feature learning, which under the CNN model boils down to the layer at which the properties should be fused. The analysis shows that different semantic properties should be learned separately and combined before being passed to the final classifier. Such a detection scheme is in accordance with the mechanism of the primary visual cortex (V1) in the brain. We experimentally evaluate the proposed method on a challenging dataset and achieve state-of-the-art performance.
ISBN: 9781509053179 (print)
Content-aware image retargeting has attracted substantial research interest. However, no existing method adequately preserves important image content and structure without introducing conspicuous visible deformation while also running in a relatively short time. To address this problem, we propose a Fast Genetic Multi-operator (FGM) method that integrates multiple retargeting operators. To improve efficiency, FGM uses Genetic Algorithms (GAs) to find the optimal operator ratio, with an energy function based on saliency and the Gray-Level Co-occurrence Matrix (GLCM). FGM not only preserves salient content and structure well but also greatly reduces computational complexity. Experimental results demonstrate that our method outperforms state-of-the-art image retargeting methods.
The directional intra prediction (DIP) modes in HEVC are capable of predicting local continuous image features. Recently, intra block copy (IBC) was proposed for screen content coding, aiming at predicting non-local recurrent image features. For natural video, we observe that recurrent features are often irregular and not aligned with blocks. We therefore propose a combination of DIP and IBC with block partitioning for better intra prediction, in which one block can be divided into several partitions, each of which may choose between DIP and IBC. We study an intra prediction scheme built on the proposed combination, in particular its rate-distortion optimization and entropy coding. Preliminary experimental results show that the proposed combined intra prediction achieves up to 5.8% bit-rate saving compared to the HEVC anchor.
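The per-partition choice between DIP and IBC can be illustrated with a generic Lagrangian rate-distortion decision. This is a minimal sketch, not the scheme from the paper: the function name `choose_mode`, the cost form J = D + lambda * R, and the example numbers are all assumptions made for illustration.

```python
def choose_mode(candidates, lam):
    """Pick the prediction mode with the lowest Lagrangian cost.

    candidates: dict mapping a mode name to (distortion, rate_in_bits).
    lam: Lagrange multiplier trading rate against distortion (assumed given).
    Returns (best_mode, best_cost).
    """
    # J = D + lambda * R for every candidate mode
    costs = {mode: d + lam * r for mode, (d, r) in candidates.items()}
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical distortion/rate pairs for one partition.
partition = {"DIP": (100.0, 20.0), "IBC": (60.0, 35.0)}
mode, cost = choose_mode(partition, lam=2.0)
```

With a small multiplier the lower-distortion candidate tends to win, while a larger multiplier favors the cheaper-to-signal one, which is why the decision has to be made per partition.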
ISBN: 9781509053179 (print)
In all existing block-based image and video coding standards, blocks are processed in a fixed scan order. As a result, intra prediction in HEVC intra coding is always based on the top and/or left neighboring reconstructed pixels, which yields less accurate prediction for blocks whose spatial correlation does not run along the top-left-to-bottom-right direction. To obtain better intra prediction, we propose to flexibly determine the coding order of blocks in HEVC intra coding. Complying with the hierarchical quadtree structure in HEVC, our flexible block ordering (FBO) technique recursively decides the coding order of the four sub-blocks when a block is split. Moreover, we propose new interpolation/extrapolation methods for intra prediction that fully utilize the neighboring reconstructed pixels, which are no longer always at the top/left. Experimental results show that the proposed FBO technique achieves on average 2.9% BD-rate reduction compared to the HEVC baseline.
ISBN: 9781479953424 (print)
The Discrete Cosine Transform (DCT) has been the commonly used transform in image/video coding for a few decades. However, the DCT does not work well on blocks with anisotropic correlations. In this paper, we propose a new online transform scheme for High Efficiency Video Coding (HEVC) that is based on an adaptive dictionary and uses Orthogonal Matching Pursuit (OMP). For each coding block, we construct its dictionary by exploiting non-local correlations from the reconstructed regions. The OMP algorithm is then applied to obtain the sparse transform coefficients. Experimental results show that the BD-rate savings of the proposed scheme can be up to 19.9% for sequences with strong edges.
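The OMP step can be sketched as follows. This is a generic textbook version of OMP in plain Python, not the HEVC integration described in the abstract; the names `omp`, `dot`, and `solve_least_squares` are hypothetical, and the atoms are assumed to be unit-norm and linearly independent.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_least_squares(atoms, y):
    """Least-squares coefficients for y over the given atoms (normal equations)."""
    k = len(atoms)
    # Augmented Gram system [G | b], with G[i][j] = <a_i, a_j> and b[i] = <a_i, y>
    g = [[dot(atoms[i], atoms[j]) for j in range(k)] + [dot(atoms[i], y)]
         for i in range(k)]
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(g[r][col]))
        g[col], g[pivot] = g[pivot], g[col]
        for r in range(col + 1, k):
            f = g[r][col] / g[col][col]
            for c in range(col, k + 1):
                g[r][c] -= f * g[col][c]
    # Back substitution
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        s = g[i][k] - sum(g[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = s / g[i][i]
    return coeffs


def omp(dictionary, y, n_nonzero, tol=1e-9):
    """Greedy sparse coding: returns ({atom_index: coefficient}, residual)."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(min(n_nonzero, len(dictionary))):
        if dot(residual, residual) <= tol:
            break
        # Select the unused atom most correlated with the current residual
        best = max((i for i in range(len(dictionary)) if i not in support),
                   key=lambda i: abs(dot(dictionary[i], residual)))
        support.append(best)
        # Re-fit all selected coefficients jointly (the "orthogonal" step)
        coeffs = solve_least_squares([dictionary[i] for i in support], y)
        approx = [sum(c * dictionary[i][t] for c, i in zip(coeffs, support))
                  for t in range(len(y))]
        residual = [a - b for a, b in zip(y, approx)]
    return dict(zip(support, coeffs)), residual
```

The joint re-fit after each selection is what distinguishes OMP from plain matching pursuit: the residual stays orthogonal to every atom chosen so far.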
ISBN: 9781467372596 (print)
We propose a novel superpixel algorithm based on the Minimum Spanning Tree (MST) that generates superpixels efficiently while strictly adhering to object boundaries. The MST, which is built by gradually removing strong edges from the graph extracted from the image, is more sensitive to local image structures. An efficient hierarchical clustering strategy is therefore employed in our algorithm to segment the input image into superpixels based on the tree distance. To gradually merge image pixels and remove texture noise, we propose a multi-layer scheme with superpixels at different resolutions. In each layer, the graph is constructed from the lower layer and segmented into superpixels with complexity linear in the number of graph nodes. Because the number of nodes decreases exponentially from layer to layer, the computation time of our method is concentrated in the first few layers and is linear in the number of image pixels. Experimental results on the Berkeley Segmentation Dataset demonstrate that our method outperforms state-of-the-art methods in terms of both structure preservation and computational efficiency.
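As a rough illustration of graph-based grouping on an image grid, the sketch below merges 4-connected pixels in Kruskal order with a union-find structure and stops at a weight threshold. It is a simplified stand-in, not the multi-layer tree-distance algorithm of the abstract; the function name `segment_by_mst`, the absolute-intensity-difference edge weight, and the `max_weight` parameter are assumptions.

```python
def segment_by_mst(image, max_weight):
    """Label pixels of a grayscale image (list of rows) by Kruskal-style merging.

    Edges of the 4-connected grid are visited in increasing weight order, as in
    Kruskal's MST construction, and edges heavier than max_weight are dropped,
    so each remaining tree is one segment.
    """
    h, w = len(image), len(image[0])
    parent = list(range(h * w))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = []
    for r in range(h):
        for c in range(w):
            if c + 1 < w:
                edges.append((abs(image[r][c] - image[r][c + 1]),
                              r * w + c, r * w + c + 1))
            if r + 1 < h:
                edges.append((abs(image[r][c] - image[r + 1][c]),
                              r * w + c, (r + 1) * w + c))
    edges.sort()

    for weight, a, b in edges:
        if weight > max_weight:
            break  # all remaining edges are "strong" and stay cut
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    return [[find(r * w + c) for c in range(w)] for r in range(h)]
```

Sorting the edges makes this sketch O(n log n) in the number of pixels, so it does not reproduce the linear complexity claimed in the abstract.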
The edges of the shadow region in a SAR image are blurred because the radar moves during data collection. This phenomenon becomes obvious in high-resolution SAR images. Shadow enhancement is of great value for automatic target recognition (ATR), especially when the scattering centers of the target itself are not clear. In this paper, an approach for shadow enhancement in SAR images of targets with plat structures is presented. Experiments on Mini-SAR data test the validity of the approach.
In this paper, we propose an interferometric synthetic aperture radar (InSAR) phase denoising method that utilizes both the local sparsity of wavelet coefficients and the nonlocal similarity of grouped blocks. The derived nonlocal wavelet shrinkage uses double L1-norm restrictions, which enforce local and nonlocal sparsity constraints through efficient shrinkage operators. The method exploits the nonlocal similarity of coefficients between grouped blocks during wavelet shrinkage and thereby improves the accuracy of the filtering result. Experimental results on InSAR phase image denoising with simulated and real noisy data show that the proposed method outperforms the state of the art, with lower root-mean-square error and less noisy fringes, making it possible to filter phase noise effectively.
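For reference, the shrinkage operator most commonly associated with an L1-norm penalty is soft thresholding. The sketch below shows that operator for real-valued coefficients only; whether the paper uses exactly this form, and how it combines the local and nonlocal terms, is not stated in the abstract, so the function names and the threshold `t` are assumptions.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def shrink_coefficients(coeffs, t):
    """Apply soft thresholding to a list of real wavelet coefficients."""
    return [soft_threshold(c, t) for c in coeffs]
```

Coefficients with magnitude below the threshold are set to zero, which is what produces the sparsity; larger coefficients are kept but reduced by `t`.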
ISBN: 9781509033331 (print)
Scattering structure features of targets are of great importance for Synthetic Aperture Radar (SAR) image analysis. In this paper, a novel algorithm for aircraft recognition in high-resolution SAR images of apron areas is proposed. The algorithm combines the strengths of a gradient saliency map and scattering structure features to improve accuracy and efficiency. Specifically, a Constant False-Alarm Rate (CFAR) algorithm is first applied to segment the images. Next, a new efficient object locating method based on a directional local gradient map is proposed to detect aircraft targets. The candidate slices and the template slices are then modeled using Gaussian Mixture Models (GMMs), which are treated as structure features. In the recognition stage, a novel similarity measure between GMMs based on the Kullback-Leibler divergence is proposed for classification. We conduct experiments on a dataset with 3.0 m resolution, and the recognition results demonstrate the accuracy of the proposed method.
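The CFAR segmentation step can be illustrated with a one-dimensional cell-averaging CFAR detector. The abstract does not say which CFAR variant is used or how it is applied in two dimensions, so this is only a sketch under assumptions: the function name `ca_cfar` and the `guard`, `train`, and `scale` parameters are hypothetical.

```python
def ca_cfar(signal, guard, train, scale):
    """Return indices whose value exceeds scale * (local background mean).

    For each cell under test, the background is estimated from `train` cells
    on each side, skipping `guard` cells next to the cell under test so that
    the target's own energy does not inflate the estimate.
    """
    n = len(signal)
    detections = []
    for i in range(n):
        cells = [signal[j]
                 for j in range(i - guard - train, i + guard + train + 1)
                 if 0 <= j < n and abs(j - i) > guard]
        if cells and signal[i] > scale * sum(cells) / len(cells):
            detections.append(i)
    return detections
```

Because the threshold adapts to the local background level, the false-alarm rate stays roughly constant across regions of different clutter intensity, which is the property the method is named after.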
Two approximations, the center-beam approximation and the reference digital elevation model (DEM) approximation, are used in synthetic aperture radar (SAR) motion compensation procedures. They usually introduce residual motion compensation errors in airborne single-antenna SAR imaging and SAR interferometry. In this paper, we investigate the effects of the residual uncompensated motion errors caused by these two approximations on the performance of airborne along-track interferometric SAR (ATI-SAR). The residual uncompensated errors caused by the center-beam approximation are derived both in the absence and in the presence of elevation errors. Airborne simulation parameters are used to verify the correctness of the analysis and to show the impact of the residual uncompensated errors on the interferometric phase errors for ATI-SAR. It is shown that the interferometric phase errors caused by the center-beam approximation can be neglected when the DEM is accurate, but cannot be neglected with an inaccurate DEM once the elevation errors exceed a threshold. This research provides a theoretical basis for the error source analysis and signal processing of airborne ATI-SAR.