Light field image quality assessment (LF-IQA) plays a significant role because it guides the acquisition, processing, and application of light field (LF) content. The LF can be represented as a 4-D signal, and its quality...
The past decade has witnessed the great success of deep learning in many disciplines, especially computer vision and image processing. However, deep learning-based video coding remains in its infancy. This paper reviews representative works on using deep learning for image/video coding, which has been an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks (deep schemes), and deep network-based coding tools (deep tools) that are used within traditional coding schemes or together with traditional coding tools. For deep schemes, pixel probability modeling and auto-encoders are the two approaches, which can be viewed as predictive coding and transform coding schemes, respectively. For deep tools, several techniques have been proposed that use deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. According to the newest reports, deep schemes have achieved comparable or even higher compression efficiency than state-of-the-art traditional schemes, such as those based on High Efficiency Video Coding (HEVC), for image coding; deep tools have demonstrated compression capability beyond HEVC for video coding. However, deep schemes have not yet reached the level of HEVC for video coding, and deep tools remain largely unexplored in many aspects, including the tradeoff between compression efficiency and encoding/decoding complexity, the optimization for perceptual naturalness or semantic quality, speciality and universality, the federated design of multiple deep tools, and so on. In the hope of advocating research on deep learning-based video coding, we present a case study of our developed prototype video codec, namely Deep Learning Vi
We address the channel estimation problem in reconfigurable intelligent surface (RIS) aided broadband systems by proposing a dual-structure and multi-dimensional transformations (DS-MDT) algorithm. The proposed approa...
High-resolution SAR has a large transmitting bandwidth and a wide synthetic aperture. How to understand and exploit the variation of SAR scattering characteristics with angle and frequency is a topic worth studying. This article establishes a coherence matrix of sub-band and sub-aperture SAR images and analyzes its ability to classify scattering mechanisms. Experiments are conducted using TerraSAR-X high-resolution data of different scenarios, and some meaningful results are obtained, which may provide support for the analysis and application of high-resolution SAR data.
In High Efficiency Video Coding (HEVC), excellent rate-distortion (RD) performance is achieved in part by having a flexible quadtree coding unit (CU) partition and a large number of intra-prediction modes. Such an exc...
This paper proposes an extended version of our previous work, MS-CC, to achieve change detection between optical and SAR images. The proposed method introduces a cooperative multitemporal segmentation whose merging process treats the heterogeneity of SAR and optical images as parallel information, ensuring that the multitemporal information can be fully utilized without the two sources interfering with each other. Then, a change detection strategy based on compound classification is carried out on the segmentation results, yielding multi-scale change detection maps. Experimental validation is conducted with GaoFen-3 and Google Earth data.
Convolutional neural networks (CNNs) are powerful and have achieved state-of-the-art performance in many visual recognition tasks. Despite their impressive performance, CNNs are still unable to remain invariant when spatial transformations are applied to images. Herein, we propose representation-consistent neural networks to solve this problem. By introducing consistency losses between the representations of transformed images in different layers, the recognition performance on transformed images is significantly improved. The model learns not only to map the transformed images to the pre-defined labels; each layer also learns to generate invariant representations when the input images are transformed. All the characteristics of transformation invariance are embedded in the model, which means that no extra parameters or computations are introduced in the well-trained model. Comparative experiments demonstrate the superiority of our model when learning invariance to rotation, translation, and scaling on large-scale image recognition and retrieval tasks.
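The idea of a consistency loss between layer representations can be illustrated with a minimal sketch. The abstract does not give the exact form of the loss, so everything here is an assumption made for illustration: the mean-squared-difference penalty, the function names `consistency_loss` and `total_loss`, and the `weight` parameter are hypothetical and are not the authors' formulation.

```python
def consistency_loss(feats_orig, feats_trans):
    """Average, over layers, of the mean squared difference between the
    feature vector of the original image and that of the transformed image.
    ASSUMPTION: the paper's actual loss may differ; MSE is used for illustration."""
    total = 0.0
    for f_orig, f_trans in zip(feats_orig, feats_trans):
        sq_diff = sum((a - b) ** 2 for a, b in zip(f_orig, f_trans))
        total += sq_diff / len(f_orig)
    return total / len(feats_orig)


def total_loss(cls_loss, feats_orig, feats_trans, weight=1.0):
    """Classification loss plus a weighted consistency term.
    `weight` is a hypothetical trade-off hyperparameter."""
    return cls_loss + weight * consistency_loss(feats_orig, feats_trans)


# Toy per-layer features (one flat vector per layer) for an image
# and for a transformed copy of it.
orig_feats = [[0.0, 0.0], [1.0, 2.0]]
trans_feats = [[1.0, 1.0], [1.0, 2.0]]
print(total_loss(0.5, orig_feats, trans_feats, weight=2.0))
```

Because the penalty acts only during training, a sketch of this kind adds no parameters or computation at inference time, which matches the property the abstract describes.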
Objective quality assessment of stereoscopic panoramic images becomes a challenging problem owing to the rapid growth of 360-degree contents. Different from traditional 2D image quality assessment (IQA), more complex ...
We study the video super-resolution (SR) problem for facilitating video analytics tasks, e.g. action recognition, instead of for visual quality. The popular action recognition methods based on convolutional networks, ...
To meet the need for quality assessment of massive GF-3 polarimetric data, a method based on common distribution targets has been proposed by Sha Jiang. However, it requires manual selection of woodlands and cannot be performed automatically. In this paper, an automated GF-3 full-polarization SAR data quality assessment method is developed using a classic convolutional neural network (VGG-16). The network is pre-trained on Radarsat-2 PolSAR data and then trained on selected typical GF-3 scenes. It is expected to learn the features of targets that satisfy azimuthal symmetry and backscatter reciprocity, and thereby to carry out the quality assessment. Data from several typical GF-3 strips are used to test the method. Experiments show that the network can predict the plots of targets from a new scene under the interference of polarimetric distortion and noise. The quality assessment results produced by the network are also consistent with the manual assessment results, which shows the effectiveness of the method.