Search Results - Inner Mongolia University Library

A CNN-Based In-Loop Filter with CU Classification for HEVC

IEEE Visual Communications and Image Processing (VCIP)

Authors: Yuanying Dai, Dong Liu, Zheng-Jun Zha, Feng Wu. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781538644591; 9781538644584

Lossy compression of images and video yields visually annoying artifacts, including blocking, blurring, and ringing, especially at low bit rates. In-loop filtering techniques can reduce these artifacts, improve quality, and achieve coding gain accordingly. In this paper, we present a convolutional neural network (CNN) based in-loop filter for High Efficiency Video Coding (HEVC). First, we design a new CNN structure, namely VRCNN-ext, that is composed of multiple Variable-filter-size Residue-learning blocks, for artifact reduction. VRCNN-ext is trained on natural images together with their compressed versions at different quality levels. Second, we investigate a new in-loop filter based on the trained VRCNN-ext models. Specifically, we observed that using VRCNN-ext directly on inter pictures is not effective. To solve this problem, we further train a classifier to decide whether to use VRCNN-ext for each coding unit (CU). The classifier makes its decision based on the compressed information, thus avoiding the overhead bits otherwise needed to switch the CNN-based filter on or off at the CU level. Experimental results show that our scheme achieves significant bit savings over the HEVC anchor, with average BD-rate reductions of 9.2%, 9.6% and 7.4% on the HEVC test sequences under all-intra, low-delay B and random-access configurations, respectively.

Keywords: Decoding; Encoding; Training; Image coding; Feature extraction; Copper; Video coding
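
The CU-level switching described in the abstract above can be illustrated with a minimal Python sketch. It assumes a frame stored as a list of rows and square CUs of a fixed size. The names `apply_cu_gated_filter`, `high_variance` and `flatten_to_mean` are hypothetical stand-ins: the decision rule and the restoration function below are placeholders, not the paper's classifier or VRCNN-ext. The point is only that the on/off decision is computed from data the decoder also has, so no flag needs to be transmitted.

```python
from typing import Callable, List

Frame = List[List[float]]


def apply_cu_gated_filter(
    frame: Frame,
    cu_size: int,
    use_filter: Callable[[Frame], bool],
    restore: Callable[[Frame], Frame],
) -> Frame:
    """Apply `restore` only to the CUs for which `use_filter` returns True."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # copy so the input frame is left untouched
    for top in range(0, height, cu_size):
        for left in range(0, width, cu_size):
            # Cut out one CU (CUs at the frame edge may be smaller than cu_size).
            cu = [row[left:left + cu_size] for row in frame[top:top + cu_size]]
            # The decision depends only on decoded data, so encoder and decoder
            # reach the same answer without any signalled on/off bit.
            if use_filter(cu):
                for dy, row in enumerate(restore(cu)):
                    out[top + dy][left:left + len(row)] = row
    return out


# Hypothetical stand-ins (assumptions for illustration only).
def high_variance(cu: Frame) -> bool:
    values = [v for row in cu for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values) > 1.0


def flatten_to_mean(cu: Frame) -> Frame:
    values = [v for row in cu for v in row]
    mean = sum(values) / len(values)
    return [[mean] * len(row) for row in cu]
```

Calling `apply_cu_gated_filter(frame, 2, high_variance, flatten_to_mean)` rewrites only the CUs that the stand-in rule selects and leaves the others bit-identical.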

Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning

TechRxiv 2019

Authors: Huang, Zhongling; Dumitru, Corneliu Octavian; Pan, Zongxu; Lei, Bin; Datcu, Mihai. Affiliations: The Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Huairou District, Beijing 101408, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing, China; Wessling 82234, Germany

The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task that faces several difficulties: semantic annotation requires expertise, data characteristics change with imaging parameters and regional target areas, and the complex scattering mechanisms differ from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images, with a hierarchical three-level annotation of 150 categories and more than 100,000 patches, three main challenges in automatically interpreting SAR images are addressed: highly imbalanced classes, geographic diversity, and label noise. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). In addition, a top-2 smooth loss function with cost-sensitive parameters is introduced to tackle the label noise and class imbalance problems. The proposed method is highly efficient in transferring information from a similarly annotated remote sensing dataset, performs robustly on highly imbalanced classes, and alleviates the over-fitting caused by label noise. Moreover, the learned deep model generalizes well to other SAR-specific tasks, such as MSTAR target recognition, where it reaches a state-of-the-art classification accuracy of 99.46%. © 2019, CC BY.

Keywords: Synthetic aperture radar

An end-to-end foreground-aware network for person re-identification

arXiv 2019

Authors: Liu, Yiheng; Zhou, Wengang; Liu, Jianzhuang; Qi, Guojun; Tian, Qi; Li, Houqiang. Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; Noah’s Ark Lab, Huawei Technologies Company Limited, Shenzhen 518129, China; Huawei Cloud EI Product Department, Cloud & AI, Huawei Technologies

Person re-identification is the crucial task of identifying pedestrians of interest across multiple surveillance camera views. For person re-identification, a pedestrian is usually represented with features extracted from a rectangular image region that inevitably contains scene background, which introduces ambiguity when distinguishing different pedestrians and degrades accuracy. Thus, we propose an end-to-end foreground-aware network that discriminates the foreground from the background by learning a soft mask for person re-identification. In our method, in addition to the pedestrian ID as supervision for the foreground, we introduce the camera ID of each pedestrian image for background modeling. The foreground branch and the background branch are optimized collaboratively. By introducing a target attention loss, the pedestrian features extracted from the foreground branch become less sensitive to backgrounds, which greatly reduces the negative impact of changing backgrounds on pedestrian matching across different camera views. Notably, in contrast to existing methods, our approach does not require an additional dataset to train a human landmark detector or a segmentation model for locating the background regions. Experimental results on three challenging datasets, i.e., Market-1501, DukeMTMC-reID, and MSMT17, demonstrate the effectiveness of our approach. Copyright © 2019, The Authors. All rights reserved.

Keywords: Cameras

What, Where and How to Transfer in SAR Target Recognition Based on Deep CNNs

arXiv 2019

Authors: Huang, Zhongling; Pan, Zongxu; Lei, Bin. Affiliations: School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Huairou District, Beijing 101408, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China; Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China

Deep convolutional neural networks (DCNNs) have attracted much attention in remote sensing recently. Compared with the large-scale annotated datasets available for natural images, the lack of labeled data in remote sensing is an obstacle to training a deep network well, especially in SAR image interpretation. Transfer learning provides an effective way to solve this problem by borrowing knowledge from the source task for the target task. In optical remote sensing applications, a prevalent mechanism is to fine-tune an existing model pre-trained on a large-scale natural image dataset, such as ImageNet. However, this scheme does not achieve satisfactory performance for SAR applications because of the prominent discrepancy between SAR and optical images. In this paper, we discuss three issues that have seldom been studied in detail: (1) what network and source tasks are better to transfer to SAR targets, (2) in which layer the transferred features are more generic to SAR targets, and (3) how to transfer effectively to SAR target recognition. Based on the analysis, a transitive transfer method via multi-source data with domain adaptation is proposed to decrease the discrepancy between the source data and SAR targets. Several experiments are conducted on OpenSARShip. The results indicate that the universal conclusions about transfer learning in natural images cannot be completely applied to SAR targets, and that the analysis of what and where to transfer in SAR target recognition helps to decide how to transfer more effectively. Copyright © 2019, The Authors. All rights reserved.

Keywords: Deep neural networks

A generalization theory based on independent and task-identically distributed assumption

arXiv 2019

Authors: Zheng, Guanhua; Sang, Jitao; Li, Houqiang; Yu, Jian; Xu, Changsheng. Affiliations: University of Science and Technology of China; School of Computer and Information Technology, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China; Chinese Academy of Sciences Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Hefei 230026, China; National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing 100190, China; University of Chinese Academy of Sciences

Existing generalization theories analyze generalization performance mainly on the basis of model complexity and the training process. Because they ignore task properties, a consequence of the widely used IID assumption, these theories fail to interpret many generalization phenomena or to guide practical learning tasks. In this paper, we propose a new Independent and Task-Identically Distributed (ITID) assumption, which incorporates task properties into the data-generating process. The generalization bound derived under the ITID assumption identifies the significance of hypothesis invariance in guaranteeing generalization performance. Based on the new bound, we introduce a practical invariance enhancement algorithm from the perspective of modifying data distributions. Finally, we verify the algorithm and theorems on image classification tasks with both toy and real-world datasets. The experimental results demonstrate the reasonableness of the ITID assumption and the effectiveness of the new generalization theory in improving practical generalization performance. Copyright © 2019, The Authors. All rights reserved.

Keywords: Classification (of information)

Learning for video compression

arXiv 2018

Authors: Chen, Zhibo; He, Tianyu; Jin, Xin; Wu, Feng. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

One key challenge for learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper we propose the concept of PixelMotionCNN (PMCNN), which includes motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of PMCNN, we further explore a learning-based framework for video compression with additional components such as iterative analysis/synthesis and binarization. Experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve results comparable to the H.264 codec. The proposed learning-based scheme provides a possible new direction for further improving the compression efficiency and functionalities of future video coding. Copyright © 2018, The Authors. All rights reserved.

Keywords: Image compression

Aircraft Detection in Sar Images Using Saliency Based Location Regression Network

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Wenhui Diao, Fangzheng Dou, Kun Fu, Xian Sun. Affiliation: Key Laboratory of Spatial Information Processing and Application System Technology, Chinese Academy of Sciences, Beijing, China

In this paper, a novel framework is proposed for aircraft detection in apron areas of high-resolution Synthetic Aperture Radar (SAR) images. It combines the strength of a location-regression-based convolutional neural network (CNN) framework with the salient features of targets in SAR images. Specifically, a Constant False Alarm Rate (CFAR) based target pre-locating algorithm is introduced, which matches the scale of targets in SAR images more accurately than the existing region proposal method. In addition, to mitigate overfitting, we explore several strategies for SAR data augmentation, including translation, adding noise, and rotation within a small range. Experiments are conducted on a data set acquired by the TerraSAR-X satellite at a resolution of 3.0 meters. The results show that the proposed detection framework yields more accurate detection results.

Keywords: Synthetic aperture radar; Aircraft; Object detection; Training; Image resolution; Radar imaging; Atmospheric modeling
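
To make the CFAR idea mentioned in the abstract above concrete, here is a minimal one-dimensional sketch in Python. The abstract does not say which CFAR variant the authors use, so this assumes the textbook cell-averaging form with exponentially distributed noise power. The function name `ca_cfar` and its parameters are hypothetical, and a SAR implementation would slide a two-dimensional window over the image rather than a line.

```python
from typing import List


def ca_cfar(power: List[float], num_train: int, num_guard: int, pfa: float) -> List[int]:
    """Return indices of cells whose power exceeds the cell-averaging CFAR threshold.

    num_train and num_guard are counted per side of the cell under test.
    """
    total_train = 2 * num_train
    # Threshold scaling for cell-averaging CFAR, assuming exponentially
    # distributed (square-law detected) noise power.
    alpha = total_train * (pfa ** (-1.0 / total_train) - 1.0)
    half = num_train + num_guard
    detections = []
    for i in range(half, len(power) - half):
        # Training cells on each side, skipping the guard cells next to cell i.
        leading = power[i - half:i - num_guard]
        trailing = power[i + num_guard + 1:i + half + 1]
        noise = (sum(leading) + sum(trailing)) / total_train
        if power[i] > alpha * noise:
            detections.append(i)
    return detections
```

Because the threshold scales with the locally estimated noise level, a bright return is flagged only when it stands out from its own neighbourhood, which is what keeps the false-alarm rate roughly constant across regions of different clutter strength.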

Inference of Urban Function Zone Based on Deep Neural Network

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Ankai Hou, Mingcang Zhu, Pengshan Li, Yong He, Xiaobo Zhang, Jibao Shi, Kai Chen, Tao Weng, Zezhong Zheng, Guoqing Zhou. Affiliations: School of Resources and Environment, University of Electronic Science and Technology of China, Chengdu, Sichuan, PRC; Chengdu, Sichuan, PRC; Chengdu Land Planning and Cadastre Center, Chengdu, Sichuan, PRC; Sichuan Research Institute for Eco-system Restoration & Geo-disaster Prevention, Chengdu, Sichuan, PRC; Chengdu Institute of Survey & Investigation, Chengdu, Sichuan, PRC; Guangxi Key Laboratory for Spatial Information and Geomatics, Guilin University of Technology, Guilin, Guangxi, PRC

ISBN: (digital) 9781728163741

ISBN: (print) 9781728163758

With the rapid development of urbanization, more and more attention has been paid to the structure of urban function zones. Thus, it is of great significance to investigate urban function zones. In this paper, we introduce a deep neural network (DNN) to infer urban function zones with a supervised classification approach, taking the city of Shenzhen, China, as a case study. First, the urban road networks of Shenzhen were gathered and selected appropriately, and the fifth-level road networks were used to segment the study region. Second, communication data from different times and points of interest (POI) were collected, and fifteen factors influencing urban function zones were derived. In addition, the urban function zones were divided into five types, and labeled examples with the fifteen influencing factors were chosen. Third, the labeled examples were used to train DNNs with different numbers of hidden layers, which were compared with random forest (RF) and support vector machine (SVM) models. The models were trained with five-fold cross validation, and the average training accuracy over the five runs was taken as the accuracy of each model. Finally, the accuracies were compared. The results show that the DNN was the best model and achieved the highest accuracy. Therefore, the proposed method is an efficient approach to inferring urban function zones.

Keywords: Support vector machines; Radio frequency; Urban areas; Social networking (online); Roads; Random forests; Deep learning
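
The five-fold cross-validation step in the abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: `k_fold_accuracy` and `nearest_centroid_score` are hypothetical names, the nearest-centroid model merely stands in for the DNN, RF and SVM models, and the sketch averages accuracy on the held-out fold, which is the usual convention and may differ from the "training accuracy" the abstract mentions.

```python
import random
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def k_fold_accuracy(
    samples: List[Sample],
    k: int,
    train_and_score: Callable[[List[Sample], List[Sample]], float],
    seed: int = 0,
) -> float:
    """Average the score of `train_and_score` over k train/held-out splits."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [samples[i] for i in indices if i not in held]
        test = [samples[i] for i in held_out]
        scores.append(train_and_score(train, test))
    return sum(scores) / k


def nearest_centroid_score(train: List[Sample], test: List[Sample]) -> float:
    """Stand-in model: classify each test sample by its closest class centroid."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(features: Sequence[float]) -> int:
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(features, centroids[lab])),
        )

    return sum(predict(f) == y for f, y in test) / len(test)
```

With fifteen-factor feature vectors and five zone types, the call would look like `k_fold_accuracy(samples, 5, nearest_centroid_score)`, and swapping in another `train_and_score` function compares models on identical splits.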

Ship Detection with Sar Based on Yolo

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Shaobin Jiang, Mingcang Zhu, Yong He, Zezhong Zheng, Fangrong Zhou, Guoqing Zhou. Affiliations: School of Resources and Environment, University of Electronic Science and Technology of China, Chengdu, Sichuan, PRC; Chengdu, Sichuan, PRC; Sichuan Research Institute for Eco-system Restoration & Geo-disaster Prevention, Chengdu, Sichuan, PRC; Guangxi Key Laboratory for Spatial Information and Geomatics, Guilin University of Technology, Guilin, Guangxi, PRC; Electric Power Research Institute, Yunnan Power Grid Co. Ltd., Kunming, Yunnan, PRC

ISBN: (digital) 9781728163741

ISBN: (print) 9781728163758

Synthetic aperture radar (SAR) allows all-weather, day-and-night surveillance. Thus, it is of great significance for ship detection and recognition. Because of the special imaging mechanism of SAR, it is very difficult for traditional target detection algorithms to extract ship features from SAR images. In this paper, we propose an approach composed of the you only look once (YOLO) algorithm, a sliding-window detection strategy, and a clustering algorithm. First, SAR images from GaoFen-3 and a training dataset are gathered. Second, experiments on the size of the ship detection frame are carried out to find the optimum frame size for training the model. Third, the ships are detected initially with YOLO v3 and the fast region-based convolutional neural network (Fast-RCNN). Finally, the detected ships are clustered adaptively, and the experimental results of YOLO v3 and Fast-RCNN are compared and discussed at length. Our experimental results demonstrate that our method outperforms Fast-RCNN in detecting ships on the sea surface in low-resolution wide-band SAR images. Therefore, our approach is a robust method for detecting ships on the sea surface in SAR images.

Keywords: Marine vehicles; Synthetic aperture radar; Radar polarimetry; Training; Feature extraction; Object detection; Deep learning
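
The sliding-window strategy named in the abstract above can be sketched as a tile generator in Python. The abstract gives no window size, stride or edge handling, so the function name `sliding_windows`, its parameters and the choice to add a final tile flush with the image border are all assumptions made for illustration. Each tile would then be passed to the detector separately.

```python
from typing import Iterator, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def sliding_windows(width: int, height: int, window: int, stride: int) -> Iterator[Box]:
    """Yield square tiles of side `window` that together cover the whole image."""

    def starts(length: int) -> List[int]:
        if length <= window:
            return [0]  # the image is smaller than one window along this axis
        positions = list(range(0, length - window + 1, stride))
        if positions[-1] != length - window:
            positions.append(length - window)  # extra tile flush with the border
        return positions

    for top in starts(height):
        for left in starts(width):
            yield (left, top, min(left + window, width), min(top + window, height))
```

When the stride is smaller than the window, neighbouring tiles overlap, so a ship near a tile border can be reported more than once. That is presumably one reason a grouping step such as the clustering mentioned in the abstract is applied after detection.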