Search results - Inner Mongolia University Library

End-to-End Facial Image Compression with Integrated Semantic Distortion Metric

IEEE Visual Communications and Image Processing (VCIP)

Authors: Tianyu He; Zhibo Chen. Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781538644591; 9781538644584

Highly efficient facial image compression is broadly required and challenging in surveillance and security scenarios, yet both traditional general-purpose image codecs and specialized facial image compression schemes only refine the codec heuristically and separately according to a face verification accuracy metric. We propose an End-to-End Facial Image Compression (E2EFIC) framework with a novel variable-block-size Regionally Adaptive Pooling (RAP) module whose parameters can be automatically optimized according to gradient feedback from an integrated semantic distortion metric, including a successful exploration of applying a Generative Adversarial Network (GAN) directly as a metric in an image compression scheme. The experimental results verify the framework's efficiency by demonstrating bitrate savings of 71.41%, 48.28% and 52.67% over JPEG2000, WebP and neural network-based codecs, respectively, under the same face verification accuracy distortion metric. We also evaluate E2EFIC's superior performance gain over the latest specialized facial image codecs.

Keywords: Image coding; Distortion; Semantics; Face; Bit rate; Codecs

A CNN-Based In-Loop Filter with CU Classification for HEVC

IEEE Visual Communications and Image Processing (VCIP)

Authors: Yuanying Dai; Dong Liu; Zheng-Jun Zha; Feng Wu. Affiliations: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN: (print) 9781538644591; 9781538644584

Lossy compression of images and video yields visually annoying artifacts, including blocking, blurring, and ringing, especially at low bit rates. In-loop filtering techniques can reduce these artifacts, improve quality, and achieve coding gain accordingly. In this paper, we present a convolutional neural network (CNN) based in-loop filter for High Efficiency Video Coding (HEVC). First, we design a new CNN structure, namely VRCNN-ext, that is composed of multiple Variable-filter-size Residue-learning blocks for artifact reduction. VRCNN-ext is trained on natural images as well as their compressed versions at different quality levels. Second, we investigate a new in-loop filter based on the trained VRCNN-ext models. Specifically, we observed that using VRCNN-ext directly on the inter pictures is not effective. To solve this problem, we further train a classifier to decide whether to use VRCNN-ext for each coding unit (CU). The classifier makes its decision based on the compressed information, thus avoiding the overhead bits otherwise needed to switch the CNN-based filter on and off at the CU level. Experimental results show that our scheme achieves significant bit savings over the HEVC anchor, leading to on average 9.2%, 9.6% and 7.4% BD-rate reduction on the HEVC test sequences under all-intra, low-delay B and random-access configurations, respectively.

Keywords: Decoding; Encoding; Training; Image coding; Feature extraction; Copper; Video coding
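
The CU-level on/off decision described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `use_cnn` and `cnn_filter` are hypothetical stand-ins (a variance threshold and a block-mean smoother) for the trained classifier and the VRCNN-ext model, and the 4x4 CU size is chosen only to keep the toy frame small.

```python
import numpy as np

def filter_frame(frame, cu_size, use_cnn, cnn_filter):
    """Apply cnn_filter only to the coding units for which use_cnn returns True."""
    out = frame.copy()
    height, width = frame.shape
    for y in range(0, height, cu_size):
        for x in range(0, width, cu_size):
            cu = frame[y:y + cu_size, x:x + cu_size]
            # The decision uses only data available after reconstruction,
            # so no on/off flag has to be written to the bitstream.
            if use_cnn(cu):
                out[y:y + cu_size, x:x + cu_size] = cnn_filter(cu)
    return out

# Hypothetical stand-ins (assumptions, not from the paper):
use_cnn = lambda cu: cu.var() > 0.01                   # "classifier": filter busy blocks only
cnn_filter = lambda cu: np.full_like(cu, cu.mean())    # "filter": replace block by its mean

# Toy frame: a flat 4x4 CU on the left, a striped 4x4 CU on the right.
frame = np.zeros((4, 8))
frame[:, 4:] = np.tile([0.0, 1.0], (4, 2))

out = filter_frame(frame, cu_size=4, use_cnn=use_cnn, cnn_filter=cnn_filter)
print(out)
```

Because the decision is derived from reconstructed data alone, a decoder running the same `use_cnn` would reach the same per-CU choice, which is the property the abstract credits with avoiding overhead bits.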

Classification of Large-Scale High-Resolution SAR Images with Deep Transfer Learning

TechRxiv, 2019

Authors: Huang, Zhongling; Dumitru, Corneliu Octavian; Pan, Zongxu; Lei, Bin; Datcu, Mihai. Affiliations: The Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Huairou District, Beijing 101408, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing, China; Wessling 82234, Germany

The classification of large-scale high-resolution SAR land cover images acquired by satellites is a challenging task that faces several difficulties, such as semantic annotation requiring expertise, data characteristics that change with imaging parameters or regional target area differences, and complex scattering mechanisms that differ from optical imaging. Given a large-scale SAR land cover dataset collected from TerraSAR-X images, with a hierarchical three-level annotation of 150 categories and more than 100,000 patches, three main challenges in automatically interpreting SAR images are addressed: highly imbalanced classes, geographic diversity, and label noise. In this letter, a deep transfer learning method is proposed based on a similarly annotated optical land cover dataset (NWPU-RESISC45). In addition, a top-2 smooth loss function with cost-sensitive parameters is introduced to tackle the problems of label noise and imbalanced classes. The proposed method shows high efficiency in transferring information from a similarly annotated remote sensing dataset, performs robustly on highly imbalanced classes, and alleviates the over-fitting problem caused by label noise. Moreover, the learned deep model generalizes well to other SAR-specific tasks, such as MSTAR target recognition, with a state-of-the-art classification accuracy of 99.46%. © 2019, CC BY.

Keywords: Synthetic aperture radar

An end-to-end foreground-aware network for person re-identification

arXiv, 2019

Authors: Liu, Yiheng; Zhou, Wengang; Liu, Jianzhuang; Qi, Guojun; Tian, Qi; Li, Houqiang. Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; Noah’s Ark Lab, Huawei Technologies Company Limited, Shenzhen 518129, China; Huawei Cloud EI Product Department, Cloud & AI, Huawei Technologies

Person re-identification is the crucial task of identifying pedestrians of interest across multiple surveillance camera views. For person re-identification, a pedestrian is usually represented with features extracted from a rectangular image region that inevitably contains the scene background, which introduces ambiguity in distinguishing different pedestrians and degrades accuracy. Thus, we propose an end-to-end foreground-aware network that discriminates the foreground from the background by learning a soft mask for person re-identification. In our method, in addition to the pedestrian ID as supervision for the foreground, we introduce the camera ID of each pedestrian image for background modeling. The foreground branch and the background branch are optimized collaboratively. By introducing a target attention loss, the pedestrian features extracted from the foreground branch become less sensitive to backgrounds, which greatly reduces the negative impact of changing backgrounds on pedestrian matching across different camera views. Notably, in contrast to existing methods, our approach does not require an additional dataset to train a human landmark detector or a segmentation model for locating the background regions. The experimental results on three challenging datasets, i.e., Market-1501, DukeMTMC-reID, and MSMT17, demonstrate the effectiveness of our approach. Copyright © 2019, The Authors. All rights reserved.

Keywords: Cameras

What, Where and How to Transfer in SAR Target Recognition Based on Deep CNNs

arXiv, 2019

Authors: Huang, Zhongling; Pan, Zongxu; Lei, Bin. Affiliations: School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Huairou District, Beijing 101408, China; Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China; Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China

Deep convolutional neural networks (DCNNs) have attracted much attention in remote sensing recently. Compared with the large-scale annotated datasets available for natural images, the lack of labeled data in remote sensing is an obstacle to training deep networks well, especially in SAR image interpretation. Transfer learning provides an effective way to solve this problem by borrowing knowledge from the source task for the target task. In optical remote sensing applications, a prevalent mechanism is to fine-tune an existing model pre-trained on a large-scale natural image dataset, such as ImageNet. However, this scheme does not achieve satisfactory performance for SAR applications because of the prominent discrepancy between SAR and optical images. In this paper, we discuss three issues that have seldom been studied in detail: (1) which networks and source tasks are better to transfer to SAR targets, (2) in which layers the transferred features are more generic to SAR targets, and (3) how to transfer effectively to SAR target recognition. Based on the analysis, a transitive transfer method via multi-source data with domain adaptation is proposed to decrease the discrepancy between the source data and SAR targets. Several experiments are conducted on OpenSARShip. The results indicate that the universal conclusions about transfer learning in natural images cannot be completely applied to SAR targets, and that the analysis of what and where to transfer in SAR target recognition helps to decide how to transfer more effectively. Copyright © 2019, The Authors. All rights reserved.

Keywords: Deep neural networks

A generalization theory based on independent and task-identically distributed assumption

arXiv, 2019

Authors: Zheng, Guanhua; Sang, Jitao; Li, Houqiang; Yu, Jian; Xu, Changsheng. Affiliations: University of Science and Technology of China; School of Computer and Information Technology, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China; Chinese Academy of Sciences Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Hefei 230026, China; National Lab of Pattern Recognition, Institute of Automation, CAS, Beijing 100190, China; University of Chinese Academy of Sciences

Existing generalization theories analyze generalization performance mainly on the basis of model complexity and the training process. Because the widely used IID assumption ignores task properties, these theories fail to interpret many generalization phenomena or to guide practical learning tasks. In this paper, we propose a new Independent and Task-Identically Distributed (ITID) assumption that incorporates task properties into the data generating process. The generalization bound derived from the ITID assumption identifies the significance of hypothesis invariance in guaranteeing generalization performance. Based on the new bound, we introduce a practical invariance enhancement algorithm from the perspective of modifying data distributions. Finally, we verify the algorithm and theorems in the context of image classification on both toy and real-world datasets. The experimental results demonstrate the reasonableness of the ITID assumption and the effectiveness of the new generalization theory in improving practical generalization performance. Copyright © 2019, The Authors. All rights reserved.

Keywords: Classification (of information)

Learning for video compression

arXiv, 2018

Authors: Chen, Zhibo; He, Tianyu; Jin, Xin; Wu, Feng. Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

One key challenge to learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper we propose the concept of PixelMotionCNN (PMCNN) which includes motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of PMCNN, we further explore a learning-based framework for video compression with additional components of iterative analysis/synthesis, binarization, etc. Experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve comparable results with H.264 codec. The proposed learning-based scheme provides a possible new direction to further improve compression efficiency and functionalities of future video coding. Copyright © 2018, The Authors. All rights reserved.

Keywords: Image compression

Spatiotemporal dynamics of coastal dead zones in the Gulf of Mexico over 20 years using remote sensing

Science of the Total Environment, 2025, vol. 979

Authors: Li, Yingjie; Xia, Zilong; Nguyen, Lan; Wan, Ho Yi; Wan, Luwen; Wang, Mengqiu; Jia, Nan; Matli, Venkata Rohith Reddy; Li, Yi; Seeley, Megan; Moran, Emilio F.; Liu, Jianguo. Affiliations: Center for Systems Integration and Sustainability, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI 48823, United States; Environmental Science and Policy Program, Michigan State University, East Lansing, MI 48823, United States; Natural Capital Project, Woods Institute for the Environment, Doerr School of Sustainability, Stanford University, Stanford, CA 94305, United States; Jiangsu Provincial Key Laboratory of Geographic Information Science and Technology, Key Laboratory for Land Satellite Remote Sensing Applications of Ministry of Natural Resources, School of Geography and Ocean Science, Nanjing University, Jiangsu, Nanjing 210023, China; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Jiangsu, Nanjing 210023, China; Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada; Department of Wildlife, California State Polytechnic University, Humboldt, Arcata, CA 95521, United States; Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, FL 32611, United States; Department of Earth System Science, Stanford University, Stanford, CA 94305, United States; Earth and Environmental Sciences, Michigan State University, East Lansing, MI 48824, United States; School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430072, China; Department of Earth Sciences, The University of Hong Kong, Hong Kong 999077, China; Center for Geospatial Analytics, North Carolina State University, Raleigh, NC 27607, United States; College of the Environment and Ecology, Xiamen University, Xiamen 361102, China; School of Geographical Sciences and Urban Planning, Arizona State University, Tempe, AZ 85281, United States; Center for Global Discovery and Conservation Science, Arizona State University, Tempe, AZ 85281, United States; Center for Global Change and Earth Observations, Michigan State University, East Lansing, MI 48824, United States; Department of Geography Environment and Spatial Science

Spreading marine dead zones (or hypoxia) are threatening coastal ecosystems and affecting billions of people's livelihoods globally. However, the lack of field observations makes it challenging to estimate dead zones with spatial precision and across large scales. While satellites offer great potential for detecting environmental changes through large-scale and temporally consistent data, they have yet to be fully integrated into the spatio-temporal dynamic mapping of hypoxia. To address this limitation, we integrated satellite imagery with field observations in random forest models on the Google Earth Engine platform to characterize dead zone dynamics from 2000 to 2019. We applied the workflow to the Gulf of Mexico, which has the largest dead zones in North America. Our model explained 64% (± 5%) of the variance in predicting dead zones using satellite data. The analysis revealed that dead zones in the Gulf peaked in 2009 (17,699 ± 679 km²) and contracted afterward in terms of both size and persistence (% of days with hypoxia). Despite this contraction, the average size between 2010 and 2019 was twice that of the coastal reduction goal (2) set by the Gulf of Mexico Hypoxia Task Force. Furthermore, dead zones occurred more frequently in the western Gulf, and nearly half of the western region experienced dead zones annually. In addition to inter-annual changes, our analysis highlighted the intra-annual dynamics of this phenomenon. Notably, dead zones expanded in June, peaking in size from mid-August to early September. The high temporal and spatial resolution of this dataset allows policymakers to develop targeted management plans and environmental policies. Our approach, which incorporates remote sensing for long-term monitoring of coastal dead zones, can be applied to worldwide monitoring initiatives when paired with local field observations. © 2025 Elsevier B.V.

Keywords: Eutrophication
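
The two summary measures named in the abstract above, dead zone size and persistence (% of days with hypoxia), can be illustrated with a small sketch. It assumes a per-day, per-pixel hypoxia classification is already available; the toy array and the `pixel_area_km2` value are invented for illustration and are not taken from the study.

```python
import numpy as np

# hypoxic[d, p] is True when pixel p is classified as hypoxic on day d.
# Toy values: 4 days x 3 pixels (illustrative only).
hypoxic = np.array([
    [True,  False, False],
    [True,  True,  False],
    [True,  False, False],
    [False, False, False],
])
pixel_area_km2 = 1.0  # assumed area represented by one pixel

# Persistence: share of observed days on which each pixel is hypoxic.
persistence_pct = 100.0 * hypoxic.mean(axis=0)

# Size: total hypoxic area on each day.
daily_area_km2 = hypoxic.sum(axis=1) * pixel_area_km2

print(persistence_pct)  # [75. 25.  0.]
print(daily_area_km2)   # [1. 2. 1. 0.]
```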

Waterline mapping of inland great lake with subpixel accuracy from GF-3 SAR images

6th Asia-Pacific Conference on Synthetic Aperture Radar, APSAR 2019

Authors: Li, Ning; Niu, Shilin; Wang, Robert; Wu, Lin; Guo, Zhengwei. Affiliations: College of Computer and Information Engineering, Henan University, Kaifeng 475004, China; Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng 475004, China; Henan Engineering Research Center of Intelligent Technology and Application, Kaifeng 475004, China; Department of Space Microwave Remote Sensing System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China

ISBN: (print) 9781728129129

High-accuracy waterline mapping with Synthetic Aperture Radar (SAR) images is a challenging task because of the inhomogeneities of SAR imagery caused by speckle noise and complex terrain. This paper presents a novel method for waterline mapping with subpixel accuracy. The proposed method consists of three main steps. First, an improved Non-Local filter is adopted to suppress the speckle noise. Second, a Fuzzy C-Means (FCM) clustering algorithm is used to extract the pixel-level waterline. Third, along the pixel-level waterline, a novel subpixel-scale waterline extraction method based on bicubic convolution and a geometric Active Contour (GAC) model is presented to further improve the accuracy of the waterline. Without loss of generality, the Danjiangkou reservoir (DJKR), which is the largest artificial freshwater lake in Asia, is selected as a case study. SAR images from the Chinese GaoFen-3 (GF-3) satellite operating in different modes are processed to demonstrate the effectiveness of the proposed method. © 2019 IEEE.

Keywords: Synthetic aperture radar
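
The second step of the method in the abstract above, pixel-level waterline extraction with Fuzzy C-Means, might look roughly like the sketch below. It is a textbook FCM on pixel intensities under stated assumptions, not the authors' code: the toy image, the two-cluster setting, the fuzzifier `m=2.0`, and the rule "water pixel with a non-water right neighbor" are all illustrative choices, and the non-local filtering and GAC refinement steps are not shown.

```python
import numpy as np

def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=50):
    """Standard fuzzy C-means on 1-D samples; returns (centers, memberships)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)                 # (N, 1)
    # Deterministic start: centers spread between the min and max intensity.
    centers = np.linspace(x.min(), x.max(), n_clusters).reshape(-1, 1)
    for _ in range(n_iter):
        dist = np.abs(x - centers.T) + 1e-12                      # (N, C)
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)                  # memberships sum to 1
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]              # weighted means
    return centers.ravel(), u

# Toy "SAR" image: dark water on the left, brighter land on the right.
rng = np.random.default_rng(1)
img = np.full((8, 8), 0.8)
img[:, :4] = 0.1
img += rng.normal(0.0, 0.02, img.shape)

centers, u = fuzzy_c_means(img.ravel())
labels = u.argmax(axis=1).reshape(img.shape)
water = labels == centers.argmin()        # the darker cluster is taken as water

# Pixel-level waterline: water pixels whose right-hand neighbor is not water.
edge = water[:, :-1] & ~water[:, 1:]
print(np.argwhere(edge))
```

In this toy case the waterline falls between columns 3 and 4 in every row; a subpixel method such as the one the abstract describes would then refine the position within those boundary pixels.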