Author affiliations:
- Beijing Institute of Technology, Beijing Key Laboratory of Embedded Real-Time Information Processing, School of Information and Electronics, Beijing 100081, People's Republic of China
- German Aerospace Center (DLR), Remote Sensing Technology Institute (IMF), D-82234 Wessling, Germany
- Université Grenoble Alpes, GIPSA-Lab, CNRS, Grenoble INP, F-38000 Grenoble, France
- Université Grenoble Alpes, Grenoble INP, CNRS, Laboratoire Jean Kuntzmann (LJK), Inria, F-38000 Grenoble, France
- Helmholtz-Zentrum Dresden-Rossendorf, Machine Learning Group, Helmholtz Institute Freiberg for Resource Technology, D-09599 Freiberg, Germany
- Institute of Advanced Research in Artificial Intelligence (IARAI), A-1030 Vienna, Austria
Publication: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING (IEEE Trans Geosci Remote Sens)
Year/Volume: 2022, Vol. 60
Pages: 1
Core indexing:
Subject classification: 0808 [Engineering - Electrical Engineering]; 1002 [Medicine - Clinical Medicine]; 08 [Engineering]; 0708 [Science - Geophysics]; 0816 [Engineering - Surveying and Mapping Science and Technology]
Funding: MIAI@Grenoble Alpes [ANR-19-P3IA-0003]
Keywords: Remote sensing; Time series analysis; Task analysis; Synthetic aperture radar; Semantics; Optical sensors; Optical imaging; Feature mask module; image time series; inherent ambiguities; modality translation; remote sensing
Abstract: Modality translation, which aims to translate images from a source modality to a target one, has recently attracted growing interest in the field of remote sensing. Compared with translation problems in multimedia applications, modality translation in remote sensing often suffers from inherent ambiguities, i.e., a single input image could correspond to multiple possible outputs, and the results may not be valid for subsequent image interpretation tasks, such as classification and change detection. To address these issues, we attempt to utilize time-series data to resolve the ambiguities. We propose a novel multimodality image translation framework that exploits temporal information from two aspects: 1) by introducing a guidance image from given temporally neighboring images in the target modality, we employ a feature mask module and transfer semantic information from temporal images to the output without requiring any semantic labels; and 2) while incorporating multiple pairs of images in the time series, a temporal constraint is formulated during the learning process to guarantee the uniqueness of the prediction result. We also build a multimodal and multitemporal dataset containing synthetic aperture radar (SAR), visible, and shortwave infrared (SWIR) image time series of the same scene to encourage and promote research on modality translation in remote sensing. Experiments are conducted on the dataset for two cross-modality translation tasks (SAR to visible and visible to SWIR). Both qualitative and quantitative results demonstrate the effectiveness and superiority of the proposed model.
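The abstract's second point (a temporal constraint that guarantees a unique prediction) is not spelled out in this record. A minimal NumPy sketch of one plausible formulation — penalizing disagreement among translations predicted from different temporal image pairs — might look like this; the function name and the exact loss form are assumptions, not the paper's method:

```python
import numpy as np

def temporal_consistency_loss(predictions):
    """Mean squared deviation of each prediction from the per-pixel mean
    across temporal pairs; zero only when all predictions agree.
    Hypothetical formulation, not the paper's exact constraint."""
    preds = np.stack(predictions)                 # shape: (T, H, W)
    mean_pred = preds.mean(axis=0, keepdims=True) # consensus prediction
    return float(np.mean((preds - mean_pred) ** 2))

# Identical predictions from two temporal pairs incur no penalty.
a = np.ones((4, 4))
print(temporal_consistency_loss([a, a]))  # 0.0
# Disagreeing predictions are penalized, pushing the model toward a
# single consistent output across the time series.
print(temporal_consistency_loss([np.zeros((4, 4)), np.ones((4, 4))]))  # 0.25
```

Minimizing such a term alongside the translation loss would drive the per-pair predictions toward a common output, which is one way to realize the "uniqueness" property the abstract describes.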