The data available in the world come in various modalities, such as audio, text, image, and video. Each data modality has different statistical properties. Understanding each modality individually, as well as the relationships between modalities, is vital for a better understanding of the environment surrounding us. Multimodal learning models allow us to process and extract useful information from multimodal sources. For instance, image captioning and text-to-image synthesis are examples of multimodal learning, as both require a mapping between text and images. In this paper, we introduce a research area that has never been explored by the remote sensing community, namely the synthesis of remote sensing images from text descriptions. More specifically, we focus on exploiting ancient text descriptions of geographical areas, inherited from previous civilizations, to generate equivalent remote sensing images. From a methodological perspective, we propose to rely on generative adversarial networks (GANs) to convert the text descriptions into equivalent pixel values. GANs are a recently proposed class of generative models that formulate learning the distribution of a given dataset as an adversarial competition between two networks. The learned distribution is represented by the weights of a deep neural network and can be used to generate further samples. To fulfill the purpose of this paper, we collected satellite images and ancient texts to train the network. We present the results obtained and propose various future research paths that we believe are important to further develop this new research area.
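As a rough illustration of the text-conditioned adversarial setup described above, the sketch below pairs a generator and discriminator that both receive a text embedding. All names, layer sizes, and dimensions are illustrative assumptions, not the paper's architecture.

```python
# Hypothetical sketch of a text-conditioned GAN, assuming the description has
# already been encoded into a fixed-size vector (sizes are assumptions).
import torch
import torch.nn as nn

TEXT_DIM, NOISE_DIM, IMG_PIXELS = 128, 100, 64 * 64 * 3

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Maps a noise vector concatenated with the text embedding to pixel values.
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM + TEXT_DIM, 512), nn.ReLU(),
            nn.Linear(512, IMG_PIXELS), nn.Tanh(),
        )

    def forward(self, z, text_emb):
        return self.net(torch.cat([z, text_emb], dim=1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Scores an image/text pair as real or generated.
        self.net = nn.Sequential(
            nn.Linear(IMG_PIXELS + TEXT_DIM, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 1), nn.Sigmoid(),
        )

    def forward(self, img, text_emb):
        return self.net(torch.cat([img, text_emb], dim=1))

# One adversarial step: the discriminator separates real from generated images,
# the generator tries to fool it, both conditioned on the same text embedding.
G, D = Generator(), Discriminator()
bce = nn.BCELoss()
real_img = torch.rand(8, IMG_PIXELS)   # placeholder satellite images
text_emb = torch.rand(8, TEXT_DIM)     # placeholder encoded descriptions
z = torch.randn(8, NOISE_DIM)
fake_img = G(z, text_emb)

d_loss = bce(D(real_img, text_emb), torch.ones(8, 1)) + \
         bce(D(fake_img.detach(), text_emb), torch.zeros(8, 1))
g_loss = bce(D(fake_img, text_emb), torch.ones(8, 1))
```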
ISBN (print): 9783030276843; 9783030276836
The recent success of Generative Adversarial Networks (GANs) is a result of their ability to generate high-quality images given samples from a latent space. One application of GANs is generating images from a text description, where the text is first encoded and then used for conditioning in the generative model. In addition to text, conditional generative models often use label information for conditioning. Hence, the structure of the metadata and the ontology of the labels are important for such models. In this paper, we propose Ontology Generative Adversarial Networks (O-GANs) to handle the complexities of data with a label ontology. We evaluate our model on a dataset of fashion images with a hierarchical label structure. Our results suggest that incorporating the ontology leads to better image quality as measured by the Fréchet Inception Distance and the Inception Score. Additionally, we show that the O-GAN better matches the generated images to their conditioning text, compared to models that do not incorporate the label ontology.
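For reference, a minimal sketch of the Fréchet Inception Distance used in the evaluation above; `real_feats` and `fake_feats` are assumed to be Inception-network activations of shape (num_images, feature_dim).

```python
# Minimal FID sketch: distance between two Gaussians fitted to feature sets.
import numpy as np
from scipy import linalg

def frechet_inception_distance(real_feats, fake_feats):
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    # Matrix square root of the covariance product; drop tiny imaginary parts.
    covmean = linalg.sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_r - mu_f
    return diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean)
```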
ISBN (print): 9781538691540
Text-to-image synthesis is a research topic that has not yet been addressed by the remote sensing community. It consists in learning a mapping from a text description to image pixels. In this paper, we propose to address this topic for the very first time. More specifically, our objective is to convert ancient text descriptions of geographic areas written by past explorers into equivalent remote sensing images. To this end, we rely on generative adversarial networks (GANs) to learn the mapping. GANs aim to represent the distribution of a dataset through the weights of a deep neural network, which are trained via an adversarial competition between two networks. We collected ancient texts dating back to 7 BC to train our network and obtained interesting results, which form the basis for the future research directions we highlight to advance this new topic.
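For reference, the adversarial competition mentioned above is usually written as a minimax objective. The conditional form below, with a text embedding t fed to both networks, is an assumption about how the conditioning enters; the unconditional objective simply drops t.

```latex
\min_G \max_D \;\;
\mathbb{E}_{(x,t)\sim p_{\mathrm{data}}}\big[\log D(x, t)\big]
+ \mathbb{E}_{z\sim p_z,\; t\sim p_{\mathrm{data}}}\big[\log\big(1 - D(G(z, t), t)\big)\big]
```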
ISBN (print): 9781538662496
Advances in text-to-image synthesis have produced remarkable images from textual descriptions. However, these methods are designed to generate only one object with varying attributes. They face difficulties with complex descriptions containing multiple arbitrary objects, since these require information on the placement and size of each object in the image. Recently, a method that infers object layouts from scene graphs has been proposed as a solution to this problem. However, that method describes the layout using only object labels, which fail to capture the appearance of some objects. Moreover, the model is biased towards generating rectangular-shaped objects in the absence of ground-truth masks. In this paper, we propose an object encoding module to capture object features and use it as additional information for the image generation network. We also introduce a graph-cuts-based segmentation method that can infer the masks of objects from bounding boxes to better model object shapes. Our method produces more discernible images with more realistic shapes compared to the images generated by the current state-of-the-art method.
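As one plausible realization of inferring an object mask from a bounding box with graph cuts, the sketch below uses OpenCV's GrabCut as a stand-in for the segmentation step; the paper's exact procedure may differ.

```python
# Derive a foreground mask from a bounding box via GrabCut (graph-cuts based).
import numpy as np
import cv2

def mask_from_bbox(image_bgr, bbox):
    """bbox is (x, y, width, height) in pixel coordinates."""
    mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
    bgd_model = np.zeros((1, 65), dtype=np.float64)
    fgd_model = np.zeros((1, 65), dtype=np.float64)
    # Pixels outside the box are background, inside are probable foreground.
    cv2.grabCut(image_bgr, mask, bbox, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)
    # Keep definite and probable foreground as the object mask.
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD),
                    1, 0).astype(np.uint8)
```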
ISBN (print): 9781450376273
Generating a sequence of images from a multi-sentence paragraph is a recently proposed task called story visualization. Unlike other single-image generation tasks, it requires keeping global consistency across dynamic scenes and characters throughout the story flow, which is a significant challenge. However, the visual quality and semantic relevance of existing results are not satisfactory on datasets with high semantic complexity, such as the Pororo-SV cartoon dataset. To address this issue, we propose a new story visualization model named PororoGAN, which jointly considers story-to-image-sequence, sentence-to-image, and word-to-image-patch alignment. In particular, we introduce an aligned sentence encoder (ASE) and an attentional word encoder (AWE) to improve global and local relevance, respectively. Additionally, we add an image-patch discriminator to improve the realism of the results. Both quantitative and qualitative studies show that PororoGAN outperforms the state-of-the-art models.
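A minimal sketch of a patch-level discriminator, as one way an image-patch discriminator can be realized (PatchGAN-style); the layer sizes are illustrative assumptions, not the paper's architecture.

```python
# Patch discriminator: outputs a grid of scores, one per local image patch.
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    def __init__(self, in_channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            # One score per spatial location, i.e. per receptive-field patch.
            nn.Conv2d(128, 1, 4, stride=1, padding=1),
        )

    def forward(self, x):
        return self.net(x)  # (batch, 1, H', W') grid of patch scores

scores = PatchDiscriminator()(torch.randn(2, 3, 64, 64))
print(scores.shape)  # each cell judges one local patch as real or fake
```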
ISBN (digital): 9781728121901
ISBN (print): 9781728121918
In this work, we attempt to address the issue of developing a sophisticated text encoder for the retro-remote sensing application. The encoder converts ancient landscape descriptions into a fixed-size vector that adequately represents the available information. This vector is then used as conditioning data for a generative adversarial network (GAN) that synthesizes the equivalent image. We propose using a pre-trained Doc2Vec encoder for text encoding and train a Wasserstein GAN (a GAN variant) to convert landscape descriptions written by travelers and geographers into the equivalent image. Qualitative and quantitative analysis of the generated images indicates the usefulness of the proposed method.
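A hedged sketch of the text-encoding step: a Doc2Vec model (here via gensim) turns a landscape description into a fixed-size vector that could condition the Wasserstein GAN generator. The example corpus, vector size, and training settings are illustrative assumptions, not the paper's setup.

```python
# Encode free-text landscape descriptions into fixed-size conditioning vectors.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [
    "a fertile plain crossed by a wide river with scattered villages",
    "rocky hills descending toward a narrow harbour and olive groves",
]
tagged = [TaggedDocument(words=text.split(), tags=[i])
          for i, text in enumerate(corpus)]

encoder = Doc2Vec(tagged, vector_size=128, min_count=1, epochs=50)

# Fixed-size conditioning vector for a new ancient description.
description = "a walled city on a hill overlooking marshland and the sea"
cond_vector = encoder.infer_vector(description.split())  # shape: (128,)
# cond_vector would be concatenated with the noise input of the WGAN generator.
```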
Generating accurate high-resolution images from text representations is a difficult problem in computer vision with a wide range of practical applications. Text-to-image conversion is not unlike the difficulties inherent in language processing: just as the same meaning can be encoded in two distinct human languages, photographs and text are two distinct encodings of similar data. These remain distinct problems, however, since text-to-image and image-to-text conversions are highly multimodal in nature. This article discusses a proposed model for creating realistic 256 × 256 images from Arabic text descriptions. The relationship between an Arabic word in a sentence and its corresponding component in a picture is introduced in this paper using the DAMSM model. This model trains two neural networks to map image sub-regions and the words of a full Arabic sentence into a shared semantic space, and it performs well as both an Arabic text encoder and an image encoder. We start with the Modified-Arabic dataset and train the model from scratch. The proposed model establishes a new standard for the conversion of Arabic text to realistic pictures. A notable change arises when Arabic is used as the primary language for converting texts to real images. The Inception Score of the newly introduced model is 3.42 ± 0.05 on the CUB dataset.
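For reference, a minimal sketch of the Inception Score cited above; `probs` is assumed to be the (num_images, num_classes) softmax output of an Inception classifier on the generated images.

```python
# Inception Score: exp of the mean KL divergence between per-image class
# predictions and the marginal class distribution over all generated images.
import numpy as np

def inception_score(probs, eps=1e-12):
    p_y = probs.mean(axis=0, keepdims=True)  # marginal class distribution
    kl = np.sum(probs * (np.log(probs + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))
```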