Building assessment is a high priority during rescue operations and damage relief after hurricane disasters. Although machine learning has made remarkable progress in building damage classification, the task remains challenging because classifiers must be trained on a massive amount of labeled data. Furthermore, data labeling is labor intensive and costly, and labels are not available right after a disaster. To address this issue, we propose an unsupervised domain adaptation method with aligned discriminative and representative features (ADRF), which leverages a substantial amount of labeled data from relevant disaster scenes for new classification tasks. The remote sensing images of different disasters are collected using different sensors, from different viewpoints, at different times, and even at different locations. Compared with the public datasets used in the domain adaptation community, remote sensing images are more complex, exhibiting lower discrimination between categories and higher diversity within categories. As a result, pursuing domain invariance is a major challenge. To achieve this goal, we build a framework with ADRF that improves the discriminative and representative capability of the extracted features to facilitate the classification task. The ADRF framework consists of three pipelines: a classifier for the labeled data of the source domain and one autoencoder each for the source and target domains. The latent variables of the autoencoders are forced to follow unit Gaussian distributions by minimizing the maximum mean discrepancy (MMD), while the marginal distributions of both domains are also aligned via the MMD. As a case study, two challenging transfer tasks using the Hurricane Sandy, Maria, and Irma datasets are investigated. Experimental results demonstrate that ADRF achieves overall accuracies of 71.6% and 84.1% in the transfer tasks from dataset Sandy to dataset Maria and dataset Irma, respectively.
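The MMD terms mentioned in this abstract can be illustrated with a minimal NumPy sketch. It is not the ADRF implementation: the Gaussian (RBF) kernel, the fixed `bandwidth`, the biased estimator, and the synthetic "latent" samples are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Guard against tiny negative values caused by floating-point error.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic stand-ins for autoencoder latent codes (assumed, not real data).
rng = np.random.default_rng(0)
source_latent = rng.normal(0.0, 1.0, size=(200, 2))
target_latent = rng.normal(0.5, 1.0, size=(200, 2))
unit_gaussian = rng.normal(0.0, 1.0, size=(200, 2))

# Term pushing the source latent codes toward a unit Gaussian prior.
prior_term = mmd_squared(source_latent, unit_gaussian)
# Term aligning the source and target latent distributions.
alignment_term = mmd_squared(source_latent, target_latent)
```

In this toy setup the source codes already follow a unit Gaussian, so `prior_term` is small, whereas the shifted target codes give a larger `alignment_term`; in training, both quantities would be minimized as loss terms.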
Distributed representations, or embeddings, are commonly learned without supervision on very large unannotated corpora for natural language processing. In speech processing, deep network-based representations such as bottlenecks and x-vectors have had some success, but they are limited to supervised or partly supervised settings where annotations are available, and they are not optimized to separate underlying factors. Here, we propose a generative model with deep encoders and decoders that can learn interpretable speech representations without supervision. Our inductive biases operate as prior distributions in a variational autoencoder model and allow us to separate several latent variables along a continuous range of time-scale properties, as opposed to the binary oppositions or hierarchical factorizations that have been previously proposed. On simulated data, we confirm that these biases enable the model to accurately recover the underlying phonetic and speaker factors. On TIMIT and LibriSpeech, they yield representations that separate phonetic and speaker information, as evidenced by unsupervised results on downstream phoneme and speaker classification tasks using a simple k-means classifier.
Given the high computational cost of conventional computational fluid dynamics simulations, machine learning methods have been introduced to fluid dynamics simulations in recent years with the aim of reducing CPU time. In this work, we propose a hybrid deep adversarial autoencoder (VAE-GAN) that integrates a generative adversarial network (GAN) and a variational autoencoder (VAE) for predicting parameterized nonlinear fluid flows in spatial and temporal dimensions. High-dimensional inputs are compressed into low-dimensional representations by nonlinear functions in a convolutional encoder. In this way, the predicted fluid flows reconstructed in a convolutional decoder retain the highly nonlinear and chaotic physics of the dynamic flow. In addition, the low-dimensional representations are applied to the adversarial network for model training and parameter optimization, which enables fast computation. The capability of the hybrid VAE-GAN is illustrated by varying inputs on a flow past a cylinder test case as well as on a second case of water column collapse. Numerical results show that the hybrid VAE-GAN successfully captures the spatio-temporal flow features with a CPU speed-up of three orders of magnitude. These promising results suggest that the hybrid VAE-GAN can play a critical role in efficiently and accurately predicting complex flows in future research efforts. (c) 2020 Elsevier B.V. All rights reserved.
Classification is among the core tasks in machine learning. Existing classification algorithms typically assume that the data classes are at least roughly balanced. When applied to imbalanced data, such classifiers ignore the minority data in favor of overall accuracy. The performance of traditional classification algorithms that assume a balanced data distribution is therefore insufficient, because minority-class samples, such as positive samples in disease diagnosis, are often more important than the others. In this study, we propose a cost-sensitive variational autoencoding classifier that combines data-level and algorithm-level methods to solve the problem of imbalanced data classification. Cost-sensitive factors are introduced to assign a high cost to the misclassification of minority data, which biases the classifier toward minority data. We also design task-specific misclassification costs by embedding domain knowledge. Experimental results show that the proposed method performs well on the classification of bulk amorphous materials.
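The idea of assigning a higher cost to minority-class errors can be sketched as a class-weighted cross-entropy. This is only one common way to realize cost sensitivity and is not taken from the paper: the function name, the weighting scheme, and the example probabilities and costs are assumptions.

```python
import numpy as np


def cost_sensitive_cross_entropy(probs, labels, class_costs):
    """Cross-entropy in which each sample is weighted by the cost of its true class.

    probs:       (n_samples, n_classes) predicted class probabilities
    labels:      (n_samples,) integer class labels
    class_costs: (n_classes,) misclassification cost per true class
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    class_costs = np.asarray(class_costs, dtype=float)

    # Probability the model assigns to the correct class of each sample.
    true_class_probs = probs[np.arange(len(labels)), labels]
    per_sample_loss = -np.log(np.clip(true_class_probs, 1e-12, 1.0))

    # Samples from costly (e.g. minority) classes contribute more to the loss.
    weights = class_costs[labels]
    return float(np.sum(weights * per_sample_loss) / np.sum(weights))


# Toy example: class 1 is the minority class and is poorly predicted.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]]
labels = [0, 0, 1]

uniform_loss = cost_sensitive_cross_entropy(probs, labels, class_costs=[1.0, 1.0])
weighted_loss = cost_sensitive_cross_entropy(probs, labels, class_costs=[1.0, 5.0])
```

With uniform costs the function reduces to the ordinary mean cross-entropy. Raising the cost of class 1 makes the poorly predicted minority sample dominate the loss, which is the bias toward minority data described above.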
Network embedding aims to learn low-dimensional representations for nodes in social networks, which can serve many applications, such as node classification, link prediction, and visualization. Most network embedding methods focus on learning the representations solely from the topological structure. Recently, attributed network embedding, which uses both the topological structure and node content to jointly learn latent representations, has become a hot topic. However, previous studies obtain the joint representations by directly concatenating the representations from each aspect, which may lose the correlations between the topological structure and node content. In this paper, we propose a new attributed network embedding method, TLVANE, which addresses this drawback by exploiting deep variational autoencoders (VAEs). In particular, a two-level VAE model is built, where the first level accounts for the joint representations and the second for the embeddings of each aspect. Extensive experiments on three real-world datasets have been conducted, and the results demonstrate the superiority of the proposed method over state-of-the-art competitors.
As speech-based user interfaces integrated into devices such as AI speakers become ubiquitous, a large amount of user voice data is being collected to enhance the accuracy of speech recognition systems. Since such voice data contain personal information that can endanger the privacy of users, privacy protection for speech data has garnered increasing attention since the introduction of the General Data Protection Regulation in the EU, which makes restrictions and safety measures for the use of speech data essential. This study aims to filter out the speaker-related voice biometrics present in speech data, such as the voice fingerprint, without altering the linguistic content, thereby preserving the usefulness of the data while protecting the privacy of users. To achieve this, we propose an algorithm that produces anonymized speech by adopting many-to-many voice conversion techniques based on variational autoencoders (VAEs) and modifying the speaker identity vectors of the VAE input. We validated the effectiveness of the proposed method by measuring the speaker-related information and the original linguistic information retained in the resultant speech, using an open-source speaker recognizer and a deep neural network-based automatic speech recognizer, respectively. Using the proposed method, the speaker identification accuracy of the speech data was reduced to 0.1-9.2%, indicating successful anonymization, while the speech recognition accuracy was maintained at 78.2-81.3%.
Influence maximization, which aims to select a small set of seed users in a social network to maximize the spread of influence, has attracted considerable attention recently. Most existing influence maximization algorithms focus on the diffusion model of a single entity, which assumes that only one entity is propagated by users in a social network. However, diffusion in real-world social networks often involves multiple entities, competitive or complementary, spreading through the whole network, and is more complex than the diffusion of a single independent entity. In this paper, we propose a novel optimization problem, namely follower-based influence maximization, which aims to promote a new product into the market by maximizing its influence in a social network where other competitive and complementary products are already propagating. We tackle this problem by proposing a recurrent neural variational model (RNV) and a follower-based greedy algorithm (RNVGA). The RNV model dynamically tracks entity correlations and cascade correlations through a deep generative model and recurrent neural variational inference, while the RNVGA algorithm applies the greedy approach for submodular maximization and efficiently computes the seed node set for the target product. Extensive experiments have been conducted to evaluate the effectiveness and efficiency of our method, and the results show its superiority over state-of-the-art methods. (C) 2020 Elsevier Inc. All rights reserved.
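The greedy approach for submodular maximization that this abstract refers to can be sketched generically as follows. This is not the RNVGA algorithm itself: the `spread` function here is a toy coverage count standing in for an influence estimate (which the paper obtains from its RNV model), and all names and data are hypothetical.

```python
def greedy_seed_selection(candidates, spread, budget):
    """Pick up to `budget` seeds, each time adding the largest marginal gain.

    candidates: iterable of node identifiers
    spread:     function mapping a list of seeds to an influence score
    budget:     maximum number of seeds to select
    """
    selected = []
    remaining = set(candidates)
    for _ in range(budget):
        if not remaining:
            break
        current = spread(selected)
        # Sort for deterministic tie-breaking, then take the best marginal gain.
        best = max(
            sorted(remaining),
            key=lambda node: spread(selected + [node]) - current,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy stand-in for influence spread: number of distinct users reached.
reach = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}


def coverage(seeds):
    """Count the distinct users reached by a set of seeds."""
    covered = set()
    for seed in seeds:
        covered |= reach[seed]
    return len(covered)


seeds = greedy_seed_selection(reach.keys(), coverage, budget=2)
```

For a monotone submodular spread function such as this coverage count, the greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimal spread, which is why it is the standard choice for seed selection.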
In semiconductor manufacturing, measurement data called wafer maps are obtained in the metrology steps, and variations in the process are detected by analyzing the wafer map data. Hidden processes or equipment affecting the process quality variations can be found by comparing the process tracking history with clustered groups of similar wafer maps; thus, clustering analysis is very important for reducing process quality variations. Clustering wafer maps is becoming more difficult as the maps form more complex patterns and the data become high-dimensional. For more effective clustering of complex and high-dimensional wafer maps, we incorporate a Gaussian mixture model into a variational autoencoder framework to extract features that are more suitable for clustering, and we further apply a Dirichlet process in the variational autoencoder mixture framework for automated one-step clustering. The proposed method is validated using a real dataset from a global semiconductor manufacturing company, and we demonstrate that it is more effective than other competitive methods in determining the number of clusters and clustering wafer map patterns.
In this work, we propose an approach to generate whole-slide image (WSI) tiles by using deep generative models infused with matched gene expression profiles. First, we train a variational autoencoder (VAE) that learns a latent, lower-dimensional representation of multi-tissue gene expression profiles. Then, we use this representation to infuse generative adversarial networks (GANs) that generate lung and brain cortex tissue tiles, resulting in a new model that we call RNA-GAN. Tiles generated by RNA-GAN were preferred by expert pathologists compared with tiles generated using traditional GANs, and in addition, RNA-GAN needs fewer training epochs to generate high-quality tiles. Finally, RNA-GAN was able to generalize to gene expression profiles outside of the training set, showing imputation capabilities. A web-based quiz is available for users to play a game distinguishing real and synthetic tiles: https://***/, and the code for RNA-GAN is available here: https://***/gevaertlab/RNA-GAN.
The advent of single-cell multi-omics sequencing technology makes it possible for researchers to leverage multiple modalities for individual cells and explore cell heterogeneity. However, the high-dimensional, discrete, and sparse nature of the data makes downstream analysis particularly challenging. Here, we propose an interpretable deep learning method called moETM to perform integrative analysis of high-dimensional single-cell multimodal data. moETM integrates multiple omics data via a product-of-experts in the encoder and employs multiple linear decoders to learn the multi-omics signatures. moETM demonstrates superior performance compared with six state-of-the-art methods on seven publicly available datasets. By applying moETM to scRNA + scATAC data, we identified sequence motifs corresponding to the transcription factors regulating immune gene signatures. Applying moETM to CITE-seq data from COVID-19 patients revealed not only known immune cell-type-specific signatures but also composite multi-omics biomarkers of critical conditions due to COVID-19, thus providing insights from both biological and clinical perspectives.
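A product-of-experts over per-modality encoders can be illustrated for the common case of diagonal Gaussian experts, where the combined distribution is again Gaussian with a precision-weighted mean. This sketch assumes Gaussian experts and omits any prior expert; it is not the moETM code, and the function name and example values are hypothetical.

```python
import numpy as np


def product_of_gaussian_experts(means, variances):
    """Combine diagonal Gaussian experts into a single Gaussian.

    means, variances: arrays of shape (n_experts, latent_dim), one row per
    modality-specific encoder output.
    Returns the mean and variance of the (renormalized) product distribution.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    precisions = 1.0 / variances
    # Precisions add; the joint mean is the precision-weighted average.
    joint_variance = 1.0 / precisions.sum(axis=0)
    joint_mean = joint_variance * (precisions * means).sum(axis=0)
    return joint_mean, joint_variance


# Two hypothetical modality encoders (e.g. one per omics type), latent_dim = 2.
expert_means = [[0.0, 0.0], [2.0, 4.0]]
expert_variances = [[1.0, 1.0], [1.0, 3.0]]

joint_mean, joint_variance = product_of_gaussian_experts(expert_means, expert_variances)
```

Because precisions add, an expert that is more confident in a given latent dimension (smaller variance) pulls the joint mean toward its own estimate, and the joint variance is never larger than that of any single expert.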