Deep latent variable generative models based on the variational autoencoder (VAE) have shown promising performance for audio-visual speech enhancement (AVSE). The underlying idea is to learn a VAE-based audio-visual prior...
dB is a web-based interface that serves as a "drummer bot" for exploring interactive groove-making experiences with an AI percussion system. This system, leveraging variational autoencoders (VAEs), transform...
In this work, we present DiffVoice, a novel text-to-speech model based on latent diffusion. We propose to first encode speech signals into a phoneme-rate latent representation with a variational autoencoder enhanced b...
Recently, the real-time audio variational autoencoder (RAVE) method was developed for high-quality audio waveform synthesis. The RAVE method is based on a variational autoencoder and employs a two-stage training strat...
Ensuring the safety and reliability of complex industrial processes is very important. Therefore, effectively extracting multiple features of the data is of great significance for improving the accuracy of modeling for faul...
Bathrooms can be slippery, increasing the risk of falling. In addition, because people enter the bathroom alone, it is difficult to detect accidents immediately when they occur. Therefore, a system is required to quic...
Gene-expression profiling enables researchers to quantify transcription levels in cells, thus providing insight into functional mechanisms of diseases and other biological processes. However, because of the high dimensionality of these data and the sensitivity of measuring equipment, expression data often contain unwanted confounding effects that can skew analysis. For example, collecting data in multiple runs causes nontrivial differences in the data (known as batch effects), known covariates that are not of interest to the study may have strong effects, and there may be large systemic effects when integrating multiple expression datasets. Additionally, many of these confounding effects represent higher-order interactions that may not be removable using existing techniques that identify linear patterns. We created Confounded to remove these effects from expression data. Confounded is an adversarial variational autoencoder that removes confounding effects while minimizing the amount of change to the input data. We tested the model on artificially constructed data and commonly used gene expression datasets and compared it against other common batch adjustment algorithms. We also applied the model to remove cancer-type-specific signal from a pan-cancer expression dataset. Our software is publicly available at https://***/jdayton3/Confounded.
Discriminatively trained neural classifiers can be trusted only when the input data comes from the training distribution (in-distribution). Therefore, detecting out-of-distribution (OOD) samples is very important to avoid classification errors. In the context of OOD detection for image classification, one recent approach proposes training a classifier called a "confident-classifier" by minimizing the standard cross-entropy loss on in-distribution samples and minimizing the KL divergence between the predictive distribution of OOD samples in the low-density "boundary" of in-distribution and the uniform distribution (maximizing the entropy of the outputs). Thus, samples can be detected as OOD if they have low confidence or high entropy. In this work, we analyze this setting both theoretically and experimentally. We also propose a novel algorithm to generate the "boundary" OOD samples to train a classifier with an explicit "reject" class for OOD samples. We show that this approach is effective in reducing high-confidence mispredictions on OOD samples while maintaining the test error and high confidence on the in-distribution samples compared to standard training. We compare our approach against several recent classifier-based OOD detectors, including confident-classifiers, on the MNIST and FashionMNIST datasets. Overall, the proposed approach consistently performs better than the others across most of the experiments.
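The confident-classifier objective described in the abstract above can be sketched in a few lines. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the weighting factor `beta`, and the direction of the KL term (here KL(p || U) = log K - H(p), so that minimizing it maximizes the output entropy) are all choices made for this example.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax along the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true class (in-distribution term).
    logp = log_softmax(logits)
    return -logp[np.arange(len(labels)), labels].mean()

def kl_pred_to_uniform(logits):
    # Mean KL(p || U) over the batch, with U uniform over K classes.
    # Equals log K minus the entropy of p, so it is zero for uniform outputs.
    k = logits.shape[1]
    logp = log_softmax(logits)
    p = np.exp(logp)
    return (p * (logp + np.log(k))).sum(axis=1).mean()

def confident_loss(in_logits, in_labels, ood_logits, beta=1.0):
    # Hypothetical combined objective: cross-entropy on in-distribution
    # samples plus a beta-weighted KL penalty on "boundary" OOD samples.
    return cross_entropy(in_logits, in_labels) + beta * kl_pred_to_uniform(ood_logits)
```

At test time, a sample would then be flagged as OOD when its maximum softmax probability is low or its predictive entropy is high; the threshold is left open here because the abstract does not specify one.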
The development of data-driven approaches, such as deep learning, has led to the emergence of systems that have achieved human-like performance in a wide variety of tasks. For robotic tasks, deep data-driven models are introduced to create adaptive systems without the need to program them explicitly. These adaptive systems are needed in situations where changes in the task and environment cannot be foreseen.
Convolutional neural networks (CNNs) have become the standard way to process visual data in robotics. End-to-end neural network models that operate the entire control task can perform various complex tasks with little feature engineering. However, the adaptivity of these systems goes hand in hand with the level of variation in the training data. Training end-to-end deep robotic systems requires a lot of domain-, task-, and hardware-specific data, which is often costly to provide.
In this work, we propose to tackle this issue by employing a deep neural network with a modular architecture, consisting of separate perception, policy, and trajectory parts. Each part of the system is trained fully on synthetic data or in simulation. The data is exchanged between parts of the system as low-dimensional representations of affordances and trajectories. The performance is then evaluated in a zero-shot transfer scenario using the Franka Panda robotic arm. Results demonstrate that a low-dimensional representation of scene affordances extracted from an RGB image is sufficient to successfully train manipulator policies.
Flow cytometry has been used for several decades to quantitatively analyse single cells in a high-throughput manner. This has resulted in a wide range of medical and biological applications. For example, in immunology, flow cytometry data analysis identifies populations of immune cells based on cellular marker expression. As a consequence, flow cytometry has established itself as one of the main instruments in the diagnosis, monitoring and classification of leukemias, human immunodeficiency virus (HIV), and other diseases. State-of-the-art flow cytometers allow the detection of more than 20 cellular parameters, but the instruments used in clinical practice usually have much more limited capabilities. Thus, flow cytometry samples are often split into separate tubes with varying marker combinations to increase the number of measurable markers. However, this poses challenges to the analysis of flow cytometry data because the data from multiple tubes must be integrated while preserving the original biological information. Currently, most computational analysis techniques are not able to handle this kind of multi-tube flow cytometry data. In this work, we develop a deep generative modelling framework to enable simultaneous integration, clustering, and visualization of such data. We show that the model, named fcmVI, successfully discovers a latent representation of the cell types from flow cytometry data. Furthermore, we show that the fcmVI model can be used to align multiple tubes originating from the same sample in the latent space. The model is applied to two different data sets, from mouse immune cells and human acute myeloid leukemia (AML) samples. In addition, the model enables the imputation of missing marker values for each tube, which is demonstrated on both data sets, and the results are compared to nearest neighbor imputation.
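The nearest neighbor imputation baseline mentioned in the abstract above can be illustrated roughly as follows. This is only a sketch of the general idea, not the procedure used in the paper: it assumes that two tubes share a set of markers measured in both, and the function name `nn_impute` and its arguments are hypothetical.

```python
import numpy as np

def nn_impute(target_shared, donor_shared, donor_missing):
    """Impute markers absent from the target tube using a donor tube.

    target_shared: (n_target, n_shared) shared-marker values of target cells
    donor_shared:  (n_donor, n_shared) shared-marker values of donor cells
    donor_missing: (n_donor, n_missing) donor-only marker values
    """
    # Squared Euclidean distance between every target cell and every donor cell.
    diff = target_shared[:, None, :] - donor_shared[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Index of the closest donor cell for each target cell.
    nearest = dist2.argmin(axis=1)
    # Copy the donor-only marker values from that closest cell.
    return donor_missing[nearest]
```

The dense distance matrix keeps the example short but grows with the product of the two cell counts, so a tree-based or approximate neighbor search would be the more practical choice for real cytometry samples.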