ISBN: 9798350324471 (print)
Training a deep neural network relies on a large amount of annotated data. In specialized scenarios such as industrial defect detection and medical imaging, it is hard to collect sufficient labeled data all at once, and newly annotated data may arrive incrementally. In practice, we also prefer the target model to improve gradually through quick retraining as new data comes in. This work tackles the problem from a data selection perspective by constraining ourselves to always retrain the target model with a fixed amount of data after new data arrives. A variational autoencoder (VAE) and an adversarial network are combined for data selection, achieving fast model retraining. This enables the target model to continually learn from a small training set without losing the information learned in previous iterations, thus incrementally adapting to newly arriving data. We validate our framework on the LGG Segmentation dataset for the semantic segmentation task.
Diabetes encompasses a complex landscape of glycemic control that varies widely among individuals. However, current methods do not faithfully capture this variability at the meal level. On the one hand, expert-crafted features lack the flexibility of data-driven methods; on the other hand, learned representations tend to be uninterpretable, which hampers clinical adoption. In this paper, we propose a hybrid variational autoencoder to learn interpretable representations of continuous glucose monitoring (CGM) and meal data. Our method grounds the latent space in the inputs of a mechanistic differential equation, producing embeddings that reflect physiological quantities such as insulin sensitivity, glucose effectiveness, and basal glucose levels. Moreover, we introduce a novel method to infer the glucose appearance rate, making the mechanistic model robust to unreliable meal logs. On a dataset of CGM and self-reported meals from individuals with type-2 diabetes and pre-diabetes, our unsupervised representation discovers a separation between individuals proportional to their disease severity. Our embeddings produce clusters that are up to 4x better than those from naive, expert, black-box, and pure mechanistic features. Our method provides a nuanced yet interpretable embedding space for comparing glycemic control within and across individuals, directly learnable from in-the-wild data.
ISBN: 9798400700736 (print)
Variational autoencoders (VAEs) are state-of-the-art models for recommendation with implicit feedback signals. Unfortunately, implicit feedback suffers from selection bias (e.g., popularity bias and position bias), and as a result, training on such signals produces biased recommendation models. Existing methods for debiasing the learning process have not been applied in a generative setting. We address this gap by introducing an inverse propensity scoring (IPS) based method for training VAEs from implicit feedback data in an unbiased way. Our IPS-based estimator for the VAE training objective, VAE-IPS, is provably unbiased w.r.t. selection bias. Our experimental results show that the proposed VAE-IPS model reaches significantly higher performance than existing baselines. Our contributions enable practitioners to combine state-of-the-art VAE recommendation techniques with the advantages of bias mitigation for implicit feedback.
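The abstract does not spell out the VAE-IPS estimator, so the following is only a minimal sketch of the general IPS idea it names: the log-likelihood of each observed click is divided by the estimated probability that the item was exposed to the user. All function names, the `beta` weight, and the Gaussian KL term are illustrative assumptions, not the authors' formulation.

```python
import math


def ips_weighted_log_likelihood(clicks, click_probs, propensities, eps=1e-12):
    """Sum of log-probabilities of clicked items, each divided by its propensity.

    clicks:       0/1 indicators of observed feedback (hypothetical input)
    click_probs:  model-predicted click probabilities, one per item
    propensities: estimated exposure probabilities in (0, 1], one per item
    """
    total = 0.0
    for clicked, prob, propensity in zip(clicks, click_probs, propensities):
        if clicked:
            # Rarely exposed items (small propensity) receive a larger weight.
            total += math.log(prob + eps) / propensity
    return total


def kl_to_standard_normal(mu, sigma):
    """KL divergence from N(mu, diag(sigma^2)) to N(0, I)."""
    return 0.5 * sum(m * m + s * s - 1.0 - math.log(s * s)
                     for m, s in zip(mu, sigma))


def ips_negative_elbo(clicks, click_probs, propensities, mu, sigma, beta=1.0):
    """Assumed loss: negative IPS-weighted likelihood plus a beta-weighted KL term."""
    reconstruction = ips_weighted_log_likelihood(clicks, click_probs, propensities)
    return -reconstruction + beta * kl_to_standard_normal(mu, sigma)
```

In this sketch, setting every propensity to 1.0 recovers the ordinary, unweighted click log-likelihood, which makes the effect of the weighting easy to inspect.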
ISBN: 9783031333736; 9783031333743 (print)
Much research has been devoted to the problem of learning fair representations; however, existing methods do not explicitly state the relationships between latent representations. In many real-world applications, there may be causal relationships between latent representations. Furthermore, most fair representation learning methods focus on group-level fairness and are based on correlation, ignoring the causal relationships underlying the data. In this work, we theoretically demonstrate that using structured representations enables downstream predictive models to achieve counterfactual fairness, and we then propose the Counterfactual Fairness Variational Autoencoder (CF-VAE) to obtain structured representations with respect to domain knowledge. The experimental results show that the proposed method achieves better fairness and accuracy than the benchmark fairness methods.
ISBN: 9781665468916 (print)
Personalized federated learning (PFL) jointly trains a variety of local models by balancing knowledge sharing across clients against model personalization per client. This paper addresses PFL by explicitly disentangling latent representations into two parts that capture the shared knowledge and the client-specific personalization, which leads to more reliable and effective PFL. The disentanglement is achieved by a novel Federated Dual Variational Autoencoder (FedDVA), which employs two encoders to infer the two types of representations. FedDVA can produce a better understanding of the trade-off between global knowledge sharing and local personalization in PFL. Moreover, it can be integrated with existing FL methods and turn them into personalized models for heterogeneous downstream tasks. Extensive experiments validate the advantages brought by disentanglement and show that models trained with disentangled representations substantially outperform vanilla methods.
With the demand for autonomous control and personalized speech generation, style control and transfer in Text-to-Speech (TTS) are becoming increasingly important. In this paper, we propose a new TTS system that can perform style transfer with interpretability and high fidelity. First, we design a TTS system that combines a variational autoencoder (VAE) and a diffusion refiner to obtain refined mel-spectrograms. Specifically, both a two-stage and a one-stage system are designed to improve the audio quality and the performance of style transfer. Second, a diffusion bridge of quantized VAE is designed to efficiently learn complex discrete style representations and improve the performance of style transfer. To further strengthen style transfer, we introduce ControlVAE, which improves the reconstruction quality while retaining good interpretability. Experiments on the LibriTTS dataset demonstrate that our method is more effective than baseline models.
ISBN: 9783031282386 (digital)
ISBN: 9783031282379; 9783031282386 (print)
Topic modeling is a dominant method for exploring document collections on the web and in digital libraries. Recent approaches to topic modeling use pretrained contextualized language models and variational autoencoders. However, large neural topic models have a considerable memory footprint. In this paper, we propose a knowledge distillation framework to compress a contextualized topic model without loss in topic quality. In particular, the proposed distillation objective minimizes the cross-entropy of the soft labels produced by the teacher and the student models, as well as the squared 2-Wasserstein distance between the latent distributions learned by the two models. Experiments on two publicly available datasets show that the student trained with knowledge distillation achieves much higher topic coherence than the original student model, and even surpasses the teacher while containing far fewer parameters. The distilled model also outperforms several other competitive topic models on topic coherence.
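The two terms of the distillation objective described above can be sketched as follows. This is a minimal illustration, assuming that both latent distributions are Gaussians with diagonal covariance, in which case the squared 2-Wasserstein distance has the closed form ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2. The function names and the `weight` parameter are hypothetical and do not come from the paper.

```python
import math


def soft_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy -sum_i t_i * log(s_i) between teacher and student soft labels."""
    return -sum(t * math.log(s + eps)
                for t, s in zip(teacher_probs, student_probs))


def squared_w2_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between two diagonal-covariance Gaussians.

    sigma_a and sigma_b hold per-dimension standard deviations, not variances.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return mean_term + std_term


def distillation_loss(teacher_probs, student_probs,
                      teacher_latent, student_latent, weight=1.0):
    """Assumed combination: soft-label cross-entropy plus a weighted W2^2 term."""
    mu_t, sigma_t = teacher_latent
    mu_s, sigma_s = student_latent
    return (soft_cross_entropy(teacher_probs, student_probs)
            + weight * squared_w2_diag_gaussians(mu_t, sigma_t, mu_s, sigma_s))
```

How the two terms are weighted against each other and against the student's own training loss is not stated in the abstract, so `weight` is left as a free parameter here.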
ISBN: 9798400701085 (print)
Some recent methods address few-shot classification by integrating visual and semantic prototypes. However, they usually ignore the difference in feature structure between the visual and semantic modalities, which leads to limited performance improvements. In this paper, we propose a novel method, called bimodal integrator (BMI), to better integrate visual and semantic prototypes. In BMI, we first construct a latent space for each modality via a variational autoencoder, and then align the semantic latent space to the visual latent space. Through this semantics-to-vision alignment, the semantic modality is mapped to the visual latent space and acquires the same feature structure as the visual modality. As a result, the visual and semantic prototypes can be better integrated. In addition, a data augmentation scheme based on a multivariate Gaussian distribution and prompt engineering is designed to ensure the accuracy of modality alignment during training. Experimental results demonstrate that BMI significantly improves few-shot classification, making simple baselines outperform the most advanced methods on the miniImageNet and tieredImageNet datasets.
Recent developments in single-cell technologies have led to a surge in research. In particular, the single-cell Assay for Transposase-Accessible Chromatin with high-throughput sequencing (scATAC-seq) is a popular approach for analyzing differences in chromatin accessibility at the single-cell level, either within or between groups. As a result, it is critical to examine cell heterogeneity at a previously unseen level and to identify both known and unknown cell types. However, with the ever-increasing number of cells produced by technological development and the characteristics of the data, such as high noise, sparsity, and dimensionality, distinguishing cell types has become challenging. We propose scVAEBGM, which integrates a variational autoencoder (VAE) with a Bayesian Gaussian mixture model (BGM) to process and analyze scATAC-seq data. The method takes advantage of the Bayesian Gaussian mixture model to estimate the number of cell types without fixing the number of clusters beforehand. In other words, the number of clusters is inferred from the data, which avoids the bias introduced by subjective judgment when the number is set manually. Additionally, the method is more robust to noise and can better represent single-cell data in lower dimensions. We also devise a further clustering strategy: experiments indicate that clustering again on top of the completed clustering can further improve clustering accuracy. We test on six public datasets, and scVAEBGM outperforms various dimension reduction baselines. In downstream applications, scVAEBGM can reveal biological cell types.
This paper presents a new sequential learning method based on a planning strategy in which future samples are predicted by reflecting on past experiences. Such a strategy is appealing for implementing an intelligent machine that foresees multiple time steps ahead instead of predicting step by step. In particular, a flexible sequential learning scheme is developed to directly predict future states without visiting all intermediate states. A Bayesian approach to a multi-temporal-difference neural network is accordingly proposed to calculate the stochastic belief state of an abstract state machine, so as to capture large-span context and make high-level predictions. Importantly, the sequence data are represented by multiple jumpy states with varying temporal differences. A Bayesian state machine is trained by maximizing the variational lower bound of the log likelihood of the sequence data. A generalized sequence model with a varying number of Markov states is derived, with the previous temporal-difference variational autoencoder as a simplified realization. The predictive states are learned to roll forward with jumps. Experiments show that this approach can be trained effectively to predict jumpy states in various types of sequence data.