ISBN (electronic): 9789819754984
ISBN (print): 9789819754977; 9789819754984
Vertical Federated Learning (VFL) is a prevalent paradigm designed to facilitate collaboration between multiple entities possessing distinct feature sets yet sharing a common user base for model training. However, challenges emerge when generating predictions for users with partial data, such as the scenario where a new user registers with only one participating company and submits a limited number of features. The typical method of addressing missing data involves padding the absent values with zeros or mean figures, which can cause an out-of-distribution issue, thereby precipitating a notable deterioration in model accuracy. To address this issue, we introduce DiVerFed, a distribution-aware VFL framework for missing information. DiVerFed's primary objective is to maintain robust model performance even when confronted with incomplete user information. Within this framework, we treat the VFL's top model as an encoder responsible for capturing the underlying data distribution. To model this distribution effectively, we incorporate a reconstruction component. Furthermore, we simulate the scenario of missing information by integrating a Feature-wise Dropout module during the training phase. To enhance the framework's efficacy in classification tasks, we also incorporate label information within the loss function, leveraging a class-aware distribution to bolster the model's accuracy. Our experimental analyses confirm that DiVerFed significantly outperforms conventional approaches in classification tasks when information from only one party is accessible.
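The abstract does not say how the Feature-wise Dropout module is implemented. As a rough illustration of the general idea only, the sketch below drops whole per-party feature blocks during training so that a model sees inputs resembling a user known to just one party. The function name `feature_wise_dropout`, the party names, the zero placeholder, and the returned presence mask are all assumptions made for this example, not details from the paper.

```python
import random


def feature_wise_dropout(blocks, keep_party, drop_prob, rng):
    """Simulate missing parties by blanking whole feature blocks.

    blocks:     dict mapping a party name to that party's feature list
    keep_party: party whose features are never dropped (hypothetical choice)
    drop_prob:  probability of dropping each other party's block
    rng:        random.Random instance, passed in for reproducibility

    Returns (masked_blocks, present), where `present` records which parties
    still contribute real values. The 0.0 placeholder is an assumption; the
    mask lets a downstream model tell "missing" apart from a true zero.
    """
    masked = {}
    present = {}
    for party, values in blocks.items():
        drop = party != keep_party and rng.random() < drop_prob
        present[party] = not drop
        masked[party] = [0.0] * len(values) if drop else list(values)
    return masked, present


# Hypothetical two-party sample; drop_prob=1.0 forces the single-party case.
sample = {"bank": [0.3, 1.2], "retailer": [5.0, 0.1, 2.4]}
masked, present = feature_wise_dropout(
    sample, keep_party="bank", drop_prob=1.0, rng=random.Random(0)
)
```

With `drop_prob=1.0` every party other than `keep_party` is blanked, which mirrors the single-party inference setting the abstract evaluates; smaller values would mix complete and incomplete samples during training.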
ISBN (print): 9781450383509
We consider multi-solution optimization and generative models for the generation of diverse artifacts and the discovery of novel solutions. In cases where the domain's factors of variation are unknown or too complex to encode manually, generative models can provide a learned latent space to approximate these factors. When used as a search space, however, the range and diversity of possible outputs are limited by the expressivity and generative capabilities of the learned model. We compare the output diversity of a quality diversity evolutionary search performed in two different search spaces: 1) a predefined parameterized space and 2) the latent space of a variational autoencoder model. We find that the search on an explicit parametric encoding creates more diverse artifact sets than searching the latent space. A learned model is better at interpolating between known data points than at extrapolating or expanding towards unseen examples. We recommend using a generative model's latent space primarily to measure similarity between artifacts rather than for search and generation. Whenever a parametric encoding is obtainable, it should be preferred over a learned representation as it produces a higher diversity of solutions.
ISBN (print): 9781728198354
Variational autoencoders have gained considerable attention due to their capacity to encode high-dimensional data into a lower-dimensional latent space. In this context, several methods have been proposed with the objective of producing disentangled representations. In this work, we propose a weakly supervised model that explicitly disentangles the factors of variation of a dataset in separate subspaces using a pairwise architecture. We also create a framework that encourages conditional image generation according to the desired factor of variation, by controlling these subspaces. This is achieved by introducing an additional network trained with a triplet loss. Its output pulls together the representations of images generated from the same factor and pushes apart those of images generated from different factors. Experiments are carried out on widely used datasets, and show that our model is able to disentangle specified factors of variation, and to generate new data while constraining desired properties, even when these factors have a small influence on the reconstruction loss.
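For readers unfamiliar with the triplet loss mentioned above, the following is a minimal sketch of its common margin-based form. The abstract does not state which distance or margin the authors use, so the Euclidean distance, the `margin` default, and the example vectors are assumptions for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss on one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it grows linearly with the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D representations.
anchor = [0.0, 0.0]
same_factor = [0.0, 1.0]        # distance 1.0 from the anchor
far_negative = [3.0, 4.0]       # distance 5.0: margin satisfied, loss 0.0
close_negative = [0.0, 1.5]     # distance 1.5: margin violated, loss 0.5

loss_far = triplet_loss(anchor, same_factor, far_negative)
loss_close = triplet_loss(anchor, same_factor, close_negative)
```

Minimising this quantity over many triples is what moves same-factor representations together and different-factor representations apart.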
ISBN (print): 9788993215182
This paper presents a path tracking algorithm for autonomous driving that learns an action command from a high-dimensional input state vector, e.g., grid-maps. The learning framework is built upon a variational auto-encoder (VAE) and takes advantage of efficient path tracking results that already exist or can be obtained from a human expert. The VAE is known to give smooth latent representations of the input data, and we make one of the latent attributes follow the expert's command. We implement an autonomous driving system on the open source robot simulator (Webots) and collect the demonstration data for training the VAE. Numerical results show that the proposed tracking method can drive a vehicle robustly without explicitly detecting the road features.
ISBN (print): 9781509066315
Text-to-speech synthesis (TTS) has been used as a data augmentation approach for automatic speech recognition (ASR), leveraging additional texts for ASR training. However, in low resource tasks, usually only a limited number of speakers are available, leading to a lack of speaker variation in synthetic speech. In this paper, we propose a novel speaker augmentation approach which can synthesize data with sufficient speaker and text diversity. Here, an end-to-end TTS system is trained with speaker representations from a variational auto-encoder (VAE), which enables TTS to synthesize speech from unseen new speakers via sampling from the trained latent distribution. As a new type of data augmentation approach, speaker augmentation can be combined with traditional feature augmentation approaches, such as SpecAugment. Experiments on a Switchboard task show that, given 50 hours of data, the proposed speaker augmentation with SpecAugment significantly reduces word error rate (WER) by 30% relative compared to the system without any data augmentation, and about 18% relative compared to the system with SpecAugment.
ISBN (print): 9789897584916
Network intrusion detection is one of the most important tasks in today's cyber-security defence applications. In the field of unsupervised learning methods, variants of variational autoencoders promise good results. The fact that these methods are computationally very time-consuming is hardly considered in the literature. Therefore, we propose a new two-stage approach combining a fast preprocessing or filtering method with a variational autoencoder using reconstruction probability. We investigate several types of anomaly detection methods, mainly based on autoencoders, to select a pre-filtering method and to evaluate the performance of our concept on two well-established datasets.
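The two-stage idea can be sketched generically: a cheap score discards samples that look clearly normal, and only the remainder is passed to the expensive detector. The abstract names neither the pre-filter nor the thresholds, so everything below (the function `two_stage_detect`, both scoring callables, and the threshold values) is a hypothetical stand-in; in particular, `expensive_score` merely represents a VAE-based anomaly score where higher means more anomalous (for example, a negative log reconstruction probability).

```python
def two_stage_detect(samples, cheap_score, expensive_score,
                     filter_threshold, alarm_threshold):
    """Run a fast pre-filter before a costly anomaly detector.

    Returns (alarms, expensive_calls): the indices of samples flagged as
    anomalous, and how many samples reached the expensive second stage.
    """
    alarms = []
    expensive_calls = 0
    for index, sample in enumerate(samples):
        # Stage 1: skip samples the cheap score considers clearly normal.
        if cheap_score(sample) < filter_threshold:
            continue
        # Stage 2: only suspicious samples pay for the expensive score.
        expensive_calls += 1
        if expensive_score(sample) >= alarm_threshold:
            alarms.append(index)
    return alarms, expensive_calls


# Toy one-dimensional data with placeholder scoring functions.
samples = [0.1, 0.5, 2.0, 5.0]
alarms, expensive_calls = two_stage_detect(
    samples,
    cheap_score=abs,                    # stand-in for a fast filter
    expensive_score=lambda x: x * x,    # stand-in for the VAE-based score
    filter_threshold=1.0,
    alarm_threshold=10.0,
)
```

In this toy run only two of the four samples reach the second stage, which is where the time saving would come from; the trade-off is that anything the filter wrongly discards can never be flagged.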
ISBN (print): 9783031120534; 9783031120527
Late-stage identification of patients at risk of myocardial infarction (MI) inhibits delivery of effective preventive care, increasing the burden on healthcare services and affecting patients' quality of life. Hence, standardised, non-invasive, accessible, and low-cost methods for early identification of patients at risk of future MI events are desirable. In this study, we demonstrate for the first time that retinal optical coherence tomography (OCT) imaging can be used to identify future adverse cardiac events such as MI. We propose a binary classification network based on a task-aware variational autoencoder (VAE), which learns a latent embedding of patients' OCT images and uses the former to classify the latter into one of two groups, i.e. whether they are likely to have a heart attack (MI) in the future or not. Results obtained for experiments conducted in this study (AUROC 0.74 ± 0.01, accuracy 0.674 ± 0.007, precision 0.657 ± 0.012, recall 0.678 ± 0.017 and F1-score 0.653 ± 0.013) demonstrate that our task-aware VAE-based classifier is superior to standard convolutional neural network classifiers at identifying patients at risk of future MI events based on their retinal OCT images. This proof-of-concept study indicates that retinal OCT imaging could be used as a low-cost alternative to cardiac magnetic resonance imaging for identifying patients at risk of MI early.
ISBN (print): 9783030212483; 9783030212476
It is necessary to maintain factory production equipment so that it remains in a safe and stable state. Because post-maintenance involves unplanned device shutdowns and greatly affects a wide range of production areas, both behind and ahead of the device, it is often better to prevent such breakdowns through preventive maintenance, in which parts are replaced based on data such as the time the device has been in service or the number of times it has been used. That said, recent advances in IoT-related technology, sensors, and data-acquisition computers, together with the low cost of cloud databases and simpler technology, have led to a surge of interest in so-called predictive maintenance, which is based on monitoring the status of the equipment. Due to recent advances in deep learning, it has become possible to accurately estimate machine states using multidimensional features. Here, we evaluate two methods of estimating a machine's state based on acceleration data using deep learning, and compare their accuracy and utility for equipment maintenance.
ISBN (print): 9781728103068
Paraphrase generation is a challenging task that involves expressing the meaning of a sentence using synonyms or different phrases, either to achieve variations or a certain stylistic response. Most previous sequence-to-sequence (Seq2Seq) models focus on either generating variations or preserving the content. We mainly address the issue of preserving the content in a sentence while generating diverse paraphrases. In this paper, we propose a novel approach for paraphrase generation using a variational autoencoder (VAE) and a Pointer Generator Network (PGN). The proposed model uses a copy mechanism to control the content transfer, a VAE to introduce variations, and a training technique to restrict the gradient flow for efficient learning. Our evaluations on the QUORA and MS COCO datasets show that our model outperforms the state-of-the-art approaches and that the generated paraphrases are highly diverse as well as consistent with their original meaning.
Emotional Voice Conversion (EVC) is a task that aims to convert the emotional state of speech from one to another while preserving the linguistic information and identity of the speaker. However, many studies are limited by the requirement for parallel speech data between different emotional patterns, which is not widely available in real-life applications. Furthermore, the annotation of emotional data is highly time-consuming and labor-intensive. To address these problems, in this paper, we propose SGEVC, a novel semi-supervised generative model for emotional voice conversion. This paper demonstrates that using as little as 1% supervised data is sufficient to achieve EVC. Experimental results show that our proposed model achieves state-of-the-art (SOTA) performance and consistently outperforms EVC baseline frameworks.