ISBN (print): 9781538653845
In this paper, we propose the Squeezed Convolutional Variational Autoencoder (SCVAE) for anomaly detection in time series data for Edge Computing in the Industrial Internet of Things (IIoT). The proposed model is applied to labeled time series data from UCI datasets for exact performance evaluation, and to real-world data for indirect model performance comparison. In addition, by comparing the models before and after applying Fire Modules from SqueezeNet, we show that model size and inference time are reduced while a similar level of performance is maintained.
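As a rough illustration of the Fire Modules referenced in the abstract, the sketch below shows a SqueezeNet-style Fire module dropped into a small convolutional VAE encoder in PyTorch. The class names, channel widths, and latent size are illustrative assumptions, not the SCVAE configuration reported in the paper.

import torch
import torch.nn as nn

class Fire(nn.Module):
    """SqueezeNet-style Fire module: a 1x1 'squeeze' convolution followed by
    parallel 1x1 and 3x3 'expand' convolutions whose outputs are concatenated."""
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.act(self.squeeze(x))
        return torch.cat([self.act(self.expand1x1(x)),
                          self.act(self.expand3x3(x))], dim=1)

class SqueezedConvVAEEncoder(nn.Module):
    """Hypothetical encoder: Fire modules replace plain convolutions to shrink
    the parameter count, then project to the VAE mean and log-variance."""
    def __init__(self, in_ch=1, latent_dim=16):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            Fire(32, 8, 16),   # 8 squeeze channels, 16 + 16 expand channels
            Fire(32, 8, 16),
            nn.AdaptiveAvgPool2d(1),
        )
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.mu(h), self.logvar(h)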
ISBN (print): 9781728190778
Grasp planning, and most specifically grasp space exploration, is still an open issue in robotics. This article presents an efficient procedure for exploring the grasp space of a multifingered adaptive gripper to generate reliable grasps given a known object pose. The procedure relies on a limited dataset of manually specified expert grasps and uses a mixed analytic and data-driven approach based on a grasp quality metric and variational autoencoders. The performance of the method is assessed by generating grasps in simulation for three different objects. On this grasp planning task, the method reaches a grasp success rate of 99.91% over 7000 trials.
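The generate-and-score loop described above can be sketched as follows; decoder and grasp_quality are hypothetical stand-ins for the learned VAE decoder and the analytic grasp quality metric, and the sampling budget is an assumed value.

import numpy as np

def propose_grasp(decoder, grasp_quality, object_pose,
                  n_samples=256, latent_dim=8, rng=None):
    """Sample candidate grasps from a VAE trained on expert grasps and keep the
    candidate scoring highest under an analytic grasp quality metric."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((n_samples, latent_dim))       # draws from the VAE prior
    candidates = [decoder(zi, object_pose) for zi in z]    # decoded grasp configurations
    scores = [grasp_quality(g, object_pose) for g in candidates]
    return candidates[int(np.argmax(scores))]              # best-scoring grasp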
ISBN (print): 9798350359329; 9798350359312
Aiming at learning a probabilistic distribution over data, generative models have been actively studied with broad applications. This paper proposes a complex recurrent variational autoencoder (VAE) framework for modeling time series data, particularly speech signals. First, to account for the temporal structure of speech signals, we introduce a complex-valued recurrent neural network into the framework. Then, inspired by recent advancements in speech enhancement and separation, the reconstruction loss in the proposed model is an L1-based loss that penalizes both the complex and the magnitude spectrograms. To exemplify the use of the complex generative model, we choose speech resynthesis first and then enhancement as the specific applications in this paper. Experiments are conducted on the VCTK, TIMIT, and VoiceBank+DEMAND datasets. The results show that the proposed method resynthesizes complex spectrograms well and offers improvements on objective metrics of speech intelligibility and signal quality for enhancement.
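A minimal PyTorch sketch of the kind of reconstruction loss described above, with L1 penalties on both the complex spectrogram and its magnitude; the weighting factor alpha is an assumed hyperparameter, not a value from the paper.

import torch

def complex_l1_loss(est_spec, ref_spec, alpha=0.5):
    """L1 penalty on the complex spectrogram (real and imaginary parts) plus an
    L1 penalty on the magnitude spectrogram. Both inputs are complex tensors of
    shape (batch, freq, frames)."""
    complex_term = ((est_spec.real - ref_spec.real).abs().mean()
                    + (est_spec.imag - ref_spec.imag).abs().mean())
    magnitude_term = (est_spec.abs() - ref_spec.abs()).abs().mean()
    return alpha * complex_term + (1.0 - alpha) * magnitude_term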
ISBN (print): 9789819772438; 9789819772445
Knowledge tracking (KT) is the task of predicting the degree of students' knowledge mastery from their learning interaction records. Although existing works improve predictive capability with well-designed neural network models or hypothetical learning mechanisms, predictive performance is compromised in scenarios with limited interaction data. In this paper, we utilize a variational autoencoder (VAE) and a pre-trained network to generate question-answer sequence pairs related to the original interaction data, which improve the performance of the model when added to the training set, even in the case of data scarcity. Specifically, the steps of our proposed data augmentation method for KT are as follows: 1) Question sequence generation: generate latent question sequences, similar to the real interaction question sequences, from the pre-designed VAE model. 2) Answer sequence generation: feed the generated data into the pre-trained KT model to obtain reliable answer label sequences corresponding to the latent question sequences. 3) Sample generation and training: combine the two types of generated sequences as new samples for KT training. We apply the data augmentation method on four classic datasets and demonstrate its effectiveness by reaching state-of-the-art performance with an average AUC improvement of 2.41%. We also verify the method on randomly subsampled data: with only 20% of the data, it achieves results similar to those of other methods using 100% of the data.
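The three augmentation steps can be sketched as below; question_vae.decode, pretrained_kt.predict_answers, and the dataset layout are assumptions made for illustration rather than the authors' actual interfaces.

import torch

def augment_kt_dataset(question_vae, pretrained_kt, real_dataset,
                       n_synthetic=1000, latent_dim=32):
    """Generate synthetic (question sequence, answer sequence) pairs and merge
    them with the real interaction data."""
    # 1) Question sequence generation: decode latent samples into question sequences.
    z = torch.randn(n_synthetic, latent_dim)
    question_seqs = question_vae.decode(z)                    # (n_synthetic, seq_len)

    # 2) Answer sequence generation: label the sequences with the pre-trained KT model.
    with torch.no_grad():
        answer_probs = pretrained_kt.predict_answers(question_seqs)
    answer_seqs = (answer_probs > 0.5).long()                 # pseudo answer labels

    # 3) Sample generation and training: add the synthetic pairs to the training set.
    synthetic = list(zip(question_seqs, answer_seqs))
    return list(real_dataset) + synthetic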
ISBN (print): 9798350307627
Self-localization is a crucial task for robots, demanding high accuracy. In this work, we propose a new robot localization method based on the variational autoencoder (VAE). In our method, the robot uses the captured image to estimate its location in indoor environments. The use of the VAE makes the system adaptive to varying environmental conditions. Our findings demonstrate that utilizing both the robot's coordinates and images as training data significantly enhances the accuracy of robot self-localization and improves the robustness of the system to sensor data noise.
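A minimal sketch of an image-to-coordinate model of the kind described: a convolutional VAE encoder whose sampled latent code feeds a small regression head for the robot's planar coordinates. The architecture sizes and the two-dimensional pose output are assumptions, not the paper's network.

import torch
import torch.nn as nn

class VAELocalizer(nn.Module):
    """Encode the captured image with a VAE encoder and regress (x, y)
    coordinates from the sampled latent code."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu = nn.Linear(32, latent_dim)
        self.logvar = nn.Linear(32, latent_dim)
        self.pose_head = nn.Linear(latent_dim, 2)             # planar (x, y) estimate

    def forward(self, image):
        h = self.backbone(image)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.pose_head(z), mu, logvar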
ISBN (print): 9789819916474; 9789819916481
Anomaly detection in hyperspectral images is an important and challenging problem. Most available data sets are unlabeled, and very few are labeled. In this paper, we propose a lightweight variational autoencoder anomaly detector (VAE-AD) for hyperspectral data. The VAE is used to learn the background distribution of the image and is then used to construct a background representation for each pixel. The reconstruction error between the background-reconstructed image and the original image is then used for anomaly detection. A GMM-based post-processing step is used to construct the final detection map. A comparative analysis on five real-world hyperspectral data sets shows that the proposed model achieves better or comparable results with fewer learnable parameters and less time.
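The detection pipeline sketched below follows the abstract's description: per-pixel reconstruction error against the VAE's background reconstruction, followed by a two-component GMM over the error scores. vae_reconstruct is a hypothetical function standing in for the trained background model.

import numpy as np
from sklearn.mixture import GaussianMixture

def vae_ad_detection_map(vae_reconstruct, hsi_cube):
    """Compute an anomaly probability map for an (H, W, B) hyperspectral cube."""
    background = vae_reconstruct(hsi_cube)                   # background reconstruction, same shape
    err = np.linalg.norm(hsi_cube - background, axis=-1)     # per-pixel reconstruction error

    # GMM-based post-processing: the higher-mean component is treated as the anomaly class.
    gmm = GaussianMixture(n_components=2).fit(err.reshape(-1, 1))
    anomaly_component = int(np.argmax(gmm.means_.ravel()))
    probs = gmm.predict_proba(err.reshape(-1, 1))[:, anomaly_component]
    return probs.reshape(err.shape)                          # final detection map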
ISBN (print): 9781611978032
Learning precise representations of users and items to fit observed interaction data is the fundamental task of collaborative filtering. Existing studies usually infer entangled representations to fit such interaction data, neglecting to model the diverse matching relationships between users and items behind their interactions, which leads to limited performance and weak interpretability. To address this problem, we propose a Dual Disentangled Variational Autoencoder (DualVAE) for collaborative recommendation, which combines disentangled representation learning with variational inference to facilitate the generation of implicit interaction data. Specifically, we first implement the disentangling concept by unifying an attention-aware dual disentanglement and a disentangled variational autoencoder to infer the disentangled latent representations of users and items. Further, to encourage the correspondence and independence of the disentangled representations of users and items, we design a neighborhood-enhanced representation constraint with a customized contrastive mechanism to improve representation quality. Extensive experiments on three real-world benchmarks show that the proposed model significantly outperforms several recent state-of-the-art baselines. Further empirical results also illustrate the interpretability of the disentangled representations learned by DualVAE.
ISBN (print): 9781538646588
This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech. A standard approach to speech enhancement is to train a deep neural network (DNN) to take noisy speech as input and output clean speech. This supervised approach requires a very large amount of paired data for training and is not robust against unknown environments. Another approach is to use non-negative matrix factorization (NMF) based on basis spectra trained on clean speech in advance and basis spectra adapted to noise on the fly. This semi-supervised approach, however, causes considerable signal distortion in the enhanced speech due to the unrealistic assumption that speech spectrograms are linear combinations of the basis spectra. Replacing the poor linear generative model of clean speech in NMF with a VAE, a powerful nonlinear deep generative model trained on clean speech, we formulate a unified probabilistic generative model of noisy speech. Given noisy speech as observed data, we can sample clean speech from its posterior distribution. The proposed method outperformed the conventional DNN-based method in unseen noisy environments.
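In editorial notation (not the paper's), the kind of unified generative model described above can be sketched as follows, where x_{fn} is the observed complex spectrogram coefficient at frequency f and frame n, \sigma^2_\theta is the VAE decoder, g_n a frame-wise gain, and W, H the NMF noise factors; the exact parameterization is an assumption:

z_n \sim \mathcal{N}(\mathbf{0}, \mathbf{I}), \qquad
s_{fn} \mid z_n \sim \mathcal{N}_c\!\left(0,\; g_n\,[\sigma^2_\theta(z_n)]_f\right), \qquad
n_{fn} \sim \mathcal{N}_c\!\left(0,\; [\mathbf{W}\mathbf{H}]_{fn}\right), \qquad
x_{fn} = s_{fn} + n_{fn}.

Under this sketch, enhancement amounts to inferring the posterior over s_{fn} (and over z_n, g_n, W, H) given the observed x_{fn}, consistent with the abstract's statement that clean speech is sampled from its posterior distribution.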
ISBN (print): 9783030863623; 9783030863616
Real-world data are typically described using multiple modalities or multiple types of descriptors, which are treated as multiple views. The data from different modalities lie in different subspaces; therefore, the representations associated with similar semantics can differ. To solve this problem, many approaches have been proposed for learning a fusion representation from data across multiple views. Although effective, most existing models suffer from a lack of precision caused by gradient diffusion. We propose the Asymmetric Multimodal Variational Autoencoder (AMVAE) to reduce this effect. The proposed model has two key components: multiple autoencoders and a multimodal variational autoencoder. The multiple autoencoders are responsible for encoding view-specific data, while the multimodal variational autoencoder guides the generation of the fusion representation. The proposed model effectively alleviates the low-precision problem. Experimental results show that our method achieves state-of-the-art results on several benchmark datasets for both clustering and classification tasks.
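A rough sketch of the two components named in the abstract: per-view autoencoders for view-specific data and a multimodal VAE head that produces the fusion representation from the concatenated view codes. How the codes are combined, and all dimensions, are assumptions for illustration only.

import torch
import torch.nn as nn

class AMVAESketch(nn.Module):
    """View-specific autoencoders plus a multimodal VAE head that yields a
    shared fusion representation z."""
    def __init__(self, view_dims, code_dim=64, fusion_dim=32):
        super().__init__()
        self.view_encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, code_dim), nn.ReLU()) for d in view_dims)
        self.view_decoders = nn.ModuleList(
            nn.Linear(code_dim, d) for d in view_dims)
        joint_dim = code_dim * len(view_dims)
        self.mu = nn.Linear(joint_dim, fusion_dim)
        self.logvar = nn.Linear(joint_dim, fusion_dim)

    def forward(self, views):
        codes = [enc(v) for enc, v in zip(self.view_encoders, views)]
        recons = [dec(c) for dec, c in zip(self.view_decoders, codes)]   # view-specific reconstructions
        h = torch.cat(codes, dim=1)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()             # fusion representation
        return z, recons, mu, logvar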
ISBN (print): 9781510848764
Variational autoencoders (VAEs) have been shown to provide efficient neural-network-based approximate Bayesian inference for observation models for which exact inference is intractable. Their extension, the so-called Structured VAE (SVAE), allows inference in the presence of both discrete and continuous latent variables. Inspired by this extension, we developed a VAE with Hidden Markov Models (HMMs) as latent models. We applied the resulting HMM-VAE to the task of acoustic unit discovery in a zero-resource scenario. Starting from an initial model based on variational inference in an HMM with Gaussian Mixture Model (GMM) emission probabilities, the accuracy of acoustic unit discovery could be significantly improved by the HMM-VAE. In doing so, we were able to demonstrate for an unsupervised learning task what is well known in the supervised learning case: neural networks provide superior modeling power compared to GMMs.