AUTOVC is a voice-conversion method that performs self-reconstruction using an autoencoder structure for zero-shot voice conversion. AUTOVC has the advantage of being simple to train because it uses only the autoencoder loss. However, it performs voice conversion by disentangling speaker information from linguistic information through adjustment of the bottleneck dimension; this requires meticulous fine-tuning of the bottleneck dimension and involves a tradeoff between speech quality and speaker similarity. To address these issues, neural analysis and synthesis (NANSY), a fully self-supervised learning system that uses perturbations to extract speech features, was proposed. NANSY avoids the need to adjust the bottleneck dimension by utilizing perturbation and exhibits high reconstruction performance. In this study, we propose perturbation AUTOVC, a voice conversion method that utilizes the structure of AUTOVC and the perturbation of NANSY. The proposed method applies perturbations to speech signals (as in NANSY) to solve the problem of voice conversion methods that rely on the bottleneck dimension. Perturbation is applied to remove the speaker-dependent information present in the speech, leaving only the linguistic information, which is then passed through a content encoder and modeled as a content embedding containing only the linguistic information. To obtain speaker information, we used x-vectors, which are widely used in pretrained speaker recognition. The concatenated linguistic and speaker information extracted from the encoder, together with additional energy information, is used as input to the decoder to perform self-reconstruction. As with AUTOVC, training is simple because only the autoencoder loss is used. For the evaluation, we measured three objective evaluation metrics: character error rate (%), cosine similarity, and short-time objective intelligibility, as well as a subjective evaluation metric: mean opinio
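Two of the objective metrics named in the abstract above, character error rate and cosine similarity, can be sketched in a few lines. This is a minimal illustration under assumptions, not the authors' evaluation code: the function names are hypothetical, the character error rate is taken as Levenshtein distance divided by reference length, and the embeddings are plain lists standing in for speaker vectors such as x-vectors.

```python
import math


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert/delete/substitute cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER in percent: edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)
```

In such a setup, a lower character error rate on transcripts of the converted speech would indicate better preserved linguistic content, while a higher cosine similarity between the converted and target speaker embeddings would indicate closer speaker identity.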
With the proliferation of the Internet of Things, a large amount of multivariate time series (MTS) data is being produced daily by industrial systems, corresponding in many cases to life-critical tasks. Recent anomaly detection research focuses on using deep learning methods to construct a normal profile for MTS. However, without proper constraints, these methods cannot capture the dependencies and dynamics of MTS and thus fail to model the normal pattern, resulting in unsatisfactory performance. This paper proposes CAE-AD, a novel contrastive autoencoder for anomaly detection in MTS, which introduces multi-grained contrasting methods to extract the normal data pattern. First, to capture the temporal dependency of the series, a projection layer is employed and a novel contextual contrasting method is applied to learn a robust temporal representation. Second, the projected series is transformed into two different views by using time-domain and frequency-domain data augmentation. Last, an instance contrasting method is proposed to learn local invariant characteristics. The experimental results show that CAE-AD achieves an F1-score ranging from 0.9119 to 0.9376 on the three public datasets, outperforming the baseline methods. (c) 2022 Published by Elsevier Inc.
Rolling bearings are an important part of industrial equipment, and their safe and stable operation is an important guarantee for the performance of mechanical equipment. To address the difficulty of characterizing the running state of rolling bearings, this paper analyzes and processes the vibration signals of rolling bearings, extracts and fuses multi-information entropy, monitors the running state of rolling bearings, and predicts the remaining useful life (RUL), with test verification. Firstly, in view of the difficulty in characterizing the running state of bearings, a running state monitoring method for rolling bearings based on multi-information entropy fusion and a denoising autoencoder (DAE) is proposed. It extracts multi-entropy index features of vibration signals to improve the accuracy of feature extraction, and addresses both the weak information representation of a single feature indicator and the loss of information during feature screening. Secondly, in view of the low prediction accuracy and poor robustness and generalization of traditional RUL models, an RUL model for rolling bearings combining a convolutional autoencoder (CAE) and a bidirectional long short-term memory network (BiLSTM) is proposed. The convolution operation gives the CAE weight sharing, which reduces the complexity of the model. Finally, the XJTU-SY data set was used to verify the constructed model. The results show that the condition monitoring model established in this paper can accurately evaluate the running state of the rolling bearing and accurately locate the failure time. In addition, the RUL prediction model can predict the remaining life for most data sets with good accuracy and robustness.
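The abstract above does not list which entropy indices are fused, so the following is only a sketch of one plausible index: the Shannon entropy of a vibration signal's amplitude histogram. The function name, the equal-width binning, and the default bin count are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(signal, bins: int = 8) -> float:
    """Shannon entropy (in bits) of the amplitude histogram of a signal.

    Assumption: equal-width bins between the minimum and maximum amplitude.
    """
    if not signal:
        raise ValueError("signal must be non-empty")
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal carries no amplitude uncertainty
    width = (hi - lo) / bins
    # The maximum value would fall into bin index `bins`, so clamp it.
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in signal)
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Computed over sliding windows of a vibration record, an index like this yields one feature per window; several such indices could then be stacked into the feature vector that a denoising autoencoder fuses.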
The self-regulated recognition of human activities from time-series smartphone sensor data is a growing research area in smart and intelligent health care. Deep learning (DL) approaches have exhibited improvements over traditional machine learning (ML) models in various domains, including human activity recognition (HAR). Several issues are involved with traditional ML approaches; these include handcrafted feature extraction, which is a tedious and complex task involving expert domain knowledge, and the use of a separate dimensionality reduction module to overcome overfitting problems and hence provide model generalization. In this article, we propose a DL-based approach for activity recognition with smartphone sensor data, i.e., accelerometer and gyroscope data. Convolutional neural networks (CNNs), autoencoders (AEs), and long short-term memory (LSTM) networks possess complementary modeling capabilities: CNNs are good at automatic feature extraction, AEs are used for dimensionality reduction, and LSTMs are adept at temporal modeling. In this study, we take advantage of the complementarity of CNNs, AEs, and LSTMs by combining them into a unified architecture. We evaluate the proposed architecture, named "ConvAE-LSTM", on four standard public datasets (WISDM, UCI, PAMAP2, and OPPORTUNITY). The experimental results indicate that our approach is practical and improves smartphone-based HAR performance in terms of computational time, accuracy, F1-score, precision, and recall relative to existing state-of-the-art methods.
Due to the complicated production mechanism in multivariate industrial processes, different dynamic features of variables raise challenges to traditional data-driven process monitoring methods, which assume the process data is static or dynamically consistent. To tackle this issue, this paper proposes a novel process monitoring method based on the long short-term memory (LSTM) and autoencoder neural network (called LSTMED) for multivariate process monitoring with uneven dynamic features. First, the LSTM units are arranged in the encoder-decoder form to construct an end-to-end model. Then, the constructed model is trained in an unsupervised manner to capture long-term time dependency within variables and the dominant representation of high-dimensional process data. Afterward, the kernel density estimation (KDE) method is used to determine the control limit based only on the reconstruction error from historical normal data. Finally, effective online monitoring of processes with uneven dynamics can be achieved. The performance and advantages of the proposed process monitoring method are demonstrated through typical cases, including a numerical simulation and the Tennessee Eastman (TE) benchmark process, and through comparative experimental analysis with state-of-the-art methods. (c) 2022 Elsevier Ltd. All rights reserved.
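The KDE step described above, deriving a control limit from reconstruction errors on normal data, can be sketched as follows. This is an illustration under assumptions rather than the paper's procedure: it uses a Gaussian kernel, a normal-reference rule-of-thumb bandwidth (1.06 · σ · n^(-1/5)), a 0.99 confidence level, and hypothetical function names; the abstract specifies none of these.

```python
import math
import statistics


def normal_reference_bandwidth(samples) -> float:
    """Rule-of-thumb bandwidth 1.06 * sigma * n^(-1/5) (an assumed choice)."""
    sigma = statistics.stdev(samples)
    if sigma == 0:
        raise ValueError("samples must not all be identical")
    return 1.06 * sigma * len(samples) ** (-0.2)


def kde_cdf(x: float, samples, h: float) -> float:
    """CDF of a Gaussian KDE: the average of per-sample normal CDFs."""
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
               for s in samples) / len(samples)


def control_limit(errors, confidence: float = 0.99) -> float:
    """Smallest error value whose estimated CDF reaches `confidence`."""
    h = normal_reference_bandwidth(errors)
    lo, hi = min(errors) - 10 * h, max(errors) + 10 * h
    for _ in range(100):  # bisection on the monotone KDE CDF
        mid = (lo + hi) / 2.0
        if kde_cdf(mid, errors, h) < confidence:
            lo = mid
        else:
            hi = mid
    return hi
```

During online monitoring, a new sample whose reconstruction error exceeds the returned limit would be flagged as a potential fault.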
In this paper, we propose an unsupervised feature-learning framework based on an autoencoder that learns sparse feature representations for EEG-based person identification. The autoencoder and a CNN perform signal reconstruction and recognition for the person identification task. Electroencephalography (EEG) based biometric systems enable humans to recognize, identify, and communicate with the outer world using brain signals. EEG-based biometrics offer promising solutions because of their high security and portable instruments. Motor imagery EEG (MI-EEG) is one of the most widely studied EEG signals; it reveals a subject's motion intentions without real actions. The proposed framework proved to be a practical approach to managing the massive volume of EEG data and identifying a person based on different tasks and resting states. The experiments were conducted on a standard publicly available motor imagery EEG dataset with 109 subjects. The highest recognition rates, 87.60% for task-based identification and 99.89% for resting-state identification, were recorded using the autoencoder-CNN model. The outcomes imply that the overall performance of the proposed framework is similar or superior to that of the state-of-the-art method.
Imbalanced data classification is a challenging issue in many domains, including intelligent medical diagnosis and fraudulent transaction analysis. The performance of a conventional classifier degrades when the class distribution of the training data set is imbalanced. Recently, machine learning and deep learning techniques have been used for imbalanced data classification. Data preprocessing approaches are also suitable for handling the class imbalance problem. Data augmentation is one of the preprocessing techniques used to handle skewed class distributions. The Synthetic Minority Oversampling Technique (SMOTE) is a promising class balancing approach, but it generates noise while creating synthetic samples. In this paper, an autoencoder is used as a noise reduction technique to reduce the noise generated by SMOTE. Further, a deep one-dimensional convolutional neural network is used for classification. The performance of the proposed method is evaluated and compared with existing approaches using different metrics such as precision, recall, accuracy, area under the curve, and geometric mean (G-mean). Ten data sets with imbalance ratios ranging from 1.17 to 577.87 and sizes ranging from 303 to 284807 instances are used in the experiments. The imbalanced data sets used are Heart-Disease, Mammography, Pima Indian diabetes, Adult, Oil-Spill, Phoneme, Creditcard, BankNoteAuthentication, Balance scale weight & distance database, and Yeast. The proposed method achieves accuracies of 96.1%, 96.5%, 87.7%, 87.3%, 95%, 92.4%, 98.4%, 86.1%, 94%, and 95.9% on these data sets, respectively. The results suggest that this method outperforms other deep learning methods and machine learning methods with respect to G-mean and other performance metrics.
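The SMOTE interpolation step and the G-mean metric mentioned above can be sketched as follows. This is a simplified illustration, not the paper's pipeline: the function names, the neighbour count `k`, and the fixed random seed are assumptions, and the autoencoder denoising and the CNN classifier are omitted.

```python
import math
import random


def smote(minority, n_new: int, k: int = 2, seed: int = 0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nn = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nn
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nn)])
    return synthetic


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity (recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

Because every synthetic point lies on a segment between two existing minority samples, some points can land in regions that overlap the majority class; this is one way to understand the noise that the abstract says the autoencoder is meant to reduce.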
Student engagement is an important factor in meeting the goals of virtual learning programs. Automatic measurement of student engagement provides helpful information for instructors to meet learning program objectives and individualize program delivery. Many existing approaches solve video-based engagement measurement using the traditional frameworks of binary classification (classifying video snippets into engaged or disengaged classes), multi-class classification (classifying video snippets into multiple classes corresponding to different levels of engagement), or regression (estimating a continuous value corresponding to the level of engagement). However, we observe that while the engagement behavior is mostly well defined (e.g., focused, not distracted), disengagement can be expressed in various ways. In addition, in some cases, the data for disengaged classes may not be sufficient to train generalizable binary or multi-class classifiers. To handle this situation, in this paper, for the first time, we formulate detecting disengagement in virtual learning as an anomaly detection problem. We design various autoencoders, including temporal convolutional network autoencoder, long short-term memory autoencoder, and feedforward autoencoder using different behavioral and affect features for video-based student disengagement detection. The result of our experiments on two publicly available student engagement datasets, DAiSEE and EmotiW, shows the superiority of the proposed approach for disengagement detection as an anomaly compared to binary classifiers for classifying videos into engaged versus disengaged classes (with an average improvement of 9% on the area under the curve of the receiver operating characteristic curve and 22% on the area under the curve of the precision-recall curve).
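Treating disengagement as an anomaly, as described above, amounts to scoring each sample by its reconstruction error and evaluating those scores with the area under the ROC curve. The sketch below assumes a mean-squared reconstruction error and computes the AUC through pairwise comparisons; the function names and the choice of error measure are illustrative assumptions, and the autoencoder that produces the reconstructions is not shown.

```python
def reconstruction_error(original, reconstructed) -> float:
    """Mean squared error between an input vector and its reconstruction."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((o - r) ** 2
               for o, r in zip(original, reconstructed)) / len(original)


def roc_auc(scores, labels) -> float:
    """AUC-ROC as the probability that a random anomalous sample (label 1)
    scores higher than a random normal sample (label 0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An autoencoder trained only on engaged samples would be expected to reconstruct them well, so larger errors on held-out data would point toward disengagement.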
Asphalt concrete (AC) balanced mix design (BMD) is based on the selection of aggregate gradation, component volumetrics, and binder content to control pavement cracking and rutting potential. The Illinois Flexibility Index Test (I-FIT) and the Hamburg Wheel Tracking Test (HWTT) results, used to predict cracking and rutting potential, respectively, are used in the BMD approach. However, BMD generally relies on a trial-and-error process to identify the aggregate gradation and binder content needed to meet volumetrics and optimize I-FIT and HWTT results. Minimizing or eliminating the trial-and-error process would increase productivity and accuracy. Therefore, this study proposes an autoencoder deep neural network (ADNN) to develop optimized AC mix design alternatives that can meet a prescribed flexibility index (FI) and rut depth (RD). Autoencoders are a type of neural network designed for representation learning, composed of an encoder and a decoder. The encoder detects a structured pattern in the original input data to create a compressed representation of the AC mix design. The decoder reconstructs the compressed representation. The proposed autoencoder is composed of an encoder of five hidden layers, a latent space of one node, and a five-hidden-layer decoder. Models were created from a database of 5,357 data sets that include mix properties, I-FIT FI, and HWTT RD (after data preprocessing was conducted). An autoencoder was then trained to predict the total binder content and aggregate gradation based on a target mix type, FI, and RD.
Intelligent Transportation Systems (ITS), especially Autonomous Vehicles (AVs), are vulnerable to security and safety issues that threaten people's lives. Unlike manual vehicles, the communication and computing components of AVs can be compromised using advanced hacking techniques, which hinders the effective use of AVs in daily life. Once manual vehicles are connected to the Internet, forming the Internet of Vehicles (IoV), they can be exploited by cyber-attacks such as denial of service, sniffing, distributed denial of service, spoofing, and replay attacks. In this article, we present a deep learning-based Intrusion Detection System (IDS) for ITS, in particular to discover suspicious network activity in In-Vehicle Networks (IVN), vehicle-to-vehicle (V2V) communications, and vehicle-to-infrastructure (V2I) networks. A deep learning architecture based on a Long Short-Term Memory (LSTM) autoencoder is designed to recognize intrusive events from the central network gateways of AVs. The proposed IDS is evaluated using two benchmark datasets, i.e., the car hacking dataset for in-vehicle communications and the UNSW-NB15 dataset for external network communications. The experimental results demonstrate that the proposed system achieves over 99% accuracy in detecting all types of attacks on the car hacking dataset and 98% accuracy on the UNSW-NB15 dataset, outperforming eight other intrusion detection techniques.