Background: Predicting the response of cancer cell lines to specific drugs is an essential problem in personalized medicine. Since drug response is closely associated with genomic information in cancer cells, large panels of several hundred human cancer cell lines have been assembled with genomic and pharmacogenomic data. Although several methods have been developed to predict drug response, achieving accurate predictions remains challenging. This study proposes a novel feature selection-based method, named Auto-HMM-LMF, to predict cell line-drug associations accurately. Because the feature space for drug response prediction is very high-dimensional, Auto-HMM-LMF focuses on feature selection, identifying a subset of inputs that contribute significantly. Results: This research introduces a novel method for feature selection of mutation data based on signature assignments and hidden Markov models. Autoencoder models are used for feature selection of gene expression and copy number variation data. After feature selection, a logistic matrix factorization model is applied to predict drug response values. In a comparison with the ensemble feature selection (EFS) method, one of the most powerful feature selection methods, the predictive model built on the features selected here performed much better for drug response prediction. Two datasets, the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE), are used to show the efficiency of the proposed method on unseen patient cell lines. Evaluation showed that Auto-HMM-LMF can improve on the accuracy of state-of-the-art algorithms and can find useful features for the logistic matrix factorization method. Conclusions: We presented an application of Auto-HMM-LMF to exploring new candidate drugs for head and neck cancer, which showed that the proposed method is use
Closed-circuit television (CCTV) is being widely adopted in water pipeline inspection. During the office-based survey, the inspector must spend a long time watching the recorded video and can easily become fatigued. An automated process can reduce the inspector's workload and ensure consistent survey quality. However, a fully automated survey of varied structural discontinuities remains a challenge. This study aims first to identify the anomaly frames of the CCTV video, which contain the major anomalies captured from the internal surface of the pipe, so that the inspector can focus on these frames. In this paper, an anomaly frame detection framework based on a steerable pyramid autoencoder (SPAE) is proposed. The SPAE generates discriminative representations that are used for prediction. Both parameter optimization and comparative studies for the proposed SPAE were carried out. The experimental results demonstrate that the SPAE algorithm achieves 0.984 accuracy and 0.984 F1-score, outperforming the other state-of-the-art methods selected for comparison. The proposed framework can therefore significantly improve the accuracy and efficiency of anomaly frame detection, which will greatly facilitate pipeline condition assessment through CCTV inspection.
ISBN (print): 9783030369873; 9783030369866
Modern industry has widely adopted intelligent condition monitoring systems to improve industrial organization. As a result, data-driven fault diagnosis methods are designed by integrating signal processing techniques with artificial intelligence methods. Various signal processing approaches have been proposed for extracting features from vibration signals to construct the fault feature space, and over the years this feature space has grown rapidly. The challenge is to identify the promising features in that space in order to improve diagnosis performance. In this paper, wavelet energy is presented as the input feature set of the fault diagnosis system. Wavelet energy is used to represent multiple faults, which reduces the number of features required and simplifies the otherwise complex task of feature extraction. Further, a convolutional autoencoder helps to find more distinguishing fault features from the wavelet energy, improving diagnosis with an extreme learning machine (ELM). The proposed method was tested on two vibration datasets, and decent results were achieved. The effect of the autoencoder on fault diagnosis performance was observed in comparison with principal component analysis (PCA). An effect was also seen in the size of the ELM architecture.
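The wavelet-energy feature set described above can be illustrated with a minimal sketch. The abstract does not name the wavelet family, the decomposition depth, or the normalisation, so the Haar wavelet, the `levels` parameter, and the function names `haar_step` and `wavelet_energy` are assumptions made only for illustration.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_energy(signal, levels):
    """Relative energy per sub-band: [detail_1, ..., detail_L, approx_L].

    Assumption: Haar wavelet and energies normalised to sum to 1;
    the paper may use a different wavelet or normalisation.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies
```

A signal of length N is summarised by only `levels + 1` numbers, which is the sense in which wavelet energy keeps the feature count small before any autoencoder is applied.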
ISBN (print): 9781728150895
Network intrusions now have a greater impact and occur on a larger scale. Intrusion Detection Systems (IDS) have recently become a research hotspot in both industry and academia. However, due to the dynamic characteristics of network traffic, it is challenging to extract significant features and identify traffic types. This paper focuses on applying deep learning methods to feature extraction. Specifically, an IDS model based on an autoencoder and long short-term memory (LSTM) cells is proposed. The overall architecture of the intrusion detection model includes a feature extractor, a classifier, and an evaluation block. Different structures of the feature extraction model are discussed and studied. Experiments conducted on the UNSW-NB15 dataset produce satisfactory results. Selected metrics such as accuracy and false alarm rate are adopted to evaluate detection performance. Simulation results indicate that the model works better than competing machine learning methods and achieves an accuracy of over 92%.
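The two evaluation metrics named above can be computed from a binary confusion matrix. This is a minimal sketch under one assumption: the false alarm rate is taken as FP / (FP + TN), a common definition, since the abstract does not state the formula used. The function name `detection_metrics` is hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (accuracy, false_alarm_rate) for binary labels (1 = attack, 0 = normal).

    Assumption: false alarm rate = FP / (FP + TN).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_alarm_rate
```

Reporting both numbers matters for intrusion detection because a model can reach high accuracy on imbalanced traffic while still raising many false alarms on normal flows.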
Learning temporal patterns in time series remains a challenging task. For anomaly detection in time series in particular, it is essential to learn the underlying structure of a system's normal behavior. Periodic or quasiperiodic signals with complex temporal patterns make the problem even more challenging: anomalies may be hard-to-detect deviations from the normal recurring pattern. In this paper, we present TCN-AE, a temporal convolutional network autoencoder based on dilated convolutions. Contrary to many other anomaly detection algorithms, TCN-AE is trained in an unsupervised manner. The algorithm demonstrates its efficacy on a comprehensive real-world anomaly benchmark comprising electrocardiogram (ECG) recordings of patients with cardiac arrhythmia. TCN-AE significantly outperforms several other unsupervised state-of-the-art anomaly detection algorithms. Moreover, we investigate the contribution of the individual enhancements and show that each new ingredient improves the overall performance on the investigated benchmark. (C) 2021 Elsevier B.V. All rights reserved.
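The dilated convolutions on which TCN-AE is built can be sketched in a few lines. This is a generic illustration of the operation, not the TCN-AE architecture: the unpadded ("valid") output, the single-channel input, and the names `dilated_conv1d` and `receptive_field` are assumptions.

```python
def dilated_conv1d(x, kernel, dilation):
    """Single-channel dilated 1D convolution without padding.

    Output element t combines x[t], x[t + dilation], x[t + 2 * dilation], ...
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * x[t + j * dilation] for j in range(len(kernel)))
        for t in range(len(x) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with one dilated convolution per layer."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With a kernel size of 2 and dilations 1, 2, 4, 8, four layers already cover 16 time steps, which is why dilation suits long quasiperiodic signals such as ECG recordings.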
Recurrent neural network (RNN) based autoencoders, trained in an unsupervised manner, have been widely used to generate fixed-dimensional vector representations, or embeddings, for multivariate time series of varying length. These embeddings have been shown to be useful for time series reconstruction, classification, and the creation of health index (HI) curves for machines in industrial applications, from which the remaining useful life (RUL) of the machines can be estimated. In this study, we extend the traditional form of RNN autoencoders as feature extractors for multivariate time series to a more general form, in terms of both the ordering of the input or output sequences and the hidden unit architecture. We apply the embeddings obtained by different variants of RNN autoencoders to a time series classification task and a machine RUL estimation problem using two publicly available datasets. A random search strategy is used to find the optimal hyperparameters of all variants for each task, in order to give a fair comparison of the general performance of the variants over a large hyperparameter space, as well as of the best performance each variant can achieve compared with the best values reported in the literature. Our results show that the traditional practice of reversing the order of the output time series while maintaining the order of the input time series when training an RNN autoencoder does not improve performance in the two studied cases. Thus, intentionally arranging the input or output order seems unnecessary when training an RNN autoencoder as a feature extractor for time series. We further observe that only RNN architectures with a gating mechanism can encode the time series, and that none of the three common gated architectures we studied shows significantly and consistently better performance than the others in the two studied cases. However, the bidirectional RNN autoencoders yield slightly better perform
Recently, many studies have exploited the potential of deep learning to forecast energy demand, but they cannot explain the results. They either analyze simple correlations between the input and output to discover the most important input features, or depend on manual investigation of the latent space in which power demand patterns are embedded. In this paper, to overcome these shortcomings, we propose a deep autoencoder that can explain the prediction results by manipulating the latent space. It consists of 1) a power encoder that embeds power information, 2) an auxiliary encoder that embeds auxiliary information for an interpretable latent space in two dimensions, 3) a predictor that predicts power demand by using concatenated values of the latent variables extracted from the two encoders, and 4) an explainer that provides the most important input features in predicting future demand by utilizing the interpretable latent variables. Several experiments on a dataset of household electric energy demand show that the proposed model not only performs better than conventional models, with a mean squared error of 0.376 in predicting electricity demand for 60 min, but can also explain the results by analyzing the correlations among the inputs, the latent variables, and the predicted energy demand.
Background: Single-cell RNA-sequencing (scRNA-seq) is a rapidly evolving technology that enables measurement of gene expression levels at an unprecedented *** the explosive growth in the number of cells that can be assayed by a single experiment, scRNA-seq still has several limitations, including high rates of dropouts, which result in a large number of genes having zero read count in the scRNA-seq data, and complicate downstream ***: To overcome this problem, we treat zeros as missing values and develop nonparametric deep learning methods for ***, our LATE (Learning with autoencoder) method trains an autoencoder with random initial values of the parameters, whereas our TRANSLATE (TRANSfer learning with LATE) method further allows for the use of a reference gene expression data set to provide LATE with an initial set of parameter ***: On both simulated and real data, LATE and TRANSLATE outperform existing scRNA-seq imputation methods, achieving lower mean squared error in most cases, recovering nonlinear gene-gene relationships, and better separating cell *** are also highly scalable and can efficiently process over 1 million cells in just a few hours on a ***: We demonstrate that our nonparametric approach to imputation based on autoencoders is powerful and highly efficient.
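Treating zeros as missing values, as described above, is commonly realised by excluding zero entries from the reconstruction loss. The sketch below shows one such masked mean squared error; the exact loss used by LATE is not given in the abstract, so this formulation and the name `masked_mse` are assumptions.

```python
def masked_mse(observed, reconstructed):
    """Mean squared error over entries where the observed count is non-zero.

    Assumption: zeros in `observed` are treated as missing and do not
    contribute to the loss, so the model is free to impute them.
    """
    squared_error = 0.0
    count = 0
    for row_obs, row_rec in zip(observed, reconstructed):
        for obs, rec in zip(row_obs, row_rec):
            if obs != 0:
                squared_error += (obs - rec) ** 2
                count += 1
    return squared_error / count if count else 0.0
```

Because zero entries never enter the loss, whatever the autoencoder outputs at those positions is not penalised and can serve as the imputed value.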
TubeNet has the simplest possible tubular configuration, with a uniform number of neurons in all layers, and enables explicit inversion. To create a TubeNet for inverse problems, dimension reduction is a prerequisite, so that the numbers of neurons in the input and output layers can be made the same. This study introduces a novel procedure for constructing inverse TubeNets. The proposed procedure has three major sequential steps. (1) An autoencoder is used to extract the necessary number of features from a large number of features in a high-dimensional space; (2) constraints guided by the concepts of principal component analysis (PCA) are imposed on the autoencoder, so that the extracted features possess the important property of orthogonality; (3) an L2 regularizer is proposed to adequately impose these constraints on the off-diagonal entries in the weight matrix of the autoencoder, ensuring quality orthogonality. Benchmark problems of inverse identification of the material constants of composite laminated plates are used to evaluate the effect of the present TubeNet procedure with the constrained autoencoder, the standard autoencoder, and PCA, each implemented in TubeNet. The study shows that the constrained autoencoder can effectively overcome the shortcomings of PCA and the standard autoencoder, and offers an effective means of dimension reduction for inverse TubeNets.
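Step (3) above can be illustrated with a small orthogonality penalty. The abstract does not specify which matrix the off-diagonal constraint is applied to; the sketch assumes the penalty is the sum of squared off-diagonal entries of the Gram matrix W W^T, where each row of W is the weight vector of one extracted feature. The name `offdiag_l2_penalty` is hypothetical.

```python
def offdiag_l2_penalty(weights):
    """Sum of squared off-diagonal entries of W W^T.

    Assumption: rows of `weights` are per-feature weight vectors; the
    penalty is zero exactly when all rows are mutually orthogonal.
    """
    n = len(weights)
    penalty = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                dot = sum(a * b for a, b in zip(weights[i], weights[j]))
                penalty += dot * dot
    return penalty
```

In training, a term like this would be added to the reconstruction loss with a weighting coefficient, pushing the learned features toward the orthogonality that PCA provides by construction.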
The rapid growth of network-related services in the last decade has produced a huge amount of sensitive data on the internet. However, networks are highly prone to intrusions, in which unauthorized users attempt to access sensitive information or even disrupt the system. Building a competent network intrusion detection system (IDS) is necessary to prevent such attacks. IDSs generally use machine learning algorithms to classify attacks, but the features used for classification are not always suitable or sufficient. Moreover, intrusions are far fewer than non-intrusions, so naive approaches may fail to provide acceptable performance because of this class imbalance. To counter this problem, we propose a model that extracts useful features from the given features and then uses a deep learning algorithm to classify the intrusions. Note that the underlying data points cannot be regarded as sampled from a single distribution, but rather from two different distributions: one generic to all network intrusions, and the other specific to the domain. With this in mind, we propose a unique Generic-Specific autoencoder architecture in which the generic autoencoder learns the features that are common across all forms of network intrusion, and the specific ones learn features that pertain only to their domain. The model has been evaluated on the CICIDS2017 dataset, which is the largest dataset of this type available online, and we have set new benchmark results on this dataset.