Background: Single-cell sequencing is a technology for high-throughput sequencing analysis of the genome, transcriptome, and epigenome at the single-cell level. It overcomes shortcomings of traditional methods, reveals the gene structure and gene expression state of a single cell, and reflects the heterogeneity between cells. Clustering analysis of single-cell RNA data is a very important step, but it faces two difficulties: dropout events and the curse of dimensionality. At present, many methods are driven only by data and do not make full use of existing biological information. Results: In this work, we propose scSSA, a clustering model based on a semi-supervised autoencoder, fast independent component analysis (FastICA), and Gaussian mixture clustering. First, the semi-supervised autoencoder imputes and denoises the scRNA-seq data and produces a low-dimensional latent representation. Second, the dimension of this representation is further reduced by FastICA, and the result is clustered with a Gaussian mixture model. Finally, scSSA is compared with Seurat, CIDR, and other methods on 10 public scRNA-seq datasets. Conclusion: The results show that scSSA has superior performance in cell clustering on 10 public datasets. In conclusion, scSSA can accurately identify cell types and is generally applicable to all kinds of single-cell datasets. scSSA has great application potential in the field of scRNA-seq data analysis.
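The last two stages of the pipeline described above (FastICA for dimension reduction, then Gaussian mixture clustering) can be sketched roughly as follows. This is only an illustration using scikit-learn, not the scSSA implementation: the function name `cluster_latent`, the synthetic `latent` matrix standing in for the autoencoder output, and all sizes are assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.mixture import GaussianMixture


def cluster_latent(latent, n_components, n_clusters, seed=0):
    """Reduce a latent matrix with FastICA, then cluster it with a Gaussian mixture.

    `latent` is assumed to be a (cells x latent_dim) array, e.g. the output
    of an autoencoder's encoder. All argument names here are hypothetical.
    """
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    reduced = ica.fit_transform(latent)  # shape: (cells, n_components)
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    return gmm.fit_predict(reduced)  # one cluster label per cell


# Synthetic stand-in for a latent representation: 3 groups of 50 "cells".
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 16))
latent = np.vstack([c + rng.normal(size=(50, 16)) for c in centers])

labels = cluster_latent(latent, n_components=5, n_clusters=3)
```

In practice the number of independent components and the number of mixture components would have to be chosen per dataset; the values above are arbitrary.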
Chemical warfare agents pose a serious threat due to their extreme toxicity, necessitating swift identification of chemical gases and individual responses to the identified threats. Fourier transform infrared (FTIR) spectroscopy offers a method for remote material analysis, particularly for detecting colorless and odorless chemical agents. In this paper, we propose a deep neural network utilizing a semi-supervised autoencoder (SSAE) for the classification of chemical gases based on FTIR spectra. In contrast to traditional methods, the SSAE concurrently trains an autoencoder and a classifier attached to a latent vector of the autoencoder, enhancing feature extraction for classification. The SSAE was evaluated on laboratory-collected FTIR spectra and demonstrated superior classification performance compared to existing methods. The efficacy of the SSAE lies in its ability to generate denser cluster distributions in latent vectors, thereby enhancing gas classification. This study established a consistent experimental environment for hyperparameter optimization, offering valuable insights into the influence of latent vectors on classification performance.
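One common way to train an autoencoder and a latent-vector classifier concurrently is to minimize a weighted sum of a reconstruction loss and a classification loss. The NumPy sketch below computes such a joint loss for a single-layer encoder; it is an assumption-laden illustration, not the paper's network. The function `ssae_loss`, the weight `alpha`, the `tanh` encoder, and the array shapes are all hypothetical.

```python
import numpy as np


def ssae_loss(x, y, params, alpha=1.0):
    """Joint loss: reconstruction MSE plus alpha times classification cross-entropy.

    x: (batch, features) inputs, y: (batch,) integer class labels.
    params: (W_enc, W_dec, W_cls) weight matrices (hypothetical layout).
    """
    W_enc, W_dec, W_cls = params
    z = np.tanh(x @ W_enc)  # latent vector shared by decoder and classifier
    x_hat = z @ W_dec  # reconstruction of the input
    logits = z @ W_cls  # classifier attached to the latent vector
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    recon = np.mean((x - x_hat) ** 2)
    ce = -np.mean(log_probs[np.arange(len(y)), y])
    return recon + alpha * ce, recon, ce


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 20))  # 8 made-up spectra with 20 channels each
y = rng.integers(0, 3, size=8)  # 3 made-up gas classes
params = (
    rng.normal(scale=0.1, size=(20, 4)),  # encoder: 20 -> 4
    rng.normal(scale=0.1, size=(4, 20)),  # decoder: 4 -> 20
    rng.normal(scale=0.1, size=(4, 3)),  # classifier: 4 -> 3
)
total, recon, ce = ssae_loss(x, y, params, alpha=0.5)
```

Because both terms depend on the same latent vector `z`, gradients from the classifier shape the encoder alongside the reconstruction objective, which is the general idea behind joint training.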
ISBN: (print) 9781509000760
Recent years have witnessed the significant success of representation learning and deep learning in various prediction and recognition applications. Most previous studies adopt a two-phase procedure: a first step of representation learning followed by a second step of supervised learning. In this process, the initial model weights, which inherit good properties from the representation learning of the first step, are changed in the second step to fit the training data. In other words, the second step learns better classification models at the cost of possibly degrading the effectiveness of representation learning. Motivated by this observation, we propose a joint framework of representation and supervised learning. It aims to learn a model that not only preserves the "semantics" of the original data through representation learning but also fits the training data well through supervised learning. Along this line, we develop a semi-supervised autoencoder model in the spirit of the joint learning framework. Experiments on various classification data sets show the significant effectiveness of the proposed model.
This paper proposes a semi-supervised autoencoder with an auxiliary task (SAAT) to extract a health feature space for power transformer fault diagnosis using dissolved gas analysis (DGA). The health feature space generated by a semi-supervised autoencoder (SSAE) not only identifies normal and thermal/electrical fault types, but also presents the underlying characteristics of DGA. In the proposed approach, by adding an auxiliary task that detects normal and fault states to the loss function of the SSAE, the health feature space additionally enables visualization of health degradation properties. The overall procedure of the new approach includes three key steps: 1) preprocessing DGA data, 2) extracting two health features via SAAT, and 3) visualizing the two health features in two-dimensional space. In this paper, we test the proposed approach using massive unlabeled/labeled Korea Electric Power Corporation (KEPCO) databases and IEC TC 10 databases. To demonstrate the effectiveness of the proposed approach, four comparative studies are conducted with these datasets; the studies examine: 1) the effectiveness of an auxiliary detection task, 2) the effectiveness of the visualization method, 3) conventional fault diagnosis methods, and 4) state-of-the-art semi-supervised deep learning algorithms. By examining several evaluation metrics, these comparative studies confirm that the proposed approach outperforms SSAE without the auxiliary task, existing methods, and state-of-the-art deep learning algorithms in terms of defining health degradation performance. We expect that the proposed SAAT-based health feature space approach will be widely applicable to intuitively monitor the health state of power transformers in the real world.
For real-world industrial system modeling, dynamic stochastic errors inevitably exist in data-driven deterministic predictions (i.e., point predictions). The uncertainty of such prediction results directly affects various prediction-based operations for work condition identification and production decision-making. Therefore, a novel interval prediction method that quantifies multi-output uncertainty is proposed by combining conformal prediction with random vector functional link networks (RVFLNs), which have fast learning speed and high accuracy. The proposed algorithm is used for the reliable prediction of molten iron quality in the blast furnace ironmaking process. First, to address the limited capability of shallow learning models to describe complex nonlinear relationships, a dynamic attention mechanism and a semi-supervised autoencoder are used to reveal and represent the correlations between different input variables and the multi-output variables. Subsequently, Elastic Net regularization is adopted to mitigate the multicollinearity and overfitting problems of traditional RVFLNs. Further, considering the deterioration of prediction accuracy and credibility caused by uncertain system dynamics, an empirical-copula-based uncertainty quantification method is introduced to realize reliable prediction of the multi-output variables at a given confidence level. Finally, actual blast furnace industrial data are used to demonstrate the validity, utility, and sophistication of the model.
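To make the idea of turning point predictions into intervals at a given confidence level concrete, here is a minimal sketch of plain split conformal prediction for a single output. It is deliberately simpler than the copula-based multi-output method the abstract describes and is not that method; the function name `split_conformal_interval` and the toy numbers are assumptions.

```python
import math

import numpy as np


def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Symmetric prediction intervals with target coverage 1 - alpha.

    Uses absolute residuals on a held-out calibration set as conformity
    scores (single-output split conformal; a simplified stand-in).
    """
    scores = np.sort(np.abs(np.asarray(cal_true) - np.asarray(cal_pred)))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the calibrated quantile
    q = scores[k - 1] if k <= n else np.inf  # too few points: infinite width
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q


# Toy calibration set: the model always predicts 0, true values are 1..9.
cal_pred = np.zeros(9)
cal_true = np.arange(1.0, 10.0)
lower, upper = split_conformal_interval(cal_pred, cal_true, [10.0], alpha=0.25)
```

With 9 calibration scores and `alpha=0.25`, the rank is `ceil(10 * 0.75) = 8`, so the half-width is the 8th smallest residual. Extending this to several correlated outputs is where a copula-based construction such as the one in the abstract would come in.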
Semi-supervised learning has shown its potential in many real-world applications where only a few labeled examples are available. However, when fairness constraints need to be satisfied, semi-supervised classification models often struggle, as they must cope with the lack of sufficient information for predicting the target variable while forgetting its relationships with any sensitive and potentially discriminatory attribute. To address this issue, we propose a fair semi-supervised representation learning architecture that leads to fair and accurate classification results even in very challenging scenarios with few labeled (but biased) instances. We show experimentally that our model can be easily adopted in very general settings, as the learned representations may be employed to train any supervised classifier. Moreover, when applied to several synthetic and real-world datasets, our method is competitive with state-of-the-art fair semi-supervised approaches.
Traditional autoencoders and their extensions have been widely applied to feature extraction in data-driven fault diagnosis. However, because traditional autoencoders cannot make use of label information, the representations they extract may give disappointing results on the final discriminative task. In this paper, we propose a novel semi-supervised autoencoder, named the Discriminant autoencoder. Training of the proposed Discriminant autoencoder includes a supervised process and an unsupervised process. A distance penalty is added to the loss function, which enables the proposed Discriminant autoencoder to extract more suitable representations from industrial data samples. To explain the effectiveness of this semi-supervised autoencoder, we carry out experiments and give a mathematical derivation. We use an industrial batch process dataset as the benchmark to test the performance of the proposed Discriminant autoencoder and other conventional autoencoders.
ISBN: (print) 9781450369763
Node embedding, which uses low-dimensional non-linear feature vectors to represent nodes in a network, has shown great promise, not only because it is easy to use for downstream tasks, but also because it has achieved great success on many network analysis tasks. One challenge has been how to develop a node embedding method that integrates topological information from multiple networks. To address this critical problem, we propose a novel node embedding method, called DeepMNE, for multi-network integration using a deep semi-supervised autoencoder. The key point of DeepMNE is that it captures complex topological structures of multiple networks and uses the correlation among multiple networks as constraints. We evaluate DeepMNE on node classification and link prediction tasks using four real-world datasets. The experimental results demonstrate that DeepMNE shows superior performance over seven state-of-the-art single-network and multi-network embedding algorithms.