Background and Objective: The motivation behind cancer subtyping is to identify subgroups of cancer patients with distinguishable phenotypes of clinical importance. It can assist in the advancement of subtype-targeted treatments. Subtype identification is a complicated task and therefore requires multi-omics data integration to identify precise patient subgroups. Over the years, several computational attempts have been made to identify cancer subtypes accurately using integrative multi-omics analysis. Some studies have used autoencoders (AE) to capture multi-omics feature integration in lower dimensions for identifying subtypes in specific types of cancer. However, deep AE architectures that learn a highly informative latent space and attain satisfactory generalized performance are still needed. Therefore, in this study, a novel AE-assisted cancer subtyping framework is presented that utilizes the compressed latent space of a Sparse AE neural network for multi-omics ***: The proposed framework first performs a supervised feature selection based on the survival status of the patients. The selected features from each of the omic data are passed to the AE. The information embedded in the latent space of the trained AE neural networks is then used for cancer subtyping using Spectral clustering. The AE architecture designed in this study exhaustively searches for the best compression of multi-omics data by varying the number of neurons in the hidden layers and penalizing activations within the *** and Conclusion: The proposed framework is applied to five different multi-omics cancer datasets taken from The Cancer Genome Atlas. It is observed that, to obtain a robust information bottleneck, a compression to 10-20% of the input features along with an L1 regularization penalty of 0.01 or 0.001 performs well for most of the cancer datasets. Clustering performed on this latent representation generates clusters with better silh
Topic detection is a process for determining topics from a collection of textual data. One family of topic detection methods is clustering-based and assumes that the centroids are the topics. The clustering method has the advantage that it can process data with negative representations. Therefore, the clustering method can be combined with a broader range of representation learning methods. In this paper, we adopt deep learning for topic detection by combining a deep autoencoder with fuzzy c-means, a method we call "deep autoencoder-based fuzzy c-means". The encoder of the autoencoder learns a lower-dimensional representation. Fuzzy c-means groups the lower-dimensional representation to identify the centroids. The autoencoder's decoder transforms the centroids back into the original representation to be interpreted as the topics. Our simulation shows that deep autoencoder-based fuzzy c-means improves on the coherence score of eigenspace-based fuzzy c-means and is comparable to the leading standard methods, i.e., nonnegative matrix factorization or latent Dirichlet allocation.
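The clustering step described above can be sketched as follows. This is a minimal illustration of plain fuzzy c-means, not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the toy 2-D "latent codes" are all assumptions, and the encoder and decoder are omitted (in the described pipeline the input would be encoder outputs and the returned centroids would be passed through the decoder).

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length vectors.

    Returns (centroids, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from random memberships, normalised so each row sums to 1.
    memberships = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        memberships.append([v / total for v in row])

    centroids = []
    for _ in range(n_iter):
        # Centroid update: mean of the points weighted by membership ** m.
        centroids = []
        for k in range(n_clusters):
            weights = [memberships[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centroids.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Membership update: inverse-distance weighting, exponent 2 / (m - 1).
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centroids]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships[i] = [v / total for v in inv]
    return centroids, memberships


# Hypothetical latent codes forming two well-separated groups.
latent = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
          [5.0, 5.1], [5.1, 5.0], [4.95, 5.05]]
centroids, memberships = fuzzy_c_means(latent, n_clusters=2)
```

Unlike hard k-means, every point keeps a graded membership in every cluster, which is why each row of `memberships` sums to 1 rather than containing a single label.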
Electromagnetic (EM) metasurfaces have attracted great attention from both engineers and researchers due to their unique physical responses. With the rapid development of complex metasurfaces, the design and optimization processes have become extremely time-consuming and computationally expensive. Here we propose a deep learning model (DLM) based on a convolutional autoencoder network and an inverse design network, which can help to establish the complex relationships between the geometries of metasurfaces and their EM responses. As a typical example, a metasurface absorber consisting of polymethacrylimide foam/metal ring alternating multilayers is chosen to demonstrate the capability of the DLM. The relative spectral errors for the two desired spectra are only 5.80 and 5.49, respectively. Our model shows great predictive power and may be used as an effective tool to accelerate the design and optimization of metasurfaces.
There is a dire need for vision automation for edible bird's nest (EBN) hygiene inspection. To date, an effective impurities detection method for EBNs has yet to be realized owing to the inhomogeneous optical properties, the various types and sizes of impurities, and the limited sample size. The impurities inspection was formulated as an anomaly detection task, and a hybrid autoencoder model that contains an autoencoder and a single-layer convolutional network is proposed. The model was trained to reconstruct only nonimpurity regions of the EBN, so that impurities are segmented and detected as anomalies. The results showed that with only 50 EBN sample images, the hybrid model achieved a recall of 0.9282, a precision of 0.7718, and a 5.63% undetected rate for impurities. Furthermore, a misclassification rate of 21.53% was recorded, due mostly to artifacts with sizes <0.20 mm that were detected as false positives. Nonetheless, the applicability of the proposed autoencoder model was confirmed, with >92% of impurities successfully detected. Therefore, the hybrid autoencoder model is further explored for improvement and practical application. (c) 2022 SPIE and IS&T
An innovations sequence of a time series is a sequence of independent and identically distributed random variables with which the original time series has a causal representation. The innovation at a given time is statistically independent of the history of the time series. As such, it represents the new information contained in the present but not in the past. Because of its simple probability structure, the innovations sequence is the most efficient signature of the original. Unlike the principal or independent component representations, an innovations sequence preserves not only the complete statistical properties but also the temporal order of the original time series. A long-standing open problem is to find a computationally tractable way to extract an innovations sequence of non-Gaussian processes. This paper presents a deep learning approach, referred to as Innovations autoencoder (IAE), that extracts innovations sequences using a causal convolutional neural network. An application of IAE to the one-class anomalous sequence detection problem with unknown anomaly and anomaly-free models is also presented.
As a novel virtual reality (VR) format, panorama maps are attracting increasing attention, while the compression of panorama images is still a concern. In this paper, an autoencoder based on densely connected convolutional network blocks (dense blocks) is proposed to compress panorama maps. In the proposed autoencoder, dense blocks are specially designed to reuse feature maps and reduce the redundancy of features. Meanwhile, a loss function that introduces a position-dependent weight term for each pixel is proposed to train and adjust the network parameters, so that the autoencoder fits the properties of panorama maps. Based on the proposed autoencoder and the weighted loss function, a greedy block-wise training scheme is also designed to avoid the vanishing gradient problem and speed up training. During the training process, the autoencoder is divided into several sub-nets. After each sub-net is trained separately, the whole network is fine-tuned to achieve the best performance. Experimental results demonstrate that the proposed autoencoder, compared with JPEG, saves up to 79.69% in bit rate, and obtains a 7.27 dB gain in BD-WS-PSNR or a 0.0789 gain in BD-WS-SSIM. The proposed autoencoder also outperforms JPEG 2000, HEVC and VVC in both BD-WS-PSNR and BD-WS-SSIM. Meanwhile, subjective results show that the proposed autoencoder can recover details of panorama images and reconstruct maps with high visual quality.
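A position-dependent pixel weight of the kind described above can be illustrated with a short sketch. The abstract does not give the weight formula, so this example assumes the cosine-of-latitude weight used by WS-PSNR for equirectangular images; the function names `latitude_weights` and `weighted_mse` are hypothetical, and the paper's actual loss may differ.

```python
import math


def latitude_weights(height):
    """Per-row weights for an equirectangular image: cosine of the row's latitude."""
    return [math.cos((row + 0.5 - height / 2) * math.pi / height)
            for row in range(height)]


def weighted_mse(original, reconstructed):
    """Weighted mean squared error over two images given as lists of rows."""
    weights = latitude_weights(len(original))
    num = 0.0
    den = 0.0
    for w, row_a, row_b in zip(weights, original, reconstructed):
        for a, b in zip(row_a, row_b):
            num += w * (a - b) ** 2  # errors near the poles count less
            den += w
    return num / den
```

Rows near the equator receive weights close to 1 and rows near the poles receive weights close to 0, reflecting how heavily an equirectangular projection oversamples the polar regions of the sphere.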
Two new, non-intrusive reduced order frameworks for the faster modelling of gas reservoirs with time-varying production are presented and compared. The first method is an extension of a method using proper orthogonal decomposition (POD) in conjunction with radial basis functions (RBFs) that has previously been applied to predicting the performance of oil reservoirs undergoing a constant rate waterflood. The second method uses an autoencoder rather than RBFs to estimate the flow dynamics (pressure distributions) in hyperspace for unseen cases. Both frameworks are 'trained' using sample outputs from off-line, commercial reservoir simulations of a realistic heterogeneous gas reservoir with time-varying production controls typical of gas field operation. These controls include time-varying rate and switching between bottom hole pressure and rate control, as well as cases where wells are shut in. Both POD-based models produce reasonable forecasts of the reservoir performance for new unseen/prediction cases and are between 0.22 and 300 times faster than conventional simulation, including the time spent performing training simulations with conventional simulation solutions. The POD-RBF models are more accurate and more consistent with reference commercial simulation outputs than the POD-AE models. In addition, the POD-AE models required more trial and error to set up, as the number of hidden layers needed depends on the particular scenario being modelled. There is no ab initio way of predicting the best number of layers for a given type of scenario. This makes them less suitable for practical application by reservoir engineers. Overall, the POD-RBF framework is the more robust and accurate of the two methods.
Protein-protein interactions (PPIs) play a crucial role in the biological processes of living organisms. Correct prediction of PPIs can prove extremely valuable in probing protein functions, which can aid in the development of new and powerful therapies for disease prevention. Many experimental studies have previously been performed to investigate PPIs. However, in-vitro techniques to investigate PPIs are resource-intensive and time-consuming. Although various in-silico approaches to predict PPIs have been developed in recent years, they can be fallible in terms of accuracy and false-positive rate. To overcome these shortcomings, we propose a novel approach, AE-LGBM, to predict PPIs more accurately. It employs a LightGBM classifier and utilizes an autoencoder, which is an artificial neural network, to efficiently produce lower-dimensional, discriminative, and noise-free features. We incorporate conjoint triad (CT) and Composition-Transition-Distribution (CTD) features into the AE-LGBM framework. Under ten-fold cross-validation, the prediction accuracies obtained by AE-LGBM for the Human and Yeast datasets are 98.7% and 95.4%, respectively. AE-LGBM was further evaluated on independent datasets and achieved accuracies of 100%, 100%, 99.9%, 99.3%, and 99.2% on E. coli, M. musculus, C. elegans, H. pylori and H. sapiens, respectively. AE-LGBM also obtained the best accuracy when tested on three important PPI networks, namely the single-core network (CD9), the multiple-core network (the Ras/Raf/MEK/ERK pathway) and the cross-connection network (Wnt Network). The outstanding generalization ability of AE-LGBM makes it a versatile, efficient, and robust PPI prediction model.
The present paper proposes an information-theoretic method for interpreting the inference mechanism of neural networks. The new method aims to interpret the inference mechanism minimally by disentangling complex information into simpler and easily interpretable information. This disentanglement of complex information can be realized by maximizing mutual information between input patterns and the corresponding neurons. However, because mutual information is difficult to compute, we use the well-known autoencoder to increase mutual information by re-interpreting the sparsity constraint, which is considered a device to increase mutual information. The computational procedures to increase mutual information are decomposed into the serial operation of equal use of neurons and specific responses to input patterns. The specific responses are realized by enhancing the results obtained by the equal use of neurons. The method was applied to three data sets: the glass, office equipment, and pulsar data sets. With all three data sets, we could observe that, when the number of neurons was forced to increase, mutual information could be increased. Then, collective weights, or average collectively treated weights, showed that the method could extract the simple and linear relations between inputs and targets, making it possible to interpret the inference mechanism minimally.
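The mutual information between input patterns and neurons mentioned above can be computed directly for small examples. The sketch below is an illustration only, under stated assumptions that are not taken from the paper: input patterns are equally likely, and the conditional probability of neuron j given pattern s is obtained by normalising non-negative activations over the neurons. The function name `mutual_information` is hypothetical.

```python
import math


def mutual_information(activations):
    """Mutual information (in nats) between input patterns and neurons.

    activations[s][j] is the non-negative output of neuron j for pattern s.
    """
    n_patterns = len(activations)
    n_neurons = len(activations[0])
    # p(j | s): normalise each pattern's activations over the neurons.
    cond = [[a / sum(row) for a in row] for row in activations]
    # p(j): average over patterns, assuming each pattern is equally likely.
    marginal = [sum(cond[s][j] for s in range(n_patterns)) / n_patterns
                for j in range(n_neurons)]
    mi = 0.0
    for s in range(n_patterns):
        for j in range(n_neurons):
            p = cond[s][j]
            if p > 0.0:
                mi += p * math.log(p / marginal[j]) / n_patterns
    return mi
```

Under these assumptions the quantity equals the entropy of the marginal p(j) minus the average conditional entropy of p(j | s), so it grows when neurons are used equally on average while each responds specifically to particular patterns, which matches the two-step decomposition described in the abstract.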
Effective fusion of structural magnetic resonance imaging (sMRI) and functional magnetic resonance imaging (fMRI) data has the potential to boost the accuracy of infant age prediction thanks to the complementary information provided by different imaging modalities. However, functional connectivity measured by fMRI during infancy is largely immature and noisy compared to the morphological features from sMRI, making the fusion of sMRI and fMRI for infant brain analysis extremely challenging. With conventional multimodal fusion strategies, adding fMRI data for age prediction carries a high risk of introducing more noise than useful features, which would lead to lower accuracy than using sMRI data alone. To address this issue, we develop a novel model termed the disentangled-multimodal adversarial autoencoder (DMM-AAE) for infant age prediction based on multimodal brain MRI. Specifically, we disentangle the latent variables of the autoencoder into common and specific codes to represent the shared and complementary information among modalities, respectively. Then, a cross-reconstruction requirement and a common-specific distance ratio loss are designed as regularizations to ensure the effectiveness and thoroughness of the disentanglement. By arranging relatively independent autoencoders to separate the modalities and employing disentanglement under the cross-reconstruction requirement to integrate them, our DMM-AAE method effectively restrains possible interference across modalities while realizing effective information fusion. Taking advantage of the latent variable disentanglement, a new strategy is further proposed and embedded into DMM-AAE to address the incompleteness of the multimodal neuroimages; it can also be used as an independent algorithm for missing modality imputation. By taking six types of cortical morphometric features from sMRI and brain functional connectivity from fMRI as predictors, the superiority of the proposed DMM-AAE is validated o