Genomic signal processing is a new area of research that applies advanced digital signal processing methodologies to enhance genetic data analysis. It has many promising applications in bioinformatics and next-generation healthcare systems, in particular in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods (linear predictive coding, wavelet decomposition, and fractal dimension) are studied to provide a comparative evaluation of their clustering performance on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared with the other digital signal processing methods and with well-known statistical methods.
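As a rough illustration of how a signal processing method can turn an expression profile into a feature vector for clustering, the sketch below applies a multi-level Haar wavelet decomposition. The abstract does not name a wavelet family, so the Haar wavelet, the function names `haar_step` and `haar_features`, and the `levels` parameter are all assumptions made here for illustration, not the authors' method.

```python
import math


def haar_step(signal):
    """One Haar level: scaled pairwise sums (approximation) and differences (detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(profile, levels=2):
    """Return the coarse approximation followed by detail coefficients, coarsest first.

    Assumption: the profile length is divisible by 2**levels.
    """
    if len(profile) % (2 ** levels) != 0:
        raise ValueError("profile length must be divisible by 2**levels")
    approx = [float(v) for v in profile]
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details = detail + details  # keep coarser details in front
    return approx + details


# Hypothetical expression profile of one gene across eight samples.
features = haar_features([2.0, 4.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0], levels=2)
```

The resulting vectors could then be passed to any clustering routine. Because the Haar transform is orthonormal, Euclidean distances between full feature vectors match those between the original profiles, so any benefit would have to come from keeping only a subset of coefficients.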
Microarray gene clustering is a big data application that employs the K-means (KM) clustering algorithm to identify hidden patterns, evolutionary relationships, unknown functions and gene trends for disease diagnosis, tissue detection and biological analysis. The selection of initial centroids is a major issue in the KM algorithm because it influences the effectiveness and efficiency of the clustering and the local optimum it reaches. The existing centroid initialization algorithm is computationally expensive and degrades cluster quality because of the high dimensionality and interconnectedness of microarray gene data. To address this issue, this study proposes the min-max kurtosis stratum mean (MKSM) algorithm for big data clustering in a single-machine environment. The MKSM algorithm uses kurtosis for dimension selection, mean distance for gene relationship identification, and stratification for heterogeneous centroid extraction. The results of the presented algorithm are compared with the state-of-the-art initialization strategy on twelve microarray gene datasets using internal, external and statistical assessment criteria. The experimental results demonstrate that the MKSMKM algorithm reduces the number of iterations, distance computations and data comparisons, lessens the local optima problem, and improves cluster performance, effectiveness and efficiency with stable convergence.
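The abstract names the ingredients of MKSM but gives no algorithmic detail, so the following is only one possible reading, written as a minimal sketch. It assumes that "min-max kurtosis" means picking the dimensions with the lowest and highest kurtosis, that rows are ordered by the mean of those two dimensions, and that each of k equal-sized strata contributes its mean as an initial centroid. The function names and these steps are hypothetical, not taken from the paper.

```python
from statistics import fmean


def kurtosis(values):
    """Population kurtosis m4 / m2**2 (not excess kurtosis); 0.0 for constant data."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m4 = fmean([(v - mu) ** 4 for v in values])
    return m4 / (m2 ** 2) if m2 > 0 else 0.0


def stratum_mean_centroids(data, k):
    """Assumed reading of MKSM: order rows by two kurtosis-selected dimensions,
    cut the ordering into k strata, and return each stratum's mean vector."""
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of rows")
    columns = list(zip(*data))
    kurt = [kurtosis(col) for col in columns]
    lo_dim = kurt.index(min(kurt))  # dimension with minimum kurtosis
    hi_dim = kurt.index(max(kurt))  # dimension with maximum kurtosis
    order = sorted(range(len(data)),
                   key=lambda i: (data[i][lo_dim] + data[i][hi_dim]) / 2.0)
    n = len(data)
    centroids = []
    for s in range(k):
        members = order[s * n // k:(s + 1) * n // k]
        centroids.append([fmean([data[i][d] for i in members])
                          for d in range(len(columns))])
    return centroids


# Hypothetical toy data: four genes measured in two dimensions.
initial = stratum_mean_centroids([[0, 0], [1, 1], [10, 10], [11, 11]], k=2)
```

The returned centroids would then seed an ordinary K-means run in place of random initialization.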
ISBN: (print) 9789811020353; 9789811020346
We present a methodology, common subcluster mining, to explore gene expression data for possible biomarkers of lung cancer. Subclusters are the peaks formed by superimposing the clusters obtained from the expression data of normal samples. Applying the method to the corresponding datasets from diseased samples extracts the genes that undergo high fold changes. The potential candidate genes are examined on datasets from stage I through stage IV of the disease. A few genes emerge as indicative molecular markers of lung cancer.
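The fold-change step mentioned above can be sketched as follows. This is only an illustration of a standard log2 fold-change filter, not the paper's subcluster mining procedure: the function name, the threshold `min_abs_log2fc`, and the gene labels are hypothetical, and all expression values are assumed to be strictly positive.

```python
import math
from statistics import fmean


def high_fold_change_genes(normal, diseased, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes whose mean expression changes
    by at least the given factor (in log2 units) between normal and diseased."""
    hits = {}
    for gene in normal.keys() & diseased.keys():
        log2fc = math.log2(fmean(diseased[gene]) / fmean(normal[gene]))
        if abs(log2fc) >= min_abs_log2fc:
            hits[gene] = log2fc
    return hits


# Hypothetical expression values for two genes in two samples each.
normal = {"geneA": [10.0, 10.0], "geneB": [10.0, 10.0]}
diseased = {"geneA": [40.0, 40.0], "geneB": [11.0, 11.0]}
candidates = high_fold_change_genes(normal, diseased)
```

With the default threshold of 1.0, a gene is kept when its mean expression at least doubles or halves.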
ISBN: (print) 9781509022618
Gene expression data from microarray experiments offer enormous scope for exploring the genetic relationships of deadly diseases. The motivation is to explore possible molecular biomarkers of such diseases with a view to early and periodic detection. This paper reports a study of a methodology for common subcluster mining using FCM clustering. A subcluster is the peak formed by superimposing the clusters obtained from expression data, separately for the normal and the diseased samples. Experiments are carried out on datasets of lung cancer, acute myeloid leukemia (AML) and breast cancer using the algorithm for common subcluster mining. The results match, to a large extent, those obtained in previous studies. A few genes emerge as indicative molecular biomarkers of the respective diseases.
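Assuming FCM here denotes fuzzy c-means, the clustering step can be sketched with the textbook membership and centre updates. The function names, the fuzzifier default `m=2.0`, the fixed iteration count, and the toy data are assumptions for illustration. The superimposition of clusters into subclusters is not described in the abstract and is not attempted here.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Membership of every point in every cluster, for fuzzifier m > 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits on one or more centres: split full membership among them.
            zeros = dists.count(0.0)
            rows.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
        else:
            rows.append([1.0 / sum((di / dk) ** power for dk in dists)
                         for di in dists])
    return rows


def fcm_centers(points, memberships, m=2.0):
    """Cluster centres as means of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        total = sum(weights)
        centers.append([sum(w * p[d] for w, p in zip(weights, points)) / total
                        for d in range(dim)])
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        centers = fcm_centers(points, fcm_memberships(points, centers, m), m)
    return centers, fcm_memberships(points, centers, m)


# Hypothetical one-dimensional expression values forming two groups.
points = [[0.0], [0.2], [10.0], [10.2]]
centers, memberships = fuzzy_c_means(points, [[1.0], [9.0]])
```

Unlike K-means, each gene receives a graded membership in every cluster, which is what allows clusters from different runs or sample groups to overlap.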