Search results - Inner Mongolia University Library

Computational intelligence for microarray data and biomedical image analysis for the early diagnosis of breast cancer

EXPERT SYSTEMS WITH APPLICATIONS, 2012, Vol. 39, No. 16, pp. 12371-12377

Authors: Nahar, Jesmin; Imam, Tasadduq; Tickle, Kevin S.; Ali, A. B. M. Shawkat; Chen, Yi-Ping Phoebe

Affiliations: Cent Queensland Univ, Fac Arts Business Informat & Educ, Rockhampton, Qld 4702, Australia; La Trobe Univ, Dept Comp Sci & Comp Engn, Melbourne, Vic 3086, Australia

The objective of this paper was to perform a comparative analysis of computational intelligence algorithms for identifying breast cancer in its early stages. Two types of data representations were considered: microarray based and medical imaging based. In contrast to previous research, this study also considered the imbalanced nature of these data. It was observed that the SMO algorithm performed better for the majority of the test data, especially for microarray based data, when accuracy was used as the performance measure. Considering the imbalanced characteristic of the data, the Naive Bayes algorithm was seen to perform well in terms of true positive rate (TPR). Regarding the influence of SMOTE, a well-known technique for imbalanced data classification, there was a notable performance improvement for J48, while the performance of SMO remained comparable for the majority of the datasets. Overall, the results indicated SMO as the most promising candidate for the microarray and image datasets considered in this research. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Breast cancer; Microarray data; Image data; Computational intelligence; SMOTE; SMO
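
The abstract above names SMOTE and the true positive rate (TPR) without showing either. As a rough illustration only (the abstract does not describe the authors' implementation), the Python sketch below creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and computes TPR from a list of predictions. The function names, the neighbour count `k`, and the toy data are assumptions made for this example.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples in the spirit of SMOTE.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = max(1, min(k, len(minority) - 1))
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def true_positive_rate(y_true, y_pred, positive=1):
    """TPR = TP / (TP + FN); returns 0.0 if there are no positive cases."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy usage: four minority samples in two dimensions (made-up values)
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like_oversample(minority_samples, n_new=5)
```

TPR is reported alongside accuracy because, on imbalanced data, a classifier can reach high accuracy while missing most of the minority (positive) class.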

Extraction of Informative Genes from Multiple Microarray Data Integrated by Rank-Based Approach

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2011, Vol. E94D, No. 4, pp. 841-854

Authors: Hong, Dongwan; Yoon, Jeehee; Lee, Jongkeun; Park, Sanghyun; Kim, Jongil

Affiliations: Hallym Univ, Dept Comp Engn, Chunchon, South Korea; Seoul Natl Univ, Coll Med, Dept Biochem & Mol Biol, Seoul 151, South Korea; Yonsei Univ, Dept Comp Sci, Seoul 120749, South Korea

By converting the expression values of each sample into the corresponding rank values, the rank-based approach enables the direct integration of multiple microarray data produced by different laboratories and/or different techniques. In this study, we verify through statistical and experimental methods that informative genes can be extracted from multiple microarray data integrated by the rank-based approach (briefly, integrated rank-based microarray data). First, after showing that a nonparametric technique can be used effectively as a scoring metric for rank-based microarray data, we prove that the scoring results from integrated rank-based microarray data are statistically significant. Next, through experimental comparisons, we show that the informative genes from integrated rank-based microarray data are statistically more significant than those of single-microarray data. In addition, by comparing the lists of informative genes extracted from experimental data, we show that the rank-based data integration method extracts more significant genes than the z-score-based normalization technique or the rank products technique. Public cancer microarray data were used for our experiments and the marker genes list from the CGAP database was used to compare the extracted genes. The GO database and the GSEA method were also used to analyze the functionalities of the extracted genes.

Keywords: Microarray data; Data integration; Informative gene; Significance test
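
The first sentence of the abstract above describes converting each sample's expression values into ranks so that data from different laboratories can be pooled. A minimal Python sketch of that idea follows. It is not the authors' code: the function names are hypothetical, ties are assumed to receive the mean rank, rank 1 is assumed to mean the smallest value, and the numbers are made up.

```python
def to_ranks(values):
    """Return average ranks for one sample (1 = smallest value).

    Tied values share the mean of the ranks they would occupy.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def integrate_by_rank(datasets):
    """Rank every sample within itself, then pool samples from all datasets.

    Each dataset is a list of samples; each sample lists expression values
    for the same genes in the same order.
    """
    return [to_ranks(sample) for dataset in datasets for sample in dataset]


# Two hypothetical labs measuring the same three genes on different scales
lab_a = [[100.0, 300.0, 200.0]]
lab_b = [[0.1, 0.9, 0.5], [0.7, 0.2, 0.4]]
pooled = integrate_by_rank([lab_a, lab_b])
```

Because ranks depend only on the ordering within a sample, the first sample from each lab ends up with the same rank vector even though the raw scales differ by orders of magnitude.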

Performance analysis of clustering techniques over microarray data: A case study

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2018, Vol. 493, pp. 162-176

Authors: Dash, Rasmita; Misra, Bijan Bihari

Affiliations: Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Khandagiri Sq, Bhubaneswar 751030, Odisha, India; Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India

Handling big data is one of the major issues in the field of statistical data analysis. In such investigations, cluster analysis plays a vital role in dealing with large-scale data. There are many clustering techniques, each with a different approach to cluster analysis, but which approach suits a particular dataset is difficult to predict. To deal with this problem, a grading approach is introduced over many clustering techniques to identify a stable technique. Because the grading approach depends on the characteristics of the dataset as well as on the validity indices, a two-stage grading approach is implemented. In this study the grading approach is applied to five clustering techniques: hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of the grading approach that a clustering technique is significant is also confirmed by the Nemenyi post-hoc hypothesis test. (C) 2017 Elsevier B.V. All rights reserved.

Keywords: Microarray data; Feature selection; Cluster analysis; Particle swarm optimization; Statistical test

An integrative gene selection with association analysis for microarray data classification (vol 18, pg 739, 2014)

INTELLIGENT DATA ANALYSIS, 2014, Vol. 18, No. 5, p. 993

Authors: Fang, Ong Huey; Mustapha, Norwati; Sulaiman, Md. Nasir

Affiliations: UCSI Univ, FOBIS, Sch Informat Technol, Kuala Lumpur, Malaysia; Univ Putra Malaysia, FSKTM, Dept Comp Sci, Serdang 43400, Selangor, Malaysia

The rising interest in integrative approaches has shifted gene selection from purely data-centric methods to ones that incorporate additional biological knowledge. Integrative gene selection is viewed as a promising approach in microarray data classification that takes into consideration the complex relationships among genes. However, in most existing methods, the selection of genes is still based on expression values alone, and biological knowledge is integrated only at the end of the analysis to verify experimental results or to gain biological insights. Thus, this paper proposes an integrative gene selection method based on a filter method and association analysis for selecting genes that are not only differentially expressed but also informative for classification. Association analysis is employed to integrate microarray data with multiple types of biological knowledge simultaneously, and to identify groups of genes that frequently co-occur in target samples. The method was tested on four cancer-related datasets, and two types of biological knowledge were incorporated, namely Gene Ontology (GO) and KEGG Pathways (KEGG). The experimental results show that the recommended GO based models, KEGG based models, and GO-KEGG based models outperformed the expression-only models by attaining better classification accuracies with fewer genes. The performance of the integrative models verified the efficiency and scalability of association analysis in mining microarray data.

Keywords: Association analysis; classification; gene selection; integrative; microarray data
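
The abstract above says association analysis is used to find groups of genes that frequently co-occur in target samples. The sketch below shows only the basic support-counting step behind that kind of analysis, restricted to gene pairs. It is an assumption-laden illustration rather than the paper's procedure: the function name, the encoding of each sample as a set of "present" items, and the gene labels `g1` to `g3` are all hypothetical, and the integration of GO or KEGG annotations is not modelled.

```python
from collections import Counter
from itertools import combinations


def frequent_gene_pairs(transactions, min_support):
    """Return gene pairs whose support is at least min_support.

    transactions: one collection of item labels per sample.
    Support of a pair = fraction of samples containing both items.
    """
    counts = Counter()
    for items in transactions:
        # Sort so each unordered pair is always counted under one key
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical target samples, each listing the genes flagged in that sample
samples = [{"g1", "g2", "g3"}, {"g1", "g2"}, {"g1", "g3"}, {"g2"}]
pairs = frequent_gene_pairs(samples, min_support=0.5)
```

With these toy samples, the pairs (g1, g2) and (g1, g3) each appear in two of four samples and pass the 0.5 threshold, while (g2, g3) appears in only one and is dropped.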

Clustering Microarray Data: Theoretical and Practical Issues

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2012, Vol. 41, No. 16-17, pp. 3211-3232

Authors: Di Lascio, F. Marta L.; Giannerini, Simone

Affiliations: Univ Bologna, Dept Stat Sci, I-40126 Bologna, Italy

The analysis of microarray data is a widespread functional genomics approach that allows for the monitoring of the expression of thousands of genes at once. The analysis of the great amount of data generated in a microarray experiment requires powerful statistical techniques. One of the first tasks of the analysis of microarray data is to cluster data into biologically meaningful groups according to their expression patterns. In this article, we discuss classical as well as recent clustering techniques for microarray data. We pay particular attention to both theoretical and practical issues and give some general indications that might be useful to practitioners.

Keywords: Distance-based clustering; Distance measures; Microarray data; Model-based clustering

Evaluation of gene importance in microarray data based upon probability of selection

BMC BIOINFORMATICS, 2005, Vol. 6, No. 1, pp. 1-11

Authors: Fu, LM; Fu-Liu, CS

Affiliations: Pacific TB & Canc Res Org, Pasadena, CA, USA; Univ Florida, Gainesville, FL, USA

Background: Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linking to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, it has posed a challenging issue to recognize disease-related gene expression patterns embedded in the microarray data. In selecting a small set of biologically significant genes for classifier design, the nature of high data dimensionality inherent in this problem creates a substantial amount of uncertainty. Results: Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value in that a smaller value implies higher information content from information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. Conclusion: In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.

Keywords: Support Vector Machine; Microarray data; Support Vector Machine Classifier; Gene Selection; Subtype Classification

A L1-regularized feature selection method for local dimension reduction on microarray data

COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2017, Vol. 67, pp. 92-101

Authors: Guo, Shun; Guo, Donghui; Chen, Lifei; Jiang, Qingshan

Affiliations: Xiamen Univ, Dept Elect Engn, Fujian 361005, Peoples R China; Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518000, Peoples R China; Fujian Normal Univ, Sch Math & Comp Sci, Fujian 350117, Peoples R China

Dimension reduction is a crucial technique in machine learning and data mining, which is widely used in areas of medicine, bioinformatics and genetics. In this paper, we propose a two-stage local dimension reduction approach for classification on microarray data. In the first stage, a new L1-regularized feature selection method is defined to remove irrelevant and redundant features and to select the important features (biomarkers). In the next stage, PLS-based feature extraction is implemented on the selected features to extract synthesis features that best reflect discriminating characteristics for classification. The suitability of the proposal is demonstrated in an empirical study done with ten widely used microarray datasets, and the results show its effectiveness and competitiveness compared with four state-of-the-art methods. The experimental results on the St Jude dataset show that our method can be effectively applied to microarray data analysis for subtype prediction and the discovery of gene coexpression. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Local dimension reduction; Classification; L1-regularized logistic regression; Microarray data; Partial least squares (PLS)

Integrated analysis of the heterogeneous microarray data

BMC BIOINFORMATICS, 2011, Vol. 12, Suppl. 5, pp. 1-8

Authors: Yi, Sung Gon; Park, Taesung

Affiliations: Seoul Natl Univ, Dept Stat, Seoul, South Korea; Case Western Reserve Univ, Dept Epidemiol & Biostat, Cleveland, OH 44106, USA

Background: As the magnitude of the experiment increases, it is common to combine various types of microarrays such as paired and non-paired microarrays from different laboratories or hospitals. Thus, it is important to analyze microarray data together to derive a combined conclusion after accounting for heterogeneity among data sets. One of the main objectives of the microarray experiment is to identify differentially expressed genes among the different experimental groups. We propose the linear mixed effect model for the integrated analysis of the heterogeneous microarray data sets. Results: The proposed linear mixed effect model was illustrated using the data from 133 microarrays collected at three different hospitals. Through simulation studies, we compared the proposed linear mixed effect model approach with the meta-analysis and the ANOVA model approaches. The linear mixed effect model approach was shown to provide higher power than the other approaches. Conclusions: The linear mixed effect model has the advantage over the ANOVA model of allowing for various types of covariance structures. Further, it can easily handle correlated microarray data such as paired microarray data and repeated microarray data from the same subject.

Keywords: Microarray data; Covariance Structure; ANOVA Model; Linear Mixed Effect Model; Statistical Test Procedure
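
For readers unfamiliar with the model class named in the abstract above, the generic textbook form of a linear mixed effect model is written out below. This is the standard general formulation, not necessarily the parameterization used in the paper, which the abstract does not give. The symbols are assumptions of this note: $\mathbf{y}$ is the vector of expression measurements, $X$ and $Z$ are the fixed-effect and random-effect design matrices, and $G$ and $R$ are covariance matrices.

```latex
\mathbf{y} = X\boldsymbol{\beta} + Z\mathbf{b} + \boldsymbol{\varepsilon},
\qquad
\mathbf{b} \sim N(\mathbf{0}, G),
\quad
\boldsymbol{\varepsilon} \sim N(\mathbf{0}, R)
```

The freedom to choose the structure of $G$ and $R$ is what allows such a model to represent correlated observations, for example paired or repeated arrays from the same subject, which is the advantage over the ANOVA model that the abstract highlights.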

COMBAT GA-BASED GENE SELECTION FOR CLASSIFICATION OF MICROARRAY DATA

BIOMEDICAL ENGINEERING-APPLICATIONS BASIS COMMUNICATIONS, 2008, Vol. 20, No. 6, pp. 345-352

Authors: Chuang, Li-Yeh; Yang, Cheng-San; Li, Jung-Chike; Yang, Cheng-Hong

Affiliations: Natl Kaohsiung Univ Appl Sci, Dept Elect Engn, Kaohsiung 80708, Taiwan; I Shou Univ, Dept Chem Engn, Kaohsiung 80041, Taiwan; Natl Cheng Kung Univ, Inst Biomed Engn, Tainan 70101, Taiwan

Microarray data can provide valuable results for a variety of gene expression profile problems and contribute to advances in clinical medicine. The application of microarray data to cancer-type classification has recently gained in popularity. Microarray data are characterized by a large number of features (genes), high dimensionality, and often multiple classes. These facts make testing and training of general classification methods difficult. Reducing the number of genes and achieving lower classification error rates are the main issues to be solved. The classification of microarray data samples can be regarded as a feature selection and classifier design problem. The goal of feature selection is to select those subsets of differentially expressed genes that are potentially relevant for distinguishing the sample classes. Classical genetic algorithms (GAs) may suffer from premature convergence and thus lead to poor experimental results. In this paper, a combat genetic algorithm (CGA) is used to implement the feature selection, and a K-nearest neighbor method with leave-one-out cross-validation serves as the classifier for the CGA fitness function for the classification problem. The proposed method was applied to 10 microarray data sets that were obtained from the literature. The experimental results show that the proposed method not only effectively reduced the number of gene expression levels but also achieved lower classification error rates.

Keywords: Feature selection; microarray data; combat genetic algorithm; K-nearest neighbor; leave-one-out cross-validation
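
The abstract above uses a K-nearest neighbor classifier with leave-one-out cross-validation as the fitness function for gene subset selection. The sketch below illustrates only that fitness evaluation for one candidate subset; the combat genetic algorithm itself is not shown. The function name, the squared Euclidean distance, the default `k=1`, and the toy data are assumptions for this example, not details taken from the paper.

```python
from collections import Counter


def loocv_knn_error(samples, labels, gene_subset, k=1):
    """Leave-one-out error rate of k-NN using only the genes in gene_subset.

    samples: list of expression vectors; gene_subset: indices of genes to use.
    """
    def dist(a, b):
        return sum((a[g] - b[g]) ** 2 for g in gene_subset)

    errors = 0
    for i, held_out in enumerate(samples):
        # Order all other samples by distance to the held-out sample
        others = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: dist(held_out, samples[j]),
        )
        votes = Counter(labels[j] for j in others[:k])
        predicted = votes.most_common(1)[0][0]
        errors += predicted != labels[i]
    return errors / len(samples)


# Toy data: gene 0 separates the classes, gene 1 is misleading
toy_samples = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [0.9, 1.1]]
toy_labels = ["A", "A", "B", "B"]
error_gene0 = loocv_knn_error(toy_samples, toy_labels, gene_subset=[0])
error_gene1 = loocv_knn_error(toy_samples, toy_labels, gene_subset=[1])
```

A search procedure such as a genetic algorithm would call a function like this once per candidate subset and prefer subsets with a lower error rate and fewer genes.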

Hierarchical mixture models for biclustering in microarray data

STATISTICAL MODELLING, 2011, Vol. 11, No. 6, pp. 489-505

Authors: Martella, F.; Alfo, M.; Vichi, M.

Affiliations: Univ Roma La Sapienza, Dipartimento Sci Stat, I-00185 Rome, Italy

In the last few years, model-based clustering techniques have become widely used in the context of microarray data analysis. In this empirical context, a potential purpose for statistical approaches is the identification of clusters of genes that are co-expressed under subsets of experimental conditions. We discuss a hierarchical mixture model to combine advantages of allowing for dependence within gene clusters and for simultaneous clustering of genes and experimental conditions. Thanks to the adopted hierarchical structure, we may distinguish gene clusters from mixture components, where the latter may represent intra-cluster gene-specific extra-Gaussian departures. To cluster experimental conditions, instead, we suggest a suitable parameterization of component-specific means by using a binary row stochastic matrix representing condition membership. The performance of the proposed approach is discussed on both simulated and real datasets.

Keywords: Hierarchical mixture model; biclustering; microarray data
