Search results - Inner Mongolia University Library

Two-stage classification methods for microarray data

EXPERT SYSTEMS WITH APPLICATIONS, 2008, vol. 34, no. 1, pp. 375-383

Authors: Wong, Tzu-Tsung; Hsu, Ching-Han (Natl Cheng Kung Univ Inst Informat Management Tainan 701 Taiwan)

Gene expression data are a key factor for the success of medical diagnosis, and two-stage classification methods are therefore developed for processing microarray data. The first stage of such a method selects a pre-specified number of genes that are likely to be the most relevant to the occurrence of a disease and passes these genes to the second stage for classification. In this paper, we use four gene selection mechanisms and two classification tools to compose eight two-stage classification methods, and test these eight methods on eight microarray data sets to analyze their performance. The first interesting finding is that the genes chosen by different categories of gene selection mechanisms have less than half in common but result in insignificantly different classification accuracies. A subset-gene-ranking mechanism can be beneficial in classification accuracy, but its computational effort is much heavier. Whether the classification tool employed at the second stage should be accompanied by a dimension reduction technique depends on the characteristics of a data set. (c) 2006 Elsevier Ltd. All rights reserved.

Keywords: dimension reduction gene selection microarray data two-stage classification method
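The two-stage idea in the abstract above (select a fixed number of genes, then classify on those genes only) can be illustrated with a minimal sketch. The abstract does not name the four selection mechanisms or the two classifiers, so everything here is an assumption chosen for brevity: a signal-to-noise ranking score for stage 1, a nearest-centroid classifier for stage 2, binary labels 0/1, and the hypothetical function names `select_genes`, `fit_centroids`, and `predict`.

```python
from statistics import mean, pstdev


def select_genes(X, y, k):
    """Stage 1 (assumed scoring rule): rank genes by a signal-to-noise
    score, |mean0 - mean1| / (std0 + std1), and return the top-k indices."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b)
        if spread == 0:
            spread = 1e-12  # avoid division by zero for constant genes
        scores.append((abs(mean(a) - mean(b)) / spread, g))
    return [g for _, g in sorted(scores, reverse=True)[:k]]


def fit_centroids(X, y, genes):
    """Stage 2 (assumed classifier): one centroid per class,
    computed on the selected genes only."""
    centroids = {}
    for label in set(y):
        rows = [[row[g] for g in genes] for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, sample, genes):
    """Assign the sample to the class with the nearest centroid."""
    point = [sample[g] for g in genes]

    def sqdist(center):
        return sum((p - q) ** 2 for p, q in zip(point, center))

    return min(centroids, key=lambda label: sqdist(centroids[label]))


# Toy data: 6 samples x 3 genes; only gene 0 separates the two classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 5.1, 0.9],
    [0.9, 4.9, 0.4],
    [3.0, 5.0, 0.8],
    [3.2, 5.2, 0.3],
    [2.9, 4.8, 0.5],
]
y = [0, 0, 0, 1, 1, 1]

genes = select_genes(X, y, k=1)
model = fit_centroids(X, y, genes)
print(genes, predict(model, [1.1, 5.0, 0.7], genes))
```

Swapping either stage for another method leaves the pipeline shape unchanged, which is what lets the paper combine selection mechanisms and classifiers freely.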

Random walk biclustering for microarray data

INFORMATION SCIENCES, 2008, vol. 178, no. 6, pp. 1479-1497

Authors: Angiulli, Fabrizio; Cesario, Eugenio; Pizzuti, Clara (CNR ICAR I-87036 Arcavacata Di Rende CS Italy)

A biclustering algorithm, based on a greedy technique and enriched with a local search strategy to escape poor local minima, is proposed. The algorithm starts with an initial random solution and searches for a locally optimal solution by successive transformations that improve a gain function. The gain function combines the mean squared residue, the row variance, and the size of the bicluster. Different strategies to escape local minima are introduced and compared. Experimental results on several microarray data sets show that the method is able to find significant biclusters, also from a biological point of view. (c) 2007 Elsevier Inc. All rights reserved.

Keywords: biclustering microarray data local search
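The abstract above says the gain function combines the mean squared residue, the row variance, and the size of the bicluster, but does not state how they are weighted. The sketch below therefore computes only the three ingredients, using the commonly cited mean squared residue definition (residue of a cell = value - row mean - column mean + overall mean); the function name `bicluster_stats` is hypothetical and the choice of definition is an assumption.

```python
def bicluster_stats(matrix, rows, cols):
    """Return (mean squared residue, row variance, size) for the
    bicluster given by the selected row and column indices."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n, m = len(rows), len(cols)

    row_means = [sum(r) / m for r in sub]
    col_means = [sum(r[j] for r in sub) / n for j in range(m)]
    total_mean = sum(row_means) / n

    # Residue measures how far each cell is from a purely additive
    # row-effect plus column-effect pattern.
    msr = sum(
        (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
        for i in range(n)
        for j in range(m)
    ) / (n * m)

    # Row variance is high when rows fluctuate across columns,
    # which helps rule out trivial constant biclusters.
    row_var = sum(
        (sub[i][j] - row_means[i]) ** 2
        for i in range(n)
        for j in range(m)
    ) / (n * m)

    return msr, row_var, n * m


# An additive pattern (each row is a shifted copy of the others)
# has a residue of zero up to rounding, but non-zero row variance.
data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
]
print(bicluster_stats(data, rows=[0, 1, 2], cols=[0, 1, 2]))
```

A local search of the kind described would repeatedly add or remove a row or column and keep the move if a gain built from these three quantities improves.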

Sequential local least squares imputation estimating missing value of microarray data

COMPUTERS IN BIOLOGY AND MEDICINE, 2008, vol. 38, no. 10, pp. 1112-1120

Authors: Zhang, Xiaobai; Song, Xiaofeng; Wang, Huinan; Zhan, Huanping (Nanjing Univ Aeronaut & Astronaut Dept Biomed Engn Nanjing 210016 Peoples R China)

Missing values in microarray data can significantly affect subsequent analysis, so it is important to estimate these missing values accurately. In this paper, a sequential local least squares imputation (SLLSimpute) method is proposed to solve this problem. It estimates missing values sequentially, starting from the gene containing the fewest missing values, and partially utilizes these estimated values. In addition, an automatic parameter selection algorithm, which can generate an appropriate number of neighboring genes for each target gene, is presented for parameter estimation. Experimental results confirmed that the SLLSimpute method exhibited better estimation ability compared with other currently used imputation methods. (C) 2008 Elsevier Ltd. All rights reserved.

Keywords: Missing value estimation Imputation method Least squares principle Normalized root mean squared error (NRMSE) microarray data

Reducing microarray data via nonnegative matrix factorization for visualization and clustering analysis

JOURNAL OF BIOMEDICAL INFORMATICS, 2008, vol. 41, no. 4, pp. 602-606

Authors: Liu, Weixiang; Yuan, Kehong; Ye, Datian (Tsinghua Univ Biomed Engn Res Ctr Grad Sch Div Life Sci Shenzhen 518055 Peoples R China)

In microarray data analysis, each gene expression sample has thousands of genes, and reducing such high dimensionality is useful for both visualization and further clustering of samples. Traditional principal component analysis (PCA) is a commonly used method which has problems. Nonnegative Matrix Factorization (NMF) is a new dimension reduction method. In this paper we compare NMF and PCA for dimension reduction. The reduced data are used for visualization and for clustering analysis via k-means on 11 real gene expression datasets. Before the clustering analysis, we apply NMF and PCA for reduction in visualization. The results on one leukemia dataset show that NMF can discover natural clusters and clearly detect one mislabeled sample while PCA cannot. For clustering analysis via k-means, NMF typically outperforms PCA. Our results demonstrate the superiority of NMF over PCA in reducing microarray data. (C) 2007 Elsevier Inc. All rights reserved.

Keywords: microarray data Nonnegative Matrix Factorization principal component analysis visualization clustering analysis

A Discernibility-Based Approach to Feature Selection for microarray data

4th International IEEE Conference Intelligent Systems

Authors: Voulgaris, Zacharias; Magoulas, George D. (Univ London Birkbeck Coll Sch Comp Sci & Informat Syst London WC1E 7HX England)

ISBN: (print) 9781424417391

Feature selection has been used widely for a variety of data, yielding higher speeds and reduced computational cost for the classification process. However, it is in microarray datasets where its advantages become more evident and are more required. In this paper we present a novel approach to accomplish this based on the concept of discernibility that we introduce to depict how separated the classes of a dataset are. We develop and test two independent feature selection methods that follow this approach. The results of our experiments on four microarray datasets show that discernibility-based feature selection reduces the dimensionality of the datasets involved without compromising the performance of the classifiers.

Keywords: classification problems feature selection dimensionality reduction discernibility microarray data

Sparse regularized discriminant analysis with application to microarrays

COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2012, vol. 39, pp. 14-19

Authors: Li, Ran; Wu, Baolin (Univ Minnesota Sch Publ Hlth Div Biostat Minneapolis MN 55455 USA)

For cancer prediction using large-scale gene expression data, it often helps to incorporate gene interactions in the model. However, it is not straightforward to simultaneously select important genes while modeling gene interactions. Some heuristic approaches have been proposed in the literature. In this paper, we study a unified modeling approach based on L1-penalized likelihood estimation that can simultaneously select important genes and model gene interactions. We illustrate its competitive performance through simulation studies and applications to public microarray data. (c) 2012 Elsevier Ltd. All rights reserved.

Keywords: Lasso microarray data PCA Prediction

Ensemble methods for biclustering tasks

PATTERN RECOGNITION, 2012, vol. 45, no. 11, pp. 3938-3949

Authors: Hanczar, Blaise; Nadif, Mohamed (Univ Paris 05 LIPADE F-75006 Paris France)

Several biclustering algorithms have been proposed in different fields of microarray data analysis. We present a new approach that improves their performance by using ensemble methods. Ensemble biclustering is considered and formalized as a binary triclustering problem. We propose a simple and efficient algorithm to solve it. To illustrate the interest of our ensemble approach, numerical experiments are performed on both artificial and real datasets with two biclustering algorithms commonly used in bioinformatics. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Co-clustering Ensemble methods microarray data

Distance based feature selection for clustering microarray data

13th International Conference on Database Systems for Advanced Applications

Authors: Dash, Manoranjan; Gopalkrishnan, Vivekanand (Nanyang Technol Univ Singapore Singapore)

ISBN: (print) 9783540785675

In microarray data, clustering is the fundamental task for separating genes into biologically functional groups or for classifying tissues and phenotypes. Recently, with innovative gene expression microarray data technologies, thousands of expression levels of genes (features) can be measured simultaneously in a single experiment. The large number of genes with a lot of noise causes high complexity for cluster analysis. This challenge has raised the demand for feature selection - an effective dimensionality reduction technique that removes noisy features. In this paper we propose a novel filter method for feature selection. The suggested method, called ClosestFS, is based on a distance measure. For each feature, the distance is evaluated by computing its impact on the histogram for the whole data. Our experimental results show that the quality of clustering results (evaluated by several widely used measures) of K-means algorithm using ClosestFS as the pre-processing step is significantly better than that of the pure K-means.

Keywords: feature selection clustering distance function microarray data

Biomarker detection for the diagnosis of lymph node metastasis from oral squamous cell carcinoma

ORAL ONCOLOGY, 2012, vol. 48, no. 4, pp. 311-319

Authors: Kim, Ki-Yeol; Lee, Gui Youn; Cha, In-Ho (Yonsei Univ Dept Oral & Maxillofacial Surg Coll Dent Seoul 120752 South Korea; Yonsei Univ Oral Canc Res Inst Coll Dent Seoul 120752 South Korea; Yonsei Univ Canc Metastasis Res Ctr Coll Med Seoul 120752 South Korea)

Lymph node metastasis is an important prognostic factor in oral squamous cell carcinoma. However, the lack of significant biomarkers for lymph node metastasis can cause patients to be inappropriately treated and produce a poor prognosis. Therefore, there is a need to identify gene sets that are associated with lymph node metastasis. In this study, we used three expression datasets obtained from a public database and selected candidate gene sets that were related with lymph node metastasis from two datasets and a combined dataset. We evaluated the selected gene set using OOB error rates in a validation dataset. The gene set detected from the combined dataset classified the lymph node status more accurately in the validation dataset and clear expression patterns classifying the lymph node status based on chromosomal location were observed. The combined dataset holds promise for use as a more accurate candidate gene set for the diagnosis of lymph node metastasis and the selected gene set could be used for biological validation in further studies. (C) 2011 Elsevier Ltd. All rights reserved.

Keywords: Lymph node metastasis Biomarker Gene expression microarray data Combined dataset Array comparative genomic hybridization (aCGH) data Oral squamous cell carcinoma

A Copula-Based Algorithm for Discovering Patterns of Dependent Observations

JOURNAL OF CLASSIFICATION, 2012, vol. 29, no. 1, pp. 50-75

Authors: Di Lascio, F. Marta L.; Giannerini, Simone (Univ Bologna Dipartimento Sci Stat I-40126 Bologna Italy)

The main aim of this work is the study of clustering dependent data by means of copula functions. Copulas are popular multivariate tools whose importance within clustering methods has not yet been investigated in detail. We propose a new algorithm (CoClust for short) that clusters dependent data according to the multivariate structure of the generating process without any assumption on the margins. Moreover, the approach requires neither choosing a starting classification nor setting the number of clusters a priori; in fact, CoClust selects them by using a criterion based on the log-likelihood of a copula fit. We test our proposal on simulated data for different dependence scenarios and compare it with a model-based clustering technique. Finally, we show applications of CoClust to real microarray data of breast-cancer patients.

Keywords: Clustering methods CoClust algorithm Copula functions Model-based clustering microarray data
