Search results - Inner Mongolia University Library

Gene expression data clustering using a multiobjective symmetry based clustering technique

COMPUTERS IN BIOLOGY AND MEDICINE, 2013, vol. 43, no. 11, pp. 1965-1977

Authors: Saha, Sriparna; Ekbal, Asif; Gupta, Kshitija; Bandyopadhyay, Sanghamitra

Affiliations: Indian Inst Technol, Dept Comp Sci & Engn, Patna, Bihar, India; Indian Stat Inst, Machine Intelligence Unit, Kolkata, India

The invention of microarrays has rapidly changed the state of biological and biomedical research. Clustering algorithms play an important role in analyzing microarray data sets, where identifying groups of co-expressed genes is a very difficult task. Here we have posed the problem of clustering the microarray data as a multiobjective clustering problem. A new symmetry based fuzzy clustering technique is developed to solve this problem. The effectiveness of the proposed technique is demonstrated on five publicly available benchmark data sets. Results are compared with some widely used microarray clustering techniques. Statistical and biological significance tests have also been carried out. (C) 2013 Elsevier Ltd. All rights reserved.

Keywords: microarray data; Gene expression data clustering; Clustering; Multiobjective optimization (MOO); Symmetry; Archived multiobjective simulated annealing based technique (AMOSA); Automatic determination of number of clusters

Estimating prediction error in microarray classification: Modifications of the 0.632+ bootstrap when n < p

CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2013, vol. 41, no. 1, pp. 133-150

Authors: Jiang, Wenyu; Chen, Bingshu E.

Affiliations: Queens Univ, Dept Math & Stat, Kingston, ON K7L 3N6, Canada; Queens Univ, NCIC Clin Trials Grp, Kingston, ON K7L 3N6, Canada; Queens Univ, Dept Community Hlth & Epidemiol, Kingston, ON K7L 3N6, Canada

We are interested in estimating prediction error for a classification model built on high dimensional genomic data when the number of genes (p) greatly exceeds the number of subjects (n). We examine a distance argument supporting the conventional 0.632+ bootstrap proposed for the $n > p$ scenario, modify it for the $n < p$ situation and develop learning curves to describe how the true prediction error varies with the number of subjects in the training set. The curves are then applied to define adjusted resampling estimates for the prediction error in order to achieve a balance in terms of bias and variability. The adjusted resampling methods are proposed as counterparts of the 0.632+ bootstrap when $n < p$, and are found to improve on the 0.632+ bootstrap and other existing methods in the microarray study scenario when the sample size is small and there is some level of differential expression. The Canadian Journal of Statistics 41: 133-150; 2013 (c) 2012 Statistical Society of Canada

Keywords: Bootstrap; 0.632+ bootstrap; class prediction; cross-validation; feature selection; learning curve; microarray data; prediction error
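
For orientation, the conventional 0.632+ estimator that the abstract above takes as its starting point can be sketched as follows. This is the standard formulation usually attributed to Efron and Tibshirani, not the adjusted estimators proposed in the paper. The function and argument names are hypothetical, and the three inputs (resubstitution error, leave-one-out bootstrap error, no-information error rate) are assumed to have been computed elsewhere.

```python
def err632_plus(err_resub, err_boot, gamma):
    """Conventional 0.632+ prediction-error estimate (illustrative sketch).

    err_resub: resubstitution (training) error rate
    err_boot:  leave-one-out bootstrap error rate
    gamma:     no-information error rate
    All names are assumptions made for this example.
    """
    # The bootstrap error is capped at the no-information rate.
    err_boot = min(err_boot, gamma)

    # Relative overfitting rate, set to 0 when it is not well defined.
    if err_boot > err_resub and gamma > err_resub:
        r = (err_boot - err_resub) / (gamma - err_resub)
    else:
        r = 0.0

    # The weight moves from 0.632 (no overfitting) towards 1 (r = 1).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_resub + w * err_boot
```

When the resubstitution and bootstrap errors coincide, the overfitting rate is 0 and the estimate reduces to the plain 0.632 weighting of the two errors.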

Learning the local Bayesian network structure around the ZNF217 oncogene in breast tumours

COMPUTERS IN BIOLOGY AND MEDICINE, 2013, vol. 43, no. 4, pp. 334-341

Authors: Prestat, Emmanuel; de Morais, Sergio Rodrigues; Vendrell, Julie A.; Thollet, Aurelie; Gautier, Christian; Cohen, Pascale A.; Aussem, Alex

Affiliations: Univ Lyon, F-69000 Lyon, France; Univ Lyon 1, F-69000 Lyon, France; Univ Lyon 1, CNRS, UMR5558, Lab Biometrie & Biol Evolut, F-69622 Villeurbanne, France; INRIA Rhone Alpes, BAMBOO Team, Rhone Alpes, France; Ecole Cent Lyon, F-69134 Ecully, France; Univ Lyon 1, CNRS, UMR5205, Lab Informat Image & Syst Informat, F-69622 Villeurbanne, France; Ctr Rech Cancerol Lyon, Inserm, U1052, F-69000 Lyon, France; Ctr Rech Cancerol Lyon, CNRS, UMR5286, F-69000 Lyon, France; Ctr Leon Berard, F-69000 Lyon, France

In this study, we discuss and apply a novel and efficient algorithm for learning a local Bayesian network model in the vicinity of the ZNF217 oncogene from breast cancer microarray data without having to decide in advance which genes have to be included in the learning process. ZNF217 is a candidate oncogene located at 20q13, a chromosomal region frequently amplified in breast and ovarian cancer, and correlated with shorter patient survival in these cancers. To properly address the difficulties in managing complex gene interactions given our limited sample, statistical significance of edge strengths was evaluated using bootstrapping and the less reliable edges were pruned to increase the network robustness. We found that 13 out of the 35 genes associated with deregulated ZNF217 expression in breast tumours have been previously associated with survival and/or prognosis in cancers. Identifying genes involved in lipid metabolism opens new fields of investigation to decipher the molecular mechanisms driven by the ZNF217 oncogene. Moreover, nine of the 13 genes have already been identified as putative ZNF217 targets by independent biological studies. We therefore suggest that the algorithms for inferring local BNs are valuable data mining tools for unraveling complex mechanisms of biological pathways from expression data. The source code is available at http://***-lyon1.fr/~aaussem/***. (c) 2012 Elsevier Ltd. All rights reserved.

Keywords: Machine learning; Bayesian networks; Feature selection; microarray data; Breast cancer; Ovarian cancer

A Positive False Discovery Rate Convergence Result

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2013, vol. 42, no. 23, pp. 4239-4246

Author: Melnykov, Igor

Affiliation: Colorado State Univ, Dept Math & Phys, Pueblo, CO 81001, USA

The positive false discovery rate (pFDR) is the average proportion of false rejections given that the overall number of rejections is greater than zero. Assuming that the proportion of true null hypotheses, proportion of false positives, and proportion of true positives all converge pointwise, the pFDR converges to a continuous limit uniformly over all significance levels. We show that the uniform convergence still holds under the weaker assumption that the proportion of true positives converges in $L^1$.

Keywords: Differentially expressed genes; FDR; microarray data; Multiple testing; pFDR
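
In symbols, the definition given in the abstract above can be written as follows. The notation is an assumption made here, following common usage rather than anything shown in this record: $V$ denotes the number of false rejections and $R$ the total number of rejections.

```latex
\mathrm{pFDR} = \mathbb{E}\!\left[ \frac{V}{R} \,\middle|\, R > 0 \right]
```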

Detecting Dense Subgraphs in Complex Networks Based on Edge Density Coefficient

CHINESE JOURNAL OF ELECTRONICS, 2013, vol. 22, no. 3, pp. 517-520

Authors: Guan Bo; Zan Xiangzhen; Xiao Biyu; Ma Runnian; Zhang Fengyue; Liu Wenbin

Affiliations: Ningbo Univ Technol, Coll Electron & Informat Engn, Ningbo 315016, Zhejiang, Peoples R China; Wenzhou Univ, Dept Phys & Elect Informat Engn, Wenzhou 325035, Zhejiang, Peoples R China; Air Force Engn Univ, Telecommun Inst, Xian 710077, Peoples R China; Beijing Inst Technol, Sch Life Sci & Technol, Dept Biomed Engn, Beijing 100081, Peoples R China

Densely connected patterns in biological networks can help biologists gain meaningful insights. How to detect dense subgraphs effectively and quickly has been an urgent challenge in recent years. In this paper, we propose a local measure named the edge density coefficient, which indicates whether or not an edge lies within a dense subgraph. Simulation results show that this measure can improve both the accuracy and the speed of dense subgraph detection. Thus, the G-N algorithm can be extended to large biological networks by means of this local measure. Finally, we apply this algorithm to microarray data sets of Saccharomyces cerevisiae and perform gene ontology analysis of the result using GOEAST.

Keywords: Complex network; Dense subgraph; microarray data

PCA consistency for the power spiked model in high-dimensional settings

JOURNAL OF MULTIVARIATE ANALYSIS, 2013, vol. 122, pp. 334-354

Authors: Yata, Kazuyoshi; Aoshima, Makoto

Affiliation: Univ Tsukuba, Inst Math, Ibaraki 3058571, Japan

In this paper, we propose a general spiked model called the power spiked model in high-dimensional settings. We derive relations among the data dimension, the sample size and the high-dimensional noise structure. We first consider asymptotic properties of the conventional estimator of eigenvalues. We show that the estimator is affected by the high-dimensional noise structure directly, so that it becomes inconsistent. In order to overcome such difficulties in a high-dimensional situation, we develop new principal component analysis (PCA) methods called the noise-reduction methodology and the cross-data-matrix methodology under the power spiked model. We show that the new PCA methods can enjoy consistency properties not only for eigenvalues but also for PC directions and PC scores in high-dimensional settings. (C) 2013 Elsevier Inc. All rights reserved.

Keywords: Cross-data-matrix methodology; HDLSS; Large p small n; microarray data; Noise-reduction methodology

Semantic Subgroup Discovery Systems and Workflows in the SDM-Toolkit

COMPUTER JOURNAL, 2013, vol. 56, no. 3, pp. 304-320

Authors: Vavpetic, Anze; Lavrac, Nada

Affiliations: Jozef Stefan Inst, Ljubljana 1000, Slovenia; Univ Nova Gorica, Nova Gorica, Slovenia

This paper addresses semantic data mining, a new data mining paradigm in which ontologies are exploited in the process of data mining and knowledge discovery. This paradigm is introduced together with new semantic subgroup discovery systems SDM-search for enriched gene sets (SEGS) and SDM-Aleph. These systems are made publicly available in the new SDM-Toolkit for semantic data mining. The toolkit is implemented in the Orange4WS data mining platform that supports knowledge discovery workflow construction from local and distributed data mining services. On the basis of the experimental evaluation of semantic subgroup discovery systems on two publicly available biomedical datasets, the paper results in a thorough quantitative and qualitative evaluation of SDM-SEGS and SDM-Aleph and their comparison with SEGS, a system for enriched gene set discovery from microarray data.

Keywords: semantic data mining; relational data mining; inductive logic programming; domain knowledge; subgroup discovery; ontologies; microarray data

Find Significant Gene Information Based on Changing Points of microarray data

IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2009, vol. 56, no. 4, pp. 1108-1116

Authors: Liu, Yihui; Bai, Li

Affiliations: Shandong Inst Light Ind, Inst Intelligent Informat Proc, Sch Informat Sci & Technol, Jinan 250353, Peoples R China; Univ Nottingham, Sch Comp Sci, Nottingham NG8 1BB, England

For transformations, a set of new basis is normally chosen for the data. The selection of the new basis determines the properties that will be held by the transformed data. For wavelet transform, a set of wavelet basis aims to detect the localized features contained in microarray data. In this research, we investigate the performance of wavelet features based on wavelet detail coefficients at third level in wavelet space, which characterize the changing points of microarray data based on high-order information. In order to find the significant gene information, we reconstruct wavelet details based on detail coefficients. A genetic algorithm is used to select the best features from reconstructed details in original data space, and corresponding gene information is detected based on selected features. Experiments are carried out on four datasets and experimental results show that good performance is achieved based on twofold cross-validation experiments.

Keywords: Features extraction; feature optimization; microarray data; wavelet analysis

Prediction of core cancer genes using multi-task classification framework

JOURNAL OF THEORETICAL BIOLOGY, 2013, vol. 317, pp. 62-70

Authors: Gao, Shan; Xu, Shuo; Fang, Yaping; Fang, Jianwen

Affiliations: Univ Kansas, Appl Bioinformat Lab, Lawrence, KS 66047, USA; Inst Sci & Tech Informat China, Beijing 100038, Peoples R China

Cancer is deemed as a highly heterogeneous disease specific to cell type and tissue origin. All cancers, however, share a common pathogenesis. Therefore, it is widely believed that cancers may share common mechanisms. In this study, we introduce a novel strategy based on multi-tasking learning methods to predict core cancer genes shared by multiple cancers in the hope of elucidating common cancer mechanisms. Our strategy uses two multi-tasking learning algorithms, one for feature selection and the other for validation of selected features. The combined use of two methods results in more robust classifiers and reliable selected features. The top 73 significant features, mapped to 72 genes, are selected as core cancer genes. The effectiveness of the 73 features is further demonstrated in a blind test conducted on an independent test data. The biological significance of these genes is evaluated using systems biology analyses. Extensive functional, pathway and network analysis confirms findings in previous studies and brings new insights into common cancer mechanisms. Our strategy can be used as a general method to find important genes from large gene expression datasets on the genomic level. The selected genes can be used to predict cancers. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Multi-task learning; Classification; Core cancer genes; Gene differential expression; microarray data

MicroRNA-mRNA interaction network using TSK-type recurrent neural fuzzy network

GENE, 2013, vol. 515, no. 2, pp. 385-390

Authors: Vineetha, S.; Bhat, C. Chandra Shekara; Idicula, Sumam Mary

Affiliations: Govt Engn Coll, Dept Comp Sci, Idukki, Kerala, India; Natl Inst Interdisciplinary Sci & Technol, Trivandrum, Kerala, India; Cochin Univ Sci & Technol, Cochin 682016, Kerala, India

MicroRNAs are short non-coding RNAs that can regulate gene expression during various crucial cell processes such as differentiation, proliferation and apoptosis. Changes in expression profiles of miRNA play an important role in the development of many cancers, including colorectal cancer (CRC). Therefore, the identification of cancer related miRNAs and their target genes is important for cancer biology research. In this paper, we applied TSK-type recurrent neural fuzzy network (TRNFN) to infer miRNA-mRNA association network from paired miRNA, mRNA expression profiles of CRC patients. We demonstrated that the method we proposed achieved good performance in recovering known experimentally verified miRNA-mRNA associations. Moreover, our approach proved successful in identifying 17 validated cancer miRNAs which are directly involved in the CRC related pathways. Targeting such miRNAs may help not only to prevent the recurrence of disease but also to control the growth of advanced metastatic tumors. Our regulatory modules provide valuable insights into the pathogenesis of cancer. (C) 2012 Elsevier B.V. All rights reserved.

Keywords: MicroRNA; microarray data; MicroRNA-mRNA Interaction Network; TSK-type recurrent neural fuzzy network; Fuzzy logic
