Search results - Inner Mongolia University Library

A quantum leap in the reproducibility, precision, and sensitivity of gene expression profile analysis even when sample size is extremely small

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2015, Vol. 13, No. 4, pp. 1550018-1550018

Authors: Lim, Kevin; Li, Zhenhua; Choi, Kwok Pui; Wong, Limsoon

Affiliations: Natl Univ Singapore, Sch Comp, 13 Comp Dr, Singapore 117417, Singapore; Natl Univ Singapore, Dept Pediat, 10 Med Dr, Singapore 117597, Singapore; Natl Univ Singapore, Dept Stat & Appl Probabil, 6 Sci Dr 2, Singapore 117546, Singapore

Transcript-level quantification is often measured across two groups of patients to aid the discovery of biomarkers and the detection of biological mechanisms involving these biomarkers. Statistical tests lack power and the false discovery rate is high when sample size is small. Yet many experiments have very few samples (<= 5). This creates the impetus for a method to discover biomarkers and mechanisms under very small sample sizes. We present a powerful method, ESSNet, that is able to identify subnetworks consistently across independent datasets of the same disease phenotypes even under very small sample sizes. The key idea of ESSNet is to fragment large pathways into smaller subnetworks and compute a statistic that discriminates the subnetworks in two phenotypes. We do not greedily select genes to be included based on differential expression but rely on gene-expression-level ranking within a phenotype, which is shown to be stable even under extremely small sample sizes. We test our subnetworks on null distributions obtained by array rotation; this preserves the gene-gene correlation structure and is suitable for datasets with small sample size, allowing us to consistently predict relevant subnetworks even when sample size is small. For most other methods, this consistency drops to less than 10% when we test them on datasets with only two samples from each phenotype, whereas ESSNet is able to achieve an average consistency of 58% (72% when we consider genes within the subnetworks) and continues to be superior when sample size is large. We further show that the subnetworks identified by ESSNet are highly correlated to many references in the biological literature. ESSNet and supplementary material are available at: http://***:8080/essnet.

Keywords: microarray data analysis; gene expression profiling; subnetworks; biological pathways; GSEA; SNet; FSNet; PFSNet
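
The abstract above says the method relies on gene-expression-level ranking within a phenotype rather than on differential expression. A minimal sketch of what such within-phenotype ranking could look like follows. It is not ESSNet's actual statistic: the function names, the toy expression values, and the choice to average ranks across samples are all assumptions made for illustration.

```python
def ranks_within_sample(values):
    """Rank genes in one sample: rank 1 = highest expression.

    Ties are broken by original gene order (an arbitrary choice for this sketch).
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, gene_index in enumerate(order, start=1):
        ranks[gene_index] = position
    return ranks


def mean_rank_per_gene(samples):
    """Average each gene's rank over all samples of one phenotype.

    `samples` is a list of per-sample expression lists in the same gene order.
    """
    per_sample = [ranks_within_sample(s) for s in samples]
    n_samples = len(samples)
    n_genes = len(samples[0])
    return [sum(r[g] for r in per_sample) / n_samples for g in range(n_genes)]


# Hypothetical toy data: two samples from one phenotype, three genes.
phenotype_a = [
    [9.1, 2.0, 5.5],
    [8.7, 1.5, 6.0],
]
mean_ranks = mean_rank_per_gene(phenotype_a)
print(mean_ranks)
```

Ranks depend only on the ordering of genes within each sample, which is one plausible reason such a quantity could remain stable when only a handful of samples are available.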

On Efficient Feature Ranking Methods for High-Throughput Data Analysis

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2015, Vol. 12, No. 6, pp. 1374-1384

Authors: Liao, Bo; Jiang, Yan; Liang, Wei; Peng, Lihong; Peng, Li; Hanyurwimfura, Damien; Li, Zejun; Chen, Min

Affiliations: Hunan Univ, Key Lab Embedded & Network Comp Hunan Prov, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China; Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan 411201, Peoples R China; Changsha Med Univ, Dept Comp Sci & Engn, Changsha 410219, Hunan, Peoples R China

Efficient mining of high-throughput data has become one of the popular themes in the big data era. Existing biology-related feature ranking methods mainly focus on statistical and annotation information. In this study, two efficient feature ranking methods are presented. Multi-target regression and graph embedding are incorporated in an optimization framework, and feature ranking is achieved by introducing a structured sparsity norm. Unlike existing methods, the presented methods have two advantages: (1) the feature subset simultaneously accounts for global margin information as well as locality manifold information. Consequently, both global and locality information are considered. (2) Features are selected in batches rather than individually in the algorithm framework. Thus, the interactions between features are considered and the optimal feature subset can be guaranteed. In addition, this study presents a theoretical justification. Empirical experiments demonstrate the effectiveness and efficiency of the two algorithms in comparison with some state-of-the-art feature ranking methods through a set of real-world gene expression data sets.

Keywords: Feature ranking; l(2,1)-norm; microarray data analysis; convex optimization; regression; manifold learning
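
The keywords above name the l(2,1)-norm as the structured sparsity norm. As a hedged illustration only, the sketch below computes that norm for a small weight matrix and ranks features by their row norms, which is a common way such a norm is used for feature ranking. The matrix `W`, the function names, and the convention that rows correspond to features and columns to regression targets are assumptions, not details taken from the paper.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l(2,1)-norm: the sum of the Euclidean norms of the rows of W."""
    return sum(row_norms(W))


def rank_features(W):
    """Feature indices ordered from largest to smallest row norm."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: -norms[i])


# Hypothetical weight matrix: 3 features (rows) by 2 targets (columns).
W = [
    [3.0, 4.0],
    [0.0, 0.0],
    [1.0, 0.0],
]
print(l21_norm(W))       # 5 + 0 + 1
print(rank_features(W))  # features ordered by row norm
```

Penalizing this norm pushes whole rows toward zero, so a feature is kept or dropped across all targets at once; the row with zero norm in the toy matrix lands last in the ranking.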

Global Identification and Characterization of Pathogen-Inducible Rice Promoters

2015 Annual Meeting of the Chinese Society for Plant Pathology

Authors: KONG Wei-wen; CHENG Jia; LI Bin

Affiliation: School of Horticulture and Plant Protection, Yangzhou University

Pathogen-inducible plant promoters (PIPs) are able to respond to pathogens after infection, which are usually activated by pathogens only at the time point after *** could have applications as molecular markers, and for engineering crops with increased disease *** study obtained 62 pathogen-inducible plant promoters sequences from 14 species through literatures and public *** potential cis-acting elements of the above PIPs were identified using Plant CARE and PLACE *** candidate rice PIPs, which contain potential pathogen-inducible AS-1, G-box, H-box and GCC-box cis-acting elements, were screened from rice promoterome *** a result, total 417 candidate rice PIPs were *** genes under the control of potential rice PIPs were annotated by searching NCBI COG database with their sequences as *** the candidate genes, 55.26% are function-unknown; 13.16% may involve in metabolism; 9.57% may function in the cellular processes and signaling; 8.61% are poorly characterized; 8.37% may play roles in the information storage and processing; and 5.02% may act as dual *** validate the 417 candidate rice PIPs, several microarray data of infected rice by pathogens were downloaded from public database and *** results indicated that changes with highly significance (p < 0.01) were observed in transcriptional level from 128 candidate genes controlled by *** the genes, 92 genes were upregulated, and 63 genes were down-regulated in the microarray *** a control, 20 rice genes controlled by non-PIPs (out of the PIPs) were randomly selected from 12 rice chromosomes (one or two genes selected from each chromosome) and their expression were analyzed based on the above microarray *** total 139 microarray treatments, 5 gene-events were observed whose expressions varied on the significant *** false positive rate is 3.60% (5/139). The results of this study would underlie the elucidation of mechanisms by which rice PIPs regulate g

Keywords: Rice; Pathogen-inducible promoters; cis-acting elements; Regulation of gene expression; microarray data analysis

PBC: A Software Framework Facilitating Pattern-Based Clustering for Microarray Data Analysis

International Joint Conference on Bioinformatics, Systems Biology and Intelligent Computing

Authors: Shin, Dong-Guk; Hong, Seung-Hyun; Joshi, Pujan; Nori, Ravi; Pei, Baikang; Wang, Hsin-Wei; Harrington, Patrick; Kuo, Lynn; Kalajzic, Ivo; Rowe, David

Affiliations: Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269, USA; Univ Connecticut, Dept Stat, Storrs, CT 06269, USA; Univ Connecticut, Ctr Hlth, Dept Genet & Dev Biol, Farmington, CT 06030, USA

ISBN: (print) 9780769537399

Microarray data produce expression patterns of thousands of genes at once. Grouping these gene expression patterns so that each group conveys some biologically meaningful insight entails the use of a clustering method. Two problems exist when attempting to use conventional clustering methods for microarray data analysis. The presence of outliers skews the mean value computation, which in turn influences the placement of inconsistent gene expression patterns into one group. The clustering algorithms themselves generally cannot determine the right size of the clusters. We present a new method which approaches the clustering problem from a different angle. That is, the clustering of gene expression patterns is better dealt with within a software framework that helps biologists derive the right size of clusters utilizing their understanding of the experimental context once the baseline clusters are computed using the fold changes of gene expression levels. We discuss our experiences of using the framework in analyzing numerous microarray data experiments.

Keywords: clustering; microarray data analysis; data mining; bioinformatic; gene expression pattern

TotalPLS: Local Dimension Reduction for Multicategory Microarray Data

IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2014, Vol. 44, No. 1, pp. 125-138

Authors: You, Wenjie; Yang, Zijiang; Yuan, Mingshun; Ji, Guoli

Affiliations: Xiamen Univ, Dept Automat, Xiamen 361005, Fujian, Peoples R China; York Univ, Sch Informat Technol, Toronto, ON M3J 1P3, Canada; Xiamen Univ, Innovat Ctr Cell Biol, Xiamen 361102, Peoples R China

Dimension reduction is an important topic in data mining, which is widely used in the areas of genetics, medicine, and bioinformatics. We propose a new local dimension reduction algorithm, TotalPLS, that operates in a unified partial least squares (PLS) framework and implements an information fusion of PLS-based feature selection and feature extraction. This paper focuses on extracting the potential structure hidden in high-dimensional multicategory microarray data, and on interpreting and understanding the results provided by the potential structure information. First, we propose using PLS-based recursive feature elimination (PLSRFE) in multicategory problems. Then, we perform feature importance analysis based on PLSRFE for high-dimensional microarray data to determine the informative feature (biomarker) subset, which relates to the studied tumor subtypes problem. Finally, PLS-based supervised feature extraction is conducted on the selected gene subset to extract comprehensive features that best reflect the nature of the classification and have discriminating ability. The proposed algorithm is compared with several state-of-the-art methods using multiple high-dimensional multicategory microarray datasets. Our comparison is performed in terms of recognition accuracy, relevance, and redundancy. Experimental results show that the proposed algorithm can improve the recognition rate and computational efficiency. Furthermore, mining potential structure information improves the interpretability and understandability of recognition results. The proposed algorithm can be effectively applied to microarray data analysis for the discovery of gene coexpression and coregulation.

Keywords: Dimension reduction; feature extraction; feature selection; microarray data analysis; partial least squares (PLS)

Self-service infrastructure container for data intensive application

JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS, 2014, Vol. 3, No. 1, pp. 1-21

Authors: Musa, Ibrahim K.; Walker, Stuart D.; Owen, Anne M.; Harrison, Andrew P.

Affiliations: Univ Essex, Sch Comp Sci & Elect Engn, Wivenhoe Pk, Colchester CO4 3SQ, Essex, England; Univ Essex, Dept Math Sci & Biol Sci, Colchester CO4 3SQ, Essex, England

Cloud-based scientific data management (storage, transfer, analysis, and inference extraction) is attracting interest. In this paper, we propose a next-generation cloud deployment model suitable for data-intensive applications. Our model is a flexible, self-service, container-based infrastructure that delivers network, computing, and storage resources together with the logic to dynamically manage the components in a holistic manner. We demonstrate the strength of our model with a bioinformatics application. Dynamic algorithms for resource provisioning and job allocation suitable for the chosen dataset are packaged and delivered in a privileged virtual machine as part of the container. We tested the model on our private internal experimental cloud that is built on low-cost commodity hardware. We demonstrate the capability of our model to create the required network and computing resources and allocate submitted jobs. The results obtained show the benefits of increased automation in terms of both a significant improvement in the time to complete a data analysis and a reduction in the cost of analysis. The proposed algorithms reduced the cost of performing analysis by 50% at 15 GB of data analysis. The total time between submitting a job and writing the results after analysis was also reduced by more than 1 hr at 15 GB of data analysis.

Keywords: Cloud computing; microarray data analysis; Bioinformatics; Cells-As-A-Service

BI-COPAM ENSEMBLE CLUSTERING APPLICATION TO FIVE ESCHERICHIA COLI BACTERIAL DATASETS

22nd European Signal Processing Conference (EUSIPCO)

Authors: Abu-Jamous, Basel; Fa, Rui; Roberts, David J.; Nandi, Asoke K.

Affiliations: Brunel Univ, Dept Elect & Comp Engn, Uxbridge UB8 3PH, Middx, England; Univ Oxford, John Radcliffe Hosp, Natl Hlth Serv Blood & Transplant, Oxford OX3 9UB, England; Univ Jyvaskyla, Dept Math Informat Technol, Jyvaskyla, Finland

ISBN: (print) 9780992862619

Bi-CoPaM ensemble clustering has the ability to mine a set of microarray datasets collectively to identify the subsets of genes consistently co-expressed in all of them. It also has the capability of considering the entire gene set without pre-filtering, as it implicitly filters out less interesting genes. While it showed success in revealing new insights into the biology of yeast, it has never been applied to bacteria. In this study, we apply Bi-CoPaM to five bacterial datasets, identifying two clusters of genes as the most consistently co-expressed. Strikingly, their average profiles are consistently negatively correlated in most of the datasets. Thus, we hypothesise that they are regulated by a common biological machinery, and that their genes with unknown biological processes may be participating in the same processes in which most of their genes are known to participate. Additionally, our results demonstrate the applicability of Bi-CoPaM to a wide range of species.

Keywords: Bi-CoPaM; microarray data analysis; gene clustering; Escherichia coli bacteria

KRLMM: an adaptive genotype calling method for common and low frequency variants

BMC BIOINFORMATICS, 2014, Vol. 15, No. 1, pp. 1-11

Authors: Liu, Ruijie; Dai, Zhiyin; Yeager, Meredith; Irizarry, Rafael A.; Ritchie, Matthew E.

Affiliations: Walter & Eliza Hall Inst Med Res, Mol Med Div, Parkville, Vic 3052, Australia; NCI Frederick, Canc Genom Res Lab, SAIC Frederick Inc, Frederick, MD 20877, USA; Dana Farber Canc Inst, Dept Biostat & Computat Biol, Boston, MA 02215, USA; Univ Melbourne, Dept Math & Stat, Parkville, Vic 3010, Australia; Univ Melbourne, Dept Med Biol, Parkville, Vic 3010, Australia

Background: SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contains extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency variants, as the underlying models rely on each genotype having a reasonable number of observations to ensure accurate clustering. Results: Here we develop KRLMM, a new method for converting raw intensities into genotype calls that aims to overcome this issue. Our method is unique in that it applies careful between-sample normalization and allows a variable number of clusters k (1, 2 or 3) for each SNP, where k is predicted using the available data. We compare our method to four genotyping algorithms (GenCall, GenoSNP, Illuminus and OptiCall) on several Illumina data sets that include samples from the HapMap project where the true genotypes are known in advance. All methods were found to have high overall accuracy (> 98%), with KRLMM consistently amongst the best. At low minor allele frequency, the KRLMM, OptiCall and GenoSNP algorithms were observed to be consistently more accurate than GenCall and Illuminus on our test data. Conclusions: Methods that tailor their approach to calling low frequency variants by either varying the number of clusters (KRLMM) or using information from other SNPs (OptiCall and GenoSNP) offer improved accuracy over methods that do not (GenCall and Illuminus). The KRLMM algorithm is implemented in the open-source crlmm package distributed via the Bioconductor project (http://***).

Keywords: Genotyping; Clustering; microarray data analysis

BI-COPAM ENSEMBLE CLUSTERING APPLICATION TO FIVE ESCHERICHIA COLI BACTERIAL DATASETS

European Signal Processing Conference

Authors: Basel Abu-Jamous; Rui Fa; David J. Roberts; Asoke K. Nandi

Affiliations: Department of Electronic and Computer Engineering, Brunel University; National Health Service Blood and Transplant, The University of Oxford

ISBN: (print) 9781479946037

Keywords: Bi-CoPaM; microarray data analysis; Gene clustering; Escherichia coli bacteria

A survey of pattern classification-based methods for predicting survival time of lung cancer patients

IEEE International Conference on Bioinformatics and Biomedicine

Authors: Bin Gan; Chun-Hou Zheng; Hong-Qiang Wang

Affiliations: College of Information and Communication Technology, Qufu Normal University; College of Electrical Engineering and Automation, Anhui University; Intelligent Computing Lab, Institute of Intelligent Machines, Chinese Academy of Science

ISBN: (print) 9781479956708

Cancer prognosis is an important clinical practice in cancer medicine and an important factor in developing personalized medicine. Until now, however, research has focused on developing recurrence risk indices that indicate poor or good survival for given cancer patients. These indices are insufficient and elusive in the clinic. In this paper, we propose to predict the survival time of cancer patients using a pattern recognition approach, which is more informative and favorable to clinicians and patients in clinical practice. We conduct an extensive survey of pattern recognition methods for prognosis based on real-world benchmark microarray data sets. In particular, various types of data preprocessing methods and classification models are introduced and examined for predicting the survival time of lung cancer patients based on gene expression. The experimental results show that pattern recognition methods can provide a feasible and efficient way to predict the survival time of cancer patients. It is expected that the pattern classification-based strategy opens a new paradigm of cancer prognosis for predicting the survival time of cancer patients in the clinic.

Keywords: Pattern classification; Cancer prognosis; Lung cancer; Survival time; microarray data analysis
