Search Results - Inner Mongolia University Library

Microarray Data Classification Based on Computational Verb

IEEE ACCESS, 2019, vol. 7, pp. 103310-103324

Authors: Liu, Kun-Hong; Ng, Vincent To Yee; Liong, Sze-Teng; Hong, Qingqi. Affiliations: Xiamen Univ Sch Informat Xiamen 361005 Fujian Peoples R China; Hong Kong Polytech Univ Dept Comp Hong Kong Peoples R China; Feng Chia Univ Dept Elect Engn Taichung 40724 Taiwan

Computational verb (CV) theory is a relatively new research field in mathematics and has been applied to many different fields. In the field of pattern recognition, the CV-based rule induction algorithm can generate simple rules with CVs and adverbs in linguistically interpretable forms. In this paper, we present an interpretable rule extraction framework based on CV rule theory for the classification of microarray data. In contrast to the existing rule-based methods, the CV method makes it possible to explicitly express the relationships of the genes based on some mathematical templates and hence enhances the understanding of the results. Stay is a typical verb used in the CV to describe the trend of changes. In our algorithm, Stay is applied to generate CVR by a gene pair, named SCVR. The corresponding evolving and similarity functions for calculating the difference between SCVR rules are also presented to illustrate this process. Similar to other rule-based methods, the SCVR can perform significant gene selection and cancer classification concurrently. To evaluate the performance of our proposed approach, we conduct experiments on several binary class and multiclass microarray datasets. Experiments confirm that the proposed method can outperform many rule-based classifiers with the fusion of five rules.

Keywords: Computational verb; computational verb rules; stay; microarray data; classifier ensemble

Modeling microarray data using a threshold mixture model

BIOMETRICS, 2004, vol. 60, no. 2, pp. 376-387

Authors: Kauermann, G; Eilers, P. Affiliations: Univ Bielefeld Dept Econ & Business Adm D-33501 Bielefeld Germany; Leiden Univ Med Ctr Dept Med Stat NL-2300 RC Leiden Netherlands

An important goal of microarray studies is the detection of genes that show significant changes in expression when two classes of biological samples are being compared. We present an ANOVA-style mixed model with parameters for array normalization, overall level of gene expression, and change of expression between the classes. For the latter we assume a mixing distribution with a probability mass concentrated at zero, representing genes with no changes, and a normal distribution representing the level of change for the other genes. We estimate the parameters by optimizing the marginal likelihood. To make this practical, Laplace approximations and a backfitting algorithm are used. The performance of the model is studied by simulation and by application to publicly available data sets.

Keywords: backfitting; Laplace approximation; marginal likelihood; microarray data; mixed model

Robust gene selection methods using weighting schemes for microarray data analysis

BMC BIOINFORMATICS, 2017, vol. 18, no. 1, pp. 1-15

Authors: Kang, Suyeon; Song, Jongwoo. Affiliation: Ewha Womans Univ Dept Stat Seoul South Korea

Background: A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. Results: We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. Conclusions: The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.

Keywords: microarray data; Gene selection method; Significance analysis of microarrays; Noisy data; Robustness; False discovery rate
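
The abstract above builds on significance analysis of microarrays (SAM). As a rough illustration of the baseline only, the sketch below computes the standard SAM relative difference for a single gene, d = (mean_a - mean_b) / (s + s0). It does not reproduce the weighting schemes proposed in the paper; the function name `sam_statistic` and the default `s0` are assumptions made for this example.

```python
from math import sqrt


def sam_statistic(group_a, group_b, s0=0.0):
    """Relative difference d = (mean_a - mean_b) / (s + s0) for one gene.

    group_a, group_b: expression values of the gene in the two states.
    s0: small positive "fudge" constant that damps genes whose variance
        is close to zero (hypothetical default; SAM tunes it from the data).
    Note: with s0 == 0 and identical values in both groups, s is 0 and
    the division fails, which is one reason s0 exists.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Pooled sum of squared deviations around each group mean.
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    # Gene-specific standard error of the difference in means.
    s = sqrt((1.0 / n_a + 1.0 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


d_plain = sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
d_damped = sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], s0=1.0)
```

Genes would then be ranked by the absolute value of d. A larger `s0` shrinks every score toward zero, and it shrinks low-variance genes the most.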

Distributed feature selection: An application to microarray data classification

APPLIED SOFT COMPUTING, 2015, vol. 30, pp. 136-150

Authors: Bolon-Canedo, V.; Sanchez-Marono, N.; Alonso-Betanzos, A. Affiliation: Univ A Coruna Dept Comp Sci Lab Res & Dev Artificial Intelligence LIDIA La Coruna 15071 Spain

Feature selection is often required as a preliminary step for many pattern recognition problems. However, most of the existing algorithms only work in a centralized fashion, i.e. using the whole dataset at once. In this research a new method for distributing the feature selection process is proposed. It distributes the data by features, i.e. according to a vertical distribution, and then performs a merging procedure which updates the feature subset according to improvements in the classification accuracy. The effectiveness of our proposal is tested on microarray data, which poses a difficult challenge for researchers due to the large number of gene expression values and the small sample size. The results on eight microarray datasets show that the execution time is considerably shortened whereas the performance is maintained or even improved compared to the standard algorithms applied to the non-partitioned datasets. (C) 2015 Elsevier B.V. All rights reserved.

Keywords: Feature selection; Distributed learning; microarray data

Microarray Data Analysis of Yeast Data using Sparse Non-Negative Matrix Factorization

International Conference on Computational Science and Computational Intelligence (CSCI)

Authors: Passi, Kalpdrum; Draper, Paul; Santala, Jillana; Jain, Chakresh Kumar. Affiliations: Laurentian Univ Dept Math & Comp Sci Sudbury ON Canada; Jaypee Inst Informat Technol Dept Biotechnol Noida India

ISBN: (print) 9781538626528

Microarray expression data contain observations from thousands of genes across hundreds of samples. To extract meaningful information from these large datasets, the dimensionality reduction technique known as non-negative matrix factorization, or NMF, is introduced. This tool transforms the data and makes it more amenable to clustering. NMF was applied to a yeast microarray dataset. Three main clusters were discovered, corresponding to three distinct metabolic cycles. The data were also clustered using the k-means algorithm, and the clustering result was highly similar to that obtained by NMF.

Keywords: sparse non-negative matrix factorization; yeast; microarray data; k-means

An improved binary particle swarm optimization algorithm for clinical cancer biomarker identification in microarray data

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, vol. 244, pp. 107987-107987

Authors: Yang, Guicheng; Li, Wei; Xie, Weidong; Wang, Linjie; Yu, Kun. Affiliations: Northeastern Univ Coll Comp Sci & Engn Shenyang 110000 Liaoning Peoples R China; Northeastern Univ Key Lab Intelligent Comp Med Image MIIC Minist Educ Shenyang 110000 Liaoning Peoples R China; Natl Frontiers Sci Ctr Ind Intelligence & Syst Op Shenyang 110819 Peoples R China; Northeastern Univ Coll Med & Bioinformat Engn Shenyang 110819 Liaoning Peoples R China

Background and Objective: The limited number of samples and high-dimensional features in microarray data make selecting a small number of features for disease diagnosis a challenging problem. Traditional feature selection methods based on evolutionary algorithms struggle to find the optimal set of features in a limited time when dealing with the high-dimensional feature selection problem. New solutions are proposed to solve the above problems. Methods: In this paper, we propose a hybrid feature selection method (C-IFBPFE) for biomarker identification in microarray data, which combines clustering and improved binary particle swarm optimization while incorporating an embedded feature elimination strategy. Firstly, an adaptive redundant feature judgment method based on correlation clustering is proposed for feature screening to reduce the search space in the subsequent stage. Secondly, we propose an improved flipping probability-based binary particle swarm optimization (IFBPSO), better applicable to the binary particle swarm optimization problem. Finally, we also design a new feature elimination (FE) strategy embedded in the binary particle swarm optimization algorithm. This strategy gradually removes poorer features during iterations to reduce the number of features and improve accuracy. Results: We compared C-IFBPFE with other published hybrid feature selection methods on eight public datasets and analyzed the impact of each improvement. The proposed method outperforms other current state-of-the-art feature selection methods in terms of accuracy, number of features, sensitivity, and specificity. The ablation study of this method validates the efficacy of each component; in particular, the proposed feature elimination strategy significantly improves the performance of the algorithm. Conclusions: The hybrid feature selection method proposed in this paper helps address the issue of high-dimensional microarray data with few samples. It can select a small subset of

Keywords: microarray data; Feature selection; Clustering; Particle swarm optimization; Embedded feature elimination

Determination of cluster number in clustering microarray data

APPLIED MATHEMATICS AND COMPUTATION, 2005, vol. 169, no. 2, pp. 1172-1185

Authors: Shen, JD; Chang, SI; Lee, ES; Deng, YP; Brown, SJ. Affiliations: Kansas State Univ Dept Ind & Mfg Syst Engn Manhattan KS 66506 USA; Kansas State Univ Div Biol Bioinformat Program Manhattan KS 66506 USA

The general purpose of clustering analysis of microarray data is to organize the data into meaningful groups based on their closeness. Although various algorithms have been proposed for the clustering of microarray data, the main difficulty remains the determination of the optimal number of clusters. To complicate the problem further, meaningful groups or closeness cannot be well defined due to the fuzzy nature of the data. This paper proposes a dynamic validity index to overcome this problem. The proposed index, in addition to the dynamic aspects, also takes care of both the intra- and the inter-distances of the clusters. An algorithm based on the proposed dynamic validity index and the traditional K-means method was developed. To make the proposed dynamic validity index more flexible, a modulating parameter gamma is introduced. This parameter can be used to take care of noisy data and balance the importance between compactness and separateness in the clusters. To illustrate the effectiveness of the approach, a numerical example using the human serum data from the literature was solved, and the sensitivity and robustness of the approach were examined. (c) 2004 Elsevier Inc. All rights reserved.

Keywords: clustering; microarray data; K-means algorithm; data mining; validity index; bio-informatics

Using a grid computing-based meta-evolutionary mining approach for the microarray data cancer-categorization

ENGINEERING COMPUTATIONS, 2017, vol. 34, no. 1, pp. 134-144

Authors: Chiang, Tai-Wei; Chen, Ta-Cheng. Affiliations: Natl Formosa Univ Dept Informat Management Yunlin Taiwan; Asia Univ Dept M Commerce & Multimedia Applicat Taichung Taiwan

Purpose - Building a categorization response model from gene expression patterns has become one of the most promising uses of microarray technology. In this study, the aim is to propose a grid computing-based meta-evolutionary mining approach as a categorization response model for gene selection and cancer classification. Design/methodology/approach - The proposed approach is based on the grid computing infrastructure for establishing the best attribute set selected from large microarray data. The novel discriminant analysis is based on the vector distant of median method as the evaluation function of the meta-evolutionary mining approach. In this study, the proposed approach focuses on finding the best attribute set for constructing a categorization response model with the highest categorization accuracy. Findings - Several benchmark cancer microarray data sets were used to evaluate the proposed approach, and the results are compared with those of other approaches in the literature. Experimental results from four benchmark problems indicate that the proposed approach works effectively and efficiently, and that its results are superior to or as good as those of other existing methods in the literature. Originality/value - The novel discriminant analysis is based on the vector distant of median method as the evaluation function of the meta-evolutionary mining approach to discover the best feature subset automatically from the microarray tumor database. In this study, the proposed approach focuses on finding the best attribute set for constructing a categorization response model with the highest categorization accuracy.

Keywords: Cancer categorization; Grid computing; Meta-evolutionary approach; microarray data

A global learning with local preservation method for microarray data imputation

COMPUTERS IN BIOLOGY AND MEDICINE, 2016, vol. 77, no. 0, pp. 76-89

Authors: Chen, Ye; Wang, Aiguo; Ding, Huitong; Que, Xia; Li, Yabo; An, Ning; Jiang, Lili. Affiliations: Hefei Univ Technol Sch Comp & Informat Hefei 230009 Peoples R China; Hefei Univ Technol Sch Software Hefei 230009 Peoples R China; Lanzhou Univ Coll Life Sci Lanzhou 730000 Peoples R China; Umea Univ Dept Comp Sci S-90187 Umea Sweden

Microarray data suffer from missing values for various reasons, including insufficient resolution, image noise, and experimental errors. Because missing values can hinder downstream analysis steps that require complete data as input, it is crucial to be able to estimate the missing values. In this study, we propose a Global Learning with Local Preservation method (GL2P) for imputation of missing values in microarray data. GL2P consists of two components: a local similarity measurement module and a global weighted imputation module. The former uses a local structure preservation scheme to exploit as much information as possible from the observable data, and the latter is responsible for estimating the missing values of a target gene by considering all of its neighbors rather than a subset of them. Furthermore, GL2P imputes the missing values in ascending order according to the rate of missing data for each target gene to fully utilize previously estimated values. To validate the proposed method, we conducted extensive experiments on six benchmarked microarray datasets. We compared GL2P with eight state-of-the-art imputation methods in terms of four performance metrics. The experimental results indicate that GL2P outperforms its competitors in terms of imputation accuracy and better preserves the structure of differentially expressed genes. In addition, GL2P is less sensitive to the number of neighbors than other local learning-based imputation methods. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Missing value imputation; microarray data; Global learning; Local preservation; Regression model
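
Two ideas from the abstract above lend themselves to a small illustration: estimating a missing value from all other genes with similarity-based weights, and processing genes in ascending order of missing rate so that earlier estimates can be reused. The sketch below is a simplified stand-in and not GL2P itself. The inverse-distance weighting, the function name `impute_missing`, and the `eps` constant are assumptions chosen for brevity.

```python
from math import sqrt


def impute_missing(matrix, eps=1e-6):
    """Fill None entries with a similarity-weighted average over all other genes.

    matrix: list of rows (genes), each a list of floats or None.
    eps: hypothetical constant that keeps weights finite when two genes
         agree exactly on their shared columns.
    Returns a new matrix; the input is left untouched.
    """
    n_genes = len(matrix)
    # Genes with the fewest missing values are handled first.
    order = sorted(range(n_genes), key=lambda g: sum(v is None for v in matrix[g]))
    filled = [row[:] for row in matrix]
    for g in order:
        for j, value in enumerate(matrix[g]):
            if value is not None:
                continue
            num = den = 0.0
            for h in range(n_genes):
                if h == g or filled[h][j] is None:
                    continue
                # Compare the two genes on columns where both have a value,
                # including values estimated earlier in this run.
                shared = [(a, b) for a, b in zip(filled[g], filled[h])
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                weight = 1.0 / (dist + eps)
                num += weight * filled[h][j]
                den += weight
            if den > 0:
                filled[g][j] = num / den
    return filled


example = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],   # closely matches the first gene on observed columns
    [10.0, 20.0, 30.0],
]
completed = impute_missing(example)
```

In this toy input the second gene matches the first exactly on the observed columns, so the estimate for its missing entry is dominated by the first gene's value.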

A novel aggregate gene selection method for microarray data classification

PATTERN RECOGNITION LETTERS, 2015, vol. 60-61, pp. 16-23

Authors: Thanh Nguyen; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid. Affiliation: Deakin Univ Ctr Intelligent Syst Res Geelong Vic 3216 Australia

This paper introduces a novel method for gene selection based on a modification of analytic hierarchy process (AHP). The modified AHP (MAHP) is able to deal with quantitative factors that are statistics of five individual gene ranking methods: two-sample t-test, entropy test, receiver operating characteristic curve, Wilcoxon test, and signal to noise ratio. The most prominent discriminant genes serve as inputs to a range of classifiers including linear discriminant analysis, k-nearest neighbors, probabilistic neural network, support vector machine, and multilayer perceptron. Gene subsets selected by MAHP are compared with those of four competing approaches: information gain, symmetrical uncertainty, Bhattacharyya distance and ReliefF. Four benchmark microarray datasets (diffuse large B-cell lymphoma, leukemia cancer, prostate and colon) are utilized for experiments. As the number of samples in microarray datasets is limited, the leave-one-out cross-validation strategy is applied rather than the traditional cross validation. Experimental results demonstrate the significant dominance of the proposed MAHP against the competing methods in terms of both accuracy and stability. With the benefit of inexpensive computational cost, MAHP is useful for cancer diagnosis using DNA gene expression profiles in real clinical practice. (C) 2015 Elsevier B.V. All rights reserved.

Keywords: Gene selection; Analytic hierarchy process; Classification; Gene expression profiles; microarray data
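
The abstract above applies leave-one-out cross-validation because microarray datasets have few samples. The sketch below shows the general procedure: each sample is held out once, and the classifier is trained on the rest. A 1-nearest-neighbor rule stands in for the classifiers named in the abstract, and the function names and toy data are assumptions for illustration only.

```python
def nearest_neighbor_predict(train_x, train_y, sample):
    """Return the label of the training sample closest to `sample`
    (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], sample)),
    )
    return train_y[best]


def loocv_accuracy(samples, labels):
    """Hold out each sample in turn, train on the remainder, and return
    the fraction of held-out samples that were classified correctly."""
    correct = 0
    for i in range(len(samples)):
        # Training set is everything except sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy expression profiles: two well-separated classes, three samples each.
profiles = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
classes = ["a", "a", "a", "b", "b", "b"]
accuracy = loocv_accuracy(profiles, classes)
```

With n samples this trains n models, which is affordable at the sample sizes typical of microarray studies. If gene selection is part of the pipeline, it would need to be repeated inside each fold to avoid leaking information from the held-out sample.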
