Search Results - Inner Mongolia University Library

IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData)

Authors: Mohammed, Nwayyin Najat; AbdulAzeez, Adnan Mohsin

Affiliations: Univ Zakho, Dept Comp Sci, Zakho, Iraq; Duhok Poly Tech Univ, Duhok, Iraq

ISBN: (print) 9781538630662

Data mining is a process that discovers patterns and retrieves knowledge from large datasets. Many learning and data mining algorithms rely on distance metrics. Cluster analysis is one of the learning algorithms that has been applied to biological data, for example microarray expression data. In this study, we assessed the validity of five distance metrics (Euclidean, Manhattan, Minkowski, Cosine, and Mahalanobis) with the partitioning around medoids (PAM) algorithm on microarray datasets. Microarray datasets were pre-processed prior to analysis, and the algorithm was evaluated using Dunn's validity index. Our results showed that when the selected microarray datasets were clustered with partitioning around medoids based on the Manhattan, Minkowski, Cosine, and Euclidean distances for different k partitions, all of these distances exhibited unsatisfactory performance; however, the partitioning around medoids algorithm generated an optimal cluster solution when used with the Mahalanobis distance.

Keywords: Microarray data; Partitioning around medoids; Distances; Validity index; Number of clusters
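As a hedged illustration of the evaluation step described in the abstract above, the following Python sketch computes Dunn's validity index for a fixed partition under two of the named distance metrics. The function names and the toy points are assumptions made for illustration; they are not taken from the paper, and the PAM clustering step itself is omitted.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dunn_index(clusters, dist):
    """Dunn index: smallest between-cluster distance divided by the
    largest within-cluster diameter. Higher values suggest compact,
    well-separated clusters.

    Assumes at least two clusters and at least one cluster with two
    or more distinct points (otherwise the diameter is undefined or zero).
    """
    min_between = min(
        dist(p, q)
        for c1, c2 in combinations(clusters, 2)
        for p in c1
        for q in c2
    )
    max_within = max(
        dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_between / max_within


# Hypothetical toy partition: two clusters of two 2-D points each.
toy_clusters = [[(0, 0), (1, 1)], [(4, 4), (5, 5)]]
for name, metric in (("euclidean", euclidean), ("manhattan", manhattan)):
    print(name, dunn_index(toy_clusters, metric))
```

Passing the distance as a function keeps the index independent of the metric, so the same partition can be scored under each metric being compared.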

Feature selection model based on clustering and ranking in pipeline for microarray data

Informatics in Medicine Unlocked, 2017, Vol. 9, pp. 107-122

Authors: Sahu, Barnali; Dehuri, Satchidananda; Jagadev, Alok Kumar

Affiliations: Department of Computer Science and Engineering, Siksha 'O' Anusandhan University, Bhubaneswar 751030, Odisha, India; Department of Information and Communication Technology, Fakir Mohan University, Vyasa Vihar, Balasore 756019, Odisha, India; School of Computer Engineering, KIIT University, Bhubaneswar 751024, Odisha, India

Most of the available feature selection techniques in the literature are classifier bound. That is, a group of features is tied to the performance of a specific classifier, as in the wrapper and hybrid approaches. Our objective in this study is to select a set of generic features not tied to any classifier, based on the proposed framework. This framework uses attribute clustering and feature ranking techniques in a pipeline in order to remove redundant features. On each uncovered cluster, signal-to-noise ratio, t-statistics, and significance analysis of microarrays are independently applied to select the top-ranked features. Both filter and evolutionary wrapper approaches have been considered for feature selection, and the data set with the selected features is given to an ensemble of predefined, statistically different classifiers. The class labels of the test data are determined using the majority voting technique. Moreover, with the aforesaid objectives, this paper focuses on obtaining a stable result out of various classification models. Further, a comparative analysis has been performed to study the classification accuracy and computational time of the current approach and evolutionary wrapper techniques. The approach gives better insight into the features and further enhances the classification accuracy with less computational time. © 2017

Keywords: Classification; Clustering; Feature selection; Filter; Microarray data; Wrapper
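Two of the building blocks named in the abstract above, signal-to-noise ranking and majority voting, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the signal-to-noise ratio is taken in its common two-class form (difference of class means over the sum of class standard deviations), labels are 0/1, and all function names and toy values are hypothetical rather than the authors' implementation. The attribute clustering step is not shown.

```python
from collections import Counter
from statistics import mean, stdev


def signal_to_noise(values, labels):
    """Two-class signal-to-noise ratio of one feature.

    Assumes labels are 0/1, each class has at least two samples,
    and the class standard deviations are not both zero.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return (mean(pos) - mean(neg)) / (stdev(pos) + stdev(neg))


def rank_features(X, y):
    """Return feature indices sorted by decreasing |SNR|."""
    n_features = len(X[0])
    scores = [
        abs(signal_to_noise([row[j] for row in X], y))
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def majority_vote(predictions):
    """Label predicted by the most classifiers (ties: first seen wins)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: 4 samples x 2 features; feature 0 separates the classes.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 5.2], [3.2, 0.8]]
y = [0, 0, 1, 1]
print(rank_features(X, y))
print(majority_vote(["tumor", "normal", "tumor"]))
```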

A Meta-Review of Feature Selection Techniques in the Context of Microarray Data

5th International Work-Conference on Bioinformatics and Biomedical Engineering (IWBBIO)

Authors: Mungloo-Dilmohamud, Zahra; Jaufeerally-Fakim, Yasmina; Pena-Reyes, Carlos

Affiliations: Univ Mauritius, Reduit, Mauritius; Univ Appl Sci Western Switzerland HES SO, Sch Business & Engn Vaud HEIG VD, Computat Intelligence Computat Biol Grp SIBCI4CB, Yverdon, Switzerland

ISBN: (print) 9783319561486; 9783319561479

Microarray technologies produce very large amounts of data that need to be classified for interpretation. Large data coupled with small sample sizes make it challenging for researchers to get useful information, and therefore a lot of effort goes into the design and testing of feature selection tools; the literature abounds with descriptions of numerous methods. In this paper we select five representative review papers in the field of feature selection for microarray data in order to understand their underlying classification of methods. Finally, on this basis, we propose an extended taxonomy for categorizing feature selection techniques and use it to classify the main methods presented in the selected reviews.

Keywords: Feature selection; Microarray data; Machine learning; Statistical methods

Feature Selection Software Development Using Artificial Bee Colony on DNA Microarray Data

International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC)

Authors: Andaru, Wildan; Syarif, Iwan; Barakbah, Ali Ridho

Affiliation: Elect Engn Polytech Inst Surabaya, Informat & Comp Engn, Surabaya, Indonesia

ISBN: (print) 9781538607169

DNA microarray data are high-dimensional data that enable researchers to analyze the expression of many genes in a single reaction quickly and efficiently. Characteristics such as small sample size, class imbalance, and data complexity make the data difficult to classify. Feature selection is a process that automatically selects the features in a dataset that are most relevant to predictive modeling. This research aims at investigating, implementing, and analyzing a feature selection method using the Artificial Bee Colony (ABC) approach. The result is compared with other evolutionary algorithms, namely the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). Feature selection using ABC gives better classification results with k-Nearest Neighbor (k-NN) and Decision Tree (DT), but selects a slightly higher fraction of features than the GA and PSO algorithms.

Keywords: Feature Selection; Artificial Bee Colony; Microarray Data; Data Mining

Identifying Three-Way Gene Interactions from Microarray Data Using Kolmogorov-Smirnov and Cross-Match Tests

Author: Khadka, Shubhashree

Affiliation: University of Arkansas

Degree: M.S.

The human gene network is much more complex than just pairwise interactions among genes. Zhang et al. [6] extracted microarray data from the International Genomics Consortium (IGC) and presented the detection of three-way gene interactions in their paper using Fisher's z-transformation test. Three-way gene interactions come closer than pairwise correlations to representing the complex gene structures. Additionally, they are more tractable than interactions among four or more genes. In this paper, we simulate different models where Fisher's test might not be as effective. Zhang et al.'s approach utilized Pearson's correlation coefficients and involved the detection of linear interactions only. Since gene interactions could show any kind of behavior, their evaluation approach might not work most of the time. Therefore, we utilize the dataset that Zhang et al. provided in order to detect three-way gene interactions using non-parametric tests such as Kolmogorov-Smirnov and Cross-Match.

Keywords: Cross-Match Test; Gene Interactions; Kolmogorov-Smirnov Test; Microarray Data
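The Kolmogorov-Smirnov test named above compares two samples through the largest gap between their empirical distribution functions. A minimal Python sketch of that statistic follows. How the thesis applies the test to gene triplets is not described in the abstract, so the function name and toy samples here are illustrative assumptions only, and no p-value is computed.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    Assumes both samples are non-empty.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of observations less than or equal to x.
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


# Hypothetical toy samples.
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

Because the statistic depends only on the ordering of the observations, it makes no assumption of linearity, which is the property the abstract contrasts with the Pearson-based approach.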

A global learning with local preservation method for microarray data imputation

COMPUTERS IN BIOLOGY AND MEDICINE, 2016, Vol. 77, pp. 76-89

Authors: Chen, Ye; Wang, Aiguo; Ding, Huitong; Que, Xia; Li, Yabo; An, Ning; Jiang, Lili

Affiliations: Hefei Univ Technol, Sch Comp & Informat, Hefei 230009, Peoples R China; Hefei Univ Technol, Sch Software, Hefei 230009, Peoples R China; Lanzhou Univ, Coll Life Sci, Lanzhou 730000, Peoples R China; Umea Univ, Dept Comp Sci, S-90187 Umea, Sweden

Microarray data suffer from missing values for various reasons, including insufficient resolution, image noise, and experimental errors. Because missing values can hinder downstream analysis steps that require complete data as input, it is crucial to be able to estimate the missing values. In this study, we propose a Global Learning with Local Preservation method (GL2P) for imputation of missing values in microarray data. GL2P consists of two components: a local similarity measurement module and a global weighted imputation module. The former uses a local structure preservation scheme to exploit as much information as possible from the observable data, and the latter is responsible for estimating the missing values of a target gene by considering all of its neighbors rather than a subset of them. Furthermore, GL2P imputes the missing values in ascending order according to the rate of missing data for each target gene to fully utilize previously estimated values. To validate the proposed method, we conducted extensive experiments on six benchmark microarray datasets. We compared GL2P with eight state-of-the-art imputation methods in terms of four performance metrics. The experimental results indicate that GL2P outperforms its competitors in terms of imputation accuracy and better preserves the structure of differentially expressed genes. In addition, GL2P is less sensitive to the number of neighbors than other local learning-based imputation methods. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Missing value imputation; Microarray data; Global learning; Local preservation; Regression model
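Two ideas from the abstract above lend themselves to a small illustration: processing genes in ascending order of missing rate, and estimating each missing value from all other genes with weights rather than from a fixed subset of neighbors. The Python sketch below is a simplified stand-in, not GL2P itself: the inverse-distance weighting, the function name, and the toy matrix are assumptions, and the paper's local structure preservation scheme and regression model are not reproduced.

```python
from math import sqrt


def impute_missing(matrix):
    """Fill None entries in a genes-by-samples matrix.

    Genes with fewer missing values are processed first, so their
    estimates can be reused when later genes are imputed. Each missing
    value is a weighted average over ALL other genes that have a value
    in that column, weighted by inverse root-mean-square distance on
    the columns both genes share (an assumed weighting scheme).
    """
    data = [row[:] for row in matrix]  # leave the input untouched
    order = sorted(
        range(len(data)),
        key=lambda i: sum(v is None for v in data[i]),
    )
    for i in order:
        for j in range(len(data[i])):
            if data[i][j] is not None:
                continue
            num = den = 0.0
            for k, other in enumerate(data):
                if k == i or other[j] is None:
                    continue
                shared = [
                    (a, b)
                    for a, b in zip(data[i], other)
                    if a is not None and b is not None
                ]
                if not shared:
                    continue
                d = sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                w = 1.0 / (d + 1e-9)  # small constant avoids division by zero
                num += w * other[j]
                den += w
            if den > 0:
                data[i][j] = num / den
    return data


# Hypothetical toy matrix: gene 1 is missing its third sample.
toy = [[1.0, 2.0, 3.0], [1.0, 2.0, None], [10.0, 20.0, 30.0]]
filled = impute_missing(toy)
print(filled[1][2])
```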

Pipelining the ranking techniques for microarray data classification: A case study

APPLIED SOFT COMPUTING, 2016, Vol. 48, pp. 298-316

Authors: Dash, Rasmita; Misra, Bijan Bihari

Affiliations: Siksha O Anusandhan Univ, Inst Tech Educ & Res, Dept Comp Sc & Informat Technol, Bhubaneswar 751030, Odisha, India; Silicon Inst Technol, Dept Comp Sc & Engn, Bhubaneswar 751024, Odisha, India

Identification of relevant genes from microarray data is an apparent need in many applications. For such identification, different ranking techniques with different evaluation criteria are used, which usually assign different ranks to the same gene. As a result, different techniques identify different gene subsets, which may not be the set of significant genes. To overcome such problems, this study suggests pipelining the ranking techniques. In each stage of the pipeline, a few of the lower-ranked features are eliminated, and at the end a relatively good subset of features is preserved. However, the order in which the ranking techniques are used in the pipeline is important to ensure that the significant genes are preserved in the final subset. For this experimental study, twenty-four unique pipeline models are generated out of four gene ranking strategies. These pipelines are tested with seven different microarray databases to find the suitable pipeline for this task. Further, the gene subset obtained is tested with four classifiers, and four performance metrics are evaluated. No single pipeline dominates the other pipelines in performance; therefore a grading system is applied to the results of these pipelines to find a consistent model. The grading system's finding that one pipeline model is significant is also confirmed by the Nemenyi post-hoc hypothesis test. The performance of this pipeline model is compared with the four ranking techniques; although its performance is not always superior, it yields better results most of the time and can be suggested as a consistent model. However, it requires more computational time than single ranking techniques. (C) 2016 Elsevier B.V. All rights reserved.

Keywords: Microarray data; Feature selection; Feature ranking technique; Classification; Statistical test
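The pipelining idea in the abstract above can be sketched as follows: each stage ranks the surviving genes with one technique and drops the lowest-ranked ones, and the order of the stages can change the final subset. This is a minimal Python sketch; the scoring dictionaries stand in for real ranking techniques and, like the function and gene names, are hypothetical. Four ranking strategies give 4! = 24 orderings, which matches the twenty-four pipeline models mentioned in the abstract.

```python
from itertools import permutations


def run_pipeline(features, rankers, drop_per_stage):
    """Apply ranking stages in order, dropping the lowest-ranked
    features after each stage. `rankers` is a sequence of functions
    mapping a feature name to a score (higher is better).
    At least one feature is always kept.
    """
    remaining = list(features)
    for score in rankers:
        remaining.sort(key=score, reverse=True)
        keep = max(1, len(remaining) - drop_per_stage)
        remaining = remaining[:keep]
    return remaining


# Hypothetical scores from two ranking techniques for four genes.
scores_a = {"g1": 0.9, "g2": 0.1, "g3": 0.5, "g4": 0.7}
scores_b = {"g1": 0.2, "g2": 0.1, "g3": 0.9, "g4": 0.4}
genes = ["g1", "g2", "g3", "g4"]

a_then_b = run_pipeline(genes, [scores_a.get, scores_b.get], 1)
b_then_a = run_pipeline(genes, [scores_b.get, scores_a.get], 1)
print(a_then_b, b_then_a)  # the two orders keep different genes

# Every ordering of four ranking strategies is one candidate pipeline.
all_orders = list(permutations(["r1", "r2", "r3", "r4"]))
print(len(all_orders))
```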

A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization

GENOMICS, 2016, Vol. 107, No. 6, pp. 231-238

Authors: Sharbaf, Fatemeh Vafaee; Mosafer, Sara; Moattar, Mohammad Hossein

Affiliations: Imam Reza Int Univ, Dept Comp Engn, Mashhad, Iran; Islamic Azad Univ, Dept Software Engn, Mashhad Branch, Mashhad, Iran

This paper proposes an approach for gene selection in microarray data. The proposed approach consists of a primary filter approach using the Fisher criterion, which reduces the initial set of genes and hence the search space and time complexity. Then, a wrapper approach based on cellular learning automata (CLA) optimized with the ant colony method (ACO) is used to find the set of features that improves the classification accuracy. CLA is applied because of its capability to learn and model complicated relationships. The features selected in the last phase are evaluated using the ROC curve, and the most effective yet smallest feature subset is determined. The classifiers evaluated in the proposed framework are K-nearest neighbor, support vector machine, and naive Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy. (C) 2016 Elsevier Inc. All rights reserved.

Keywords: Gene selection; Microarray data; Cellular learning automata; Ant colony optimization; K-nearest neighbor; Naive Bayes

A centroid-based gene selection method for microarray data classification

JOURNAL OF THEORETICAL BIOLOGY, 2016, Vol. 400, pp. 32-41

Authors: Guo, Shun; Guo, Donghui; Chen, Lifei; Jiang, Qingshan

Affiliations: Xiamen Univ, Dept Elect Engn, Xiamen 361005, Fujian, Peoples R China; Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518000, Peoples R China; Fujian Normal Univ, Sch Math & Comp Sci, Fuzhou 350117, Fujian, Peoples R China

For classification problems based on microarray data, the data typically contain a large number of irrelevant and redundant features. In this paper, a new gene selection method is proposed to choose the best subset of features for microarray data, with the irrelevant and redundant features removed. We formulate the selection problem as an L1-regularized optimization problem, based on a newly defined linear discriminant analysis criterion. Instead of calculating the mean of the samples, a kernel-based approach is used to estimate the class centroid, which defines both the between-class separability and the within-class compactness for the criterion. Theoretical analysis indicates that the global optimal solution of the L1-regularized criterion can be reached under a general condition, based on which an efficient algorithm is derived for the feature selection problem, with time complexity linear in the number of features and the number of samples. The experimental results on ten publicly available microarray datasets demonstrate that the proposed method performs effectively and competitively compared with state-of-the-art methods. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Class centroid; Microarray data; Classification; L1 regularization; Gene selection

Unsupervised Machine Learning Approach for Gene Expression Microarray Data Using Soft Computing Technique

3rd International Conference on Advanced Computing, Networking, and Informatics (ICACNI)

Authors: Rana, Madhurima; Vijayeeta, Prachi; Kar, Utsav; Das, Madhabananda; Mishra, B. S. P.

Affiliation: KIIT Univ, Bhubaneswar, Orissa, India

ISBN: (print) 9788132225386; 9788132225379

Machine learning is a burgeoning technology used for the extraction of knowledge from an ocean of data. It is closely bound to optimization and artificial intelligence, which deliver theory, methodologies, and application domains to the fields of statistics and computer science. Machine learning tasks are broadly classified into two groups, namely supervised learning and unsupervised learning. The analysis of unsupervised data requires thorough computation using different clustering algorithms. Microarray gene expression data are considered for clustering regulating genes apart from non-regulating genes. In our work, an optimization technique (Cat Swarm Optimization) is used to minimize the number of clusters by evaluating the Euclidean distance among the centroids. A comparative study is carried out by clustering the regulating genes before and after optimization. In our work, principal component analysis (PCA) is incorporated for dimensionality reduction of the vast dataset to ensure qualitative cluster analysis.

Keywords: Gene expression; Microarray data; Principal component analysis (PCA); Hierarchical clustering (HC); Cat swarm optimization (CSO)
