Pandemic influenza is a major concern worldwide. The availability of advanced technologies and of nucleotide sequences from a large number of pandemic and non-pandemic influenza viruses in 2009 provides a great opportunity to investigate the underlying rules of pandemic induction through data mining tools. Here, for the first time, an integrated classification and association rule mining algorithm (CBA) was used to discover the rules underpinning the alteration of non-pandemic sequences into pandemic ones. We hypothesized that the extracted rules can lead to the development of an efficient expert system for predicting influenza pandemics. To this end, we used a large dataset containing 5373 HA (hemagglutinin) segments of 2009 H1N1 pandemic and non-pandemic influenza sequences. The analysis was carried out for both nucleotide and protein sequences. We found a number of new rules that potentially represent undiscovered antigenic sites in the influenza structure. At the nucleotide level, alteration of thymine (T) at position 260 was the key feature discriminating non-pandemic from pandemic sequences. At the protein level, rules including I233K and M334L were the differentiating features. CBA classifies pandemic and non-pandemic sequences with high accuracy at both the nucleotide and protein levels. Identifying hotspots in influenza sequences is significant because they represent regions with low antibody reactivity. We argue that the virus escapes the host immune response by mutating at these spots. Based on the discovered rules, we developed the software "Prediction of Pandemic Influenza" for discriminating pandemic from non-pandemic sequences. This study opens a new vista in the discovery of association rules between mutation points during the evolution of pandemic influenza. (C) 2015 Elsevier Inc. All rights reserved.
ISBN: (print) 9781424427239
As part of research on the KDTICM [8] theory, this paper proposes an improved CBA algorithm, based on the KDD* model and combined with the KAAPRO method, for the protein secondary structure prediction problem. Further, a multi-layer systematic prediction model, the Compound Pyramid Model, is proposed. The kernel of this model is CBA, a classic association rule analysis algorithm. Domain knowledge is used throughout the model, and the physicochemical attributes are chosen by Causal Cellular Automation. In experiments, proteins biased toward alpha/beta structure are predicted precisely. Amino acids whose structures are ambiguous are also predicted well by the improved CBA. Overall, the results of this model are satisfactory.
ISBN: (print) 9783319093390; 9783319093383
In the last decade, autonomous intelligent agents (or multi-intelligent agents) and knowledge discovery in databases have been combined to produce a new research area in intelligent information technology. In this paper, we aim to produce a knowledge discovery approach that extracts a set of rules from a dataset for automatic knowledge base construction, using a cooperative approach between a multi-intelligent agent system and a domain expert in a particular domain. The proposed system consists of several intelligent agents, each with a specific task. The main task is assigned to an associative classification mining intelligent agent, which works on the database directly to extract rules using the Classification Based on Associations (CBA) rule generation and classification algorithm and sends them to a domain expert for modification. The modified rules are then saved in a knowledge base that is used later by other systems (e.g. a knowledge-based system). In other words, the aim of this work is to introduce a tool for extracting knowledge from a database; more precisely, the work focuses on automatically producing a knowledge base that uses a rule-based approach to knowledge representation. MIAKDD is developed and implemented in the Visual Prolog programming language, version 7.1, and the approach is tested on a UCI heart disease dataset.
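To make the CBA step in the abstract above more concrete, the following is a minimal Python sketch of the general idea: mine class association rules (itemset -> class) that pass support and confidence thresholds, rank them, and classify a record with the first matching rule. It is an illustration under assumptions, not the paper's implementation: the attribute names, thresholds, and the functions `mine_rules` and `classify` are hypothetical, rule mining is done by brute force instead of Apriori-style candidate generation, and the database-coverage pruning of the full CBA algorithm is omitted.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Enumerate class association rules (itemset -> class) by brute force."""
    n = len(rows)
    itemset_counts = Counter()  # how often each itemset occurs at all
    rule_counts = Counter()     # how often each itemset occurs with each class
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))
    # Rule precedence: higher confidence, then higher support, then shorter rule.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default):
    """Return the class of the first (highest-precedence) rule that matches."""
    for combo, label, _support, _confidence in rules:
        if set(combo) <= items:
            return label
    return default


# Toy records with made-up attributes (hypothetical, not the UCI data).
rows = [
    {"chest_pain=yes", "age=old"},
    {"chest_pain=yes", "age=young"},
    {"chest_pain=no", "age=old"},
    {"chest_pain=no", "age=young"},
    {"chest_pain=yes", "age=old"},
    {"chest_pain=no", "age=young"},
]
labels = ["sick", "sick", "healthy", "healthy", "sick", "healthy"]

rules = mine_rules(rows, labels)
prediction = classify(rules, {"chest_pain=yes", "age=young"}, default="healthy")
```

In a workflow like the one described, the ranked `rules` list is the artifact a domain expert could review and edit before it is stored in a knowledge base.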
Educational data mining is a growing field that uses data obtained from educational information systems to discover knowledge and find answers to questions and problems concerning the education system. High dropout rates and poor academic performance among students are among the most common issues that affect the reputation of an educational institution. Students' academic records can be analyzed to explore the factors behind these phenomena. This paper discusses the building of a model to predict the performance of students in a programming course based on their grades in courses in other subjects. A classification algorithm based on association rules is used to build a classifier that helps evaluate students' performance in the programming course. This model aims to reduce dropout levels by helping students predict their likelihood of success in a course before they enroll in it. In addition, course instructors will be able to enhance student performance in the course by better estimating students' abilities to learn the subject matter and adjusting their teaching strategies and methods accordingly. (C) 2016 The Authors. Published by Elsevier B.V.
Classification and association rule mining are two substantial areas in data mining. Some researchers have attempted to integrate the two fields, producing what are called rule-based classifiers. Rule-based classifiers can play a very important role in applications such as fraud detection and medical diagnosis. Numerous previous studies have shown that this type of classifier achieves higher classification accuracy than traditional classification algorithms. However, these classifiers still suffer from a fundamental limitation. Many of them use various greedy techniques to prune redundant rules, which can cause some important rules to be missed. Another challenge is the enormous set of mined rules, which results in high processing overhead. As a consequence, the final selected rules may not be the globally best rules: these algorithms do not exploit the search space effectively enough to select the best subset of candidate rules. We merged the Apriori algorithm, Harmony Search, and the classification-based association rules (CBA) algorithm to build a rule-based classifier. We applied a modified version of the Apriori algorithm with multiple minimum supports to extract useful rules for each class in the dataset. Instead of using a large number of candidate rules, binary Harmony Search was used to select the best subset of rules for building a classification model. We applied the proposed method to seventeen benchmark datasets and compared its results with those of traditional association rule classification algorithms. The statistical results show that the proposed method outperformed other rule-based approaches.
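The rule-subset selection step described in the abstract above can be illustrated with a small Python sketch of one simple binary Harmony Search variant. Everything here is an assumption made for illustration: the abstract does not give the fitness function, the parameter values, or the exact binary variant, so the sketch uses training accuracy of a first-match classifier as fitness and a bit flip as the pitch adjustment. The names `fitness`, `binary_harmony_search`, `candidate_rules`, and `training_data` are hypothetical.

```python
import random


def fitness(mask, rules, data, default):
    """Training accuracy of a first-match classifier using rules whose bit is 1."""
    chosen = [rule for rule, bit in zip(rules, mask) if bit]
    correct = 0
    for items, label in data:
        predicted = default
        for antecedent, rule_label in chosen:
            if antecedent <= items:  # rule fires if its antecedent is a subset
                predicted = rule_label
                break
        correct += predicted == label
    return correct / len(data)


def binary_harmony_search(rules, data, default, hms=5, hmcr=0.9, par=0.3,
                          iters=200, seed=0):
    """Search for a bit mask over the candidate rules with high fitness."""
    rng = random.Random(seed)
    n = len(rules)
    # Harmony memory: hms random bit masks and their fitness scores.
    memory = [[rng.randint(0, 1) for _ in range(n)] for _ in range(hms)]
    scores = [fitness(m, rules, data, default) for m in memory]
    for _ in range(iters):
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                bit = rng.choice(memory)[j]  # memory consideration
                if rng.random() < par:
                    bit = 1 - bit            # pitch adjustment as a bit flip
            else:
                bit = rng.randint(0, 1)      # random selection
            new.append(bit)
        score = fitness(new, rules, data, default)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:            # replace the worst harmony
            memory[worst], scores[worst] = new, score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


# Toy candidate rules, assumed to be already sorted by precedence (hypothetical).
candidate_rules = [
    (frozenset({"a"}), "pos"),
    (frozenset({"b"}), "neg"),
    (frozenset({"a", "b"}), "neg"),
    (frozenset({"c"}), "pos"),
]
training_data = [
    ({"a"}, "pos"),
    ({"a", "c"}, "pos"),
    ({"b"}, "neg"),
    ({"a", "b"}, "neg"),
    ({"c"}, "pos"),
    ({"b", "c"}, "neg"),
]

best_mask, best_score = binary_harmony_search(candidate_rules, training_data,
                                              default="neg")
```

Each bit in `best_mask` says whether the corresponding candidate rule is kept. A practical fitness function would likely be evaluated on held-out data and might also penalize large rule sets; both are left out to keep the sketch short.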