An intrusion detection system (IDS) consists of a set of techniques and methods for collecting packets from a host system or network and analyzing those packets for anomalous content. IDSs mainly fall into two categories: ...
ISBN: 9781509032082 (print)
Practitioners in bioinformatics have used bagging ensemble techniques effectively to alleviate class imbalance and to improve the performance of classification models. However, many previous works have used bagging with only a single, arbitrarily chosen number of iterations. In this study, we ask how altering the number of iterations (ensemble members) affects the classification performance of bagging classifiers. To answer this question, we conducted an empirical study using four choices for the number of iterations (10, 20, 50, and 100) within the bagging algorithm, across 15 imbalanced bioinformatics datasets. Our results indicate that 50 iterations performs slightly better than all the other choices, without exception, but the difference in performance is not statistically significant. We therefore recommend bagging with 10 iterations because it achieves quality classification results, additional iterations do not significantly improve performance, and a smaller number of iterations is computationally less costly. The unique contribution of this work is its examination of how the number of iterations affects the classification performance of bagging classifiers on imbalanced datasets in bioinformatics.
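The kind of comparison described in this abstract can be sketched in a few lines of Python. This is only an illustration, not the study's code: the synthetic one-dimensional dataset, the decision-stump base learner, and the balanced-accuracy metric are all assumptions chosen to keep the example self-contained, since the abstract does not name the base learners or the metrics used. Only the iteration counts (10, 20, 50, 100) come from the text.

```python
import random


def make_imbalanced(n, minority_frac, seed):
    """Synthetic 1-D data (an assumption): minority class 1 is shifted right."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = 1 if rng.random() < minority_frac else 0
        x = rng.gauss(2.0 if label else 0.0, 1.0)
        data.append((x, label))
    return data


def fit_stump(sample):
    """Return the threshold t that minimises errors of 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(x for x, _ in sample):
        err = sum((x > t) != bool(y) for x, y in sample)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging_fit(train, n_iterations, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_iterations):
        boot = [rng.choice(train) for _ in train]
        models.append(fit_stump(boot))
    return models


def bagging_predict(models, x):
    """Majority vote over all stumps; ties go to the majority class 0."""
    votes_for_minority = sum(x > t for t in models)
    return int(2 * votes_for_minority > len(models))


def balanced_accuracy(models, test):
    """Mean of per-class recall, a common choice for imbalanced data."""
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for x, y in test:
        totals[y] += 1
        hits[y] += bagging_predict(models, x) == y
    return sum(hits[c] / totals[c] for c in (0, 1)) / 2


train = make_imbalanced(n=200, minority_frac=0.1, seed=1)
test = make_imbalanced(n=200, minority_frac=0.1, seed=2)

# Iteration counts taken from the abstract; everything else is illustrative.
results = {}
for n_iter in (10, 20, 50, 100):
    models = bagging_fit(train, n_iter)
    results[n_iter] = balanced_accuracy(models, test)

for n_iter, score in results.items():
    print(f"{n_iter:>3} iterations: balanced accuracy = {score:.3f}")
```

Training cost grows linearly with the number of iterations, which is the computational argument the abstract makes for preferring a small ensemble when the scores are statistically indistinguishable. A real replication would repeat the comparison over several datasets and resampling runs and apply a significance test, none of which this sketch attempts.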
Domain adaptation methods are designed to extract shared domain-invariant features by projecting data onto a common subspace in order to align the domain distributions. However, these methods do not usually consider d...
ISBN: 9781665495660 (print)
With the development of science and technology, many disciplines show a general trend toward crossing and blending, and new interdisciplinary subjects are emerging. For decision makers in colleges and universities, choosing and constructing a proper interdisciplinary major is both crucial and difficult. In this paper, we simulate the phenomenon of current conduction to improve natural language processing and data mining. We design an assistant decision-making method for setting up interdisciplinary majors in colleges and universities, which can not only recommend the most suitable interdisciplinary majors intelligently and automatically, but also simplify and correct the work of decision makers. Our experimental analysis demonstrates that the interdisciplinary majors recommended by the proposed method have a similarity ratio of more than 70 percent with the manual recommendations of specialists.
Support vector machines (SVMs) are an effective form of binary classification algorithm. To enhance the use of text structural features for information extraction, which the hidden Markov model (HMM) greatly restricts, this paper proposes an SVM multi-class classification method based on Markov properties to extract information from a citation database. The proposed model extracts symbol characteristics as features and composes a binary tree of the transition probabilities. Experiments show that the proposed method outperforms HMM and basic SVM methods.