In this paper we investigate using multi-objective genetic programming to evolve a feature extraction stage for multiple-class classifiers. We find mappings which transform the input space into a new, multi-dimensional decision space to increase the discrimination between all classes; the number of dimensions of this decision space is optimized as part of the evolutionary process. A simple and fast multi-class classifier is then implemented in this multi-dimensional decision space. Mapping to a single decision space has significant computational advantages compared to k-class-to-2-class decompositions; a key design requirement in this work has been the ability to incorporate changing priors and/or costs associated with mislabeling without retraining. We have employed multi-objective optimization in a Pareto framework, incorporating solution complexity as an independent objective to be minimized in addition to the main objective of the misclassification error. We thus give preference to simpler solutions, which tend to generalize well on unseen data, in accordance with Occam's Razor. We obtain classification results on a series of benchmark problems which are essentially identical to those of previous, more complex decomposition approaches. Our solutions are much simpler and more computationally attractive, and can readily incorporate changing priors/costs. In addition, we have applied our approach to the KDD-99 intrusion detection dataset and obtained results which are highly competitive with the KDD-99 Cup winner but with a significantly simpler classification framework.
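The Pareto framework described above treats misclassification error and solution complexity as two independent objectives to be minimized. The following is a minimal sketch of the non-dominated filtering step only, not the authors' implementation. The `(error, complexity)` tuple layout, the use of a node count as the complexity measure, and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical candidate encoding: (misclassification error, complexity).
# Complexity is assumed here to be a tree node count; the abstract does not say.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


population = [(0.10, 50), (0.12, 20), (0.10, 80), (0.30, 5), (0.15, 25)]
front = pareto_front(population)
```

In this example `(0.10, 80)` is dropped because `(0.10, 50)` reaches the same error with fewer nodes, and `(0.15, 25)` is dropped because `(0.12, 20)` is better on both objectives. The remaining candidates represent different trade-offs between error and simplicity.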
Prediction of protein secondary structure is an important step towards elucidating its three-dimensional structure and its function. This is a challenging problem in bioinformatics. Segmental semi-Markov models (SSMMs) are one of the best studied methods in this field. However, incorporating evolutionary information into these methods is somewhat difficult. On the other hand, systems of multiple neural networks (NNs) are powerful tools for multi-class pattern classification and can easily take this sort of information into account. To overcome the weakness of SSMMs in prediction, in this work we use an SSMM as a decision function on the outputs of three NNs that use multiple sequence alignment profiles. We consider four types of observations for the outputs of a neural network, so that the profile table related to each sequence is reduced to a sequence of four observations. The SSMM applied to the outputs of the three neural networks then predicts the secondary structure of each amino acid. The proposed SSMM has discriminative power and weights over different dependency models for the outputs of the neural networks. The results show that the accuracy of our model in predictions, particularly for strands, is considerably increased. (C) 2008 Elsevier Inc. All rights reserved.
ISBN: (print) 9788132222507; 9788132222491
Automatic speech recognition is an active research area which can exploit the pattern recognition capabilities of artificial neural networks. Several researchers have shown that the outputs of artificial neural networks trained in multi-class classification mode can be interpreted as estimates of a posteriori probabilities of output classes. These probabilities can be used by the state-of-the-art hidden Markov model for speech recognition in estimating the emission probabilities of the states of the hidden Markov model. In this paper, we explore a pairwise neural network system as an alternative approach to multi-class neural network systems to estimate the emission probabilities of the states of a hidden Markov model. Through experimental analysis it is shown that the pairwise recognition system outperforms the multi-class recognition system in terms of the recognition accuracy of spoken sentences.
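The abstract does not state how the pairwise network outputs are combined or how posteriors are turned into emission scores. As a hedged illustration, the sketch below uses two commonly cited choices, both assumptions here. The first is a closed-form coupling of pairwise estimates r[i][j] ≈ P(i | x, class is i or j) into class posteriors. The second divides the posteriors by the class priors to obtain likelihoods scaled by the unknown factor p(x), which follows from Bayes' rule. All function names are hypothetical.

```python
from typing import List


def couple_pairwise(r: List[List[float]]) -> List[float]:
    """Combine pairwise estimates into class posteriors.

    r[i][j] is assumed to estimate P(class i | x, class is i or j) and must be
    strictly positive for i != j; diagonal entries are ignored.
    Uses P(i | x) ~= 1 / (sum_{j != i} 1 / r[i][j] - (K - 2)), then renormalizes.
    """
    k = len(r)
    raw = []
    for i in range(k):
        inv_sum = sum(1.0 / r[i][j] for j in range(k) if j != i)
        raw.append(1.0 / (inv_sum - (k - 2)))
    total = sum(raw)
    return [p / total for p in raw]


def scaled_likelihoods(posteriors: List[float], priors: List[float]) -> List[float]:
    """Bayes' rule: p(x | q) / p(x) = P(q | x) / P(q) for each state q."""
    return [p / q for p, q in zip(posteriors, priors)]


# Consistency check: pairwise values built from known posteriors are recovered.
true_post = [0.5, 0.3, 0.2]
pairwise = [
    [true_post[i] / (true_post[i] + true_post[j]) if i != j else 0.0
     for j in range(3)]
    for i in range(3)
]
recovered = couple_pairwise(pairwise)
emissions = scaled_likelihoods(recovered, priors=[0.25, 0.25, 0.5])
```

When the pairwise estimates are exactly consistent with a single set of posteriors, as in the check above, the coupling formula recovers those posteriors. With real network outputs the estimates are noisy, which is why the result is renormalized.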
The multi-feature and imbalanced nature of network data has always been a challenge in the field of network intrusion detection. Redundant features can reduce the overall quality of network data and the accuracy of detection models, while class imbalance can lower the detection rate for minority classes. To improve the detection accuracy for imbalanced intrusion data, we develop a data-driven integrated detection method, which utilizes Recursive Feature Elimination (RFE) for feature selection and screens out features that are conducive to model recognition, improving the overall quality of data analysis. In this work, we also apply the Adaptive Synthetic Sampling (ADASYN) method to generate input data close to the original dataset, with the aim of eliminating the data imbalance in the studied intrusion detection model. In addition, we propose a novel VGG-ResNet classification algorithm that integrates the convolutional block with an output feature map size of 128 from the Visual Geometry Group 16 (VGG16) network and the residual block with an output feature map size of 256 from the Residual Network 18 (ResNet18). Numerical results on the well-known NSL-KDD and UNSW-NB15 datasets show that our method achieves accuracy rates of 86.31% and 82.56% on the two test datasets, respectively. Moreover, in comparative experiments the present algorithm achieves better accuracy and performance than several existing algorithms proposed in the last three years.
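To make the ADASYN idea concrete, here is a simplified standard-library sketch of adaptive oversampling. It is not the authors' pipeline and not a library implementation. Minority points whose nearest neighbours mostly belong to other classes receive more weight, and new points are interpolated between a weighted base point and a nearby minority point. The function name, the toy data, and the choice to draw base points by weight (rather than allocating a fixed count per point, as the original ADASYN formulation does) are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def adasyn_sketch(X: List[List[float]], y: List[int], minority: int,
                  n_new: int, k: int = 3, seed: int = 0
                  ) -> Tuple[List[List[float]], List[float]]:
    """Return (synthetic minority points, per-minority-point sampling weights).

    Assumes at least two minority points and more than k points in total.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]

    # Step 1: difficulty = share of other-class points among the k nearest neighbours.
    ratios = []
    for p in minority_pts:
        nearest = sorted(
            ((sq_dist(p, q), label) for q, label in zip(X, y) if q is not p),
            key=lambda t: t[0],
        )[:k]
        ratios.append(sum(1 for _, label in nearest if label != minority) / k)
    total = sum(ratios)
    if total == 0:  # no minority point borders another class: use uniform weights
        weights = [1.0 / len(minority_pts)] * len(minority_pts)
    else:
        weights = [r / total for r in ratios]

    # Step 2: interpolate between a weighted base point and a nearby minority point.
    synthetic = []
    for _ in range(n_new):
        base = rng.choices(minority_pts, weights=weights)[0]
        mates = sorted((q for q in minority_pts if q is not base),
                       key=lambda q: sq_dist(base, q))[:k]
        mate = rng.choice(mates)
        lam = rng.random()
        synthetic.append([b + lam * (m - b) for b, m in zip(base, mate)])
    return synthetic, weights


# Toy data: label 0 is the majority class, label 1 the minority class.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 2], [2.2, 2.1], [6, 6], [6, 7], [7, 6]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1]
new_points, weights = adasyn_sketch(X, y, minority=1, n_new=5)
```

In the toy data only the minority point at `[2.2, 2.1]` sits among majority points, so it receives all of the sampling weight and every synthetic point is generated from it. In practice, feature selection such as RFE would typically be applied before this step, and a maintained library implementation would be preferable to this sketch.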