Standard binary crossover operators such as uniform and one-point crossover are referred to as being “geometric” since they always generate an offspring between its two parents under the Hamming distance. That is, the sum of the Hamming distances from the offspring to its two parents is the same as the Hamming distance between the two parents. In our previous studies, we proposed the probabilistic use of a non-geometric binary crossover operator in evolutionary multiobjective optimization (EMO) algorithms to increase the spread of solutions along the Pareto front in the objective space. Our crossover operator generates an offspring outside its two parents with respect to the Hamming distance. That is, the distance from the offspring to one parent is larger than the distance between the two parents. In this paper, we use our crossover operator as mutation to further examine its effects on the behavior of EMO algorithms. Experimental results show that the use of our crossover operator as mutation improves the performance of NSGA-II on two-objective knapsack problems by increasing the spread of solutions along the Pareto front. Good results, however, are not obtained when only our crossover operator is used in NSGA-II. The best results are obtained when both our non-geometric crossover and the standard uniform crossover are used.
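The geometric property described in this abstract can be checked numerically. The following is a minimal sketch, not the authors' implementation: `uniform_crossover` is the standard operator, while `outside_offspring` is a hypothetical stand-in for a non-geometric operator (it keeps the first parent's bits and flips some positions where the parents agree); the function names and the `flip_prob` parameter are assumptions made for illustration.

```python
import random

def hamming(a, b):
    """Number of positions at which two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def uniform_crossover(p1, p2, rng):
    """Standard uniform crossover: each bit is copied from either parent."""
    return [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]

def outside_offspring(p1, p2, rng, flip_prob=0.5):
    """Hypothetical non-geometric operator (an assumption, not the paper's):
    start from p1 and flip bits only where both parents agree, which moves
    the child away from p2 rather than between the parents."""
    child = list(p1)
    for i, (x, y) in enumerate(zip(p1, p2)):
        if x == y and rng.random() < flip_prob:
            child[i] = 1 - x
    return child

rng = random.Random(0)
p1 = [rng.randint(0, 1) for _ in range(20)]
p2 = [rng.randint(0, 1) for _ in range(20)]

geo = uniform_crossover(p1, p2, rng)
# Geometric: d(child, p1) + d(child, p2) == d(p1, p2)
print(hamming(geo, p1) + hamming(geo, p2) == hamming(p1, p2))

out = outside_offspring(p1, p2, rng)
# Non-geometric: d(child, p2) == d(p1, p2) + d(child, p1) >= d(p1, p2)
print(hamming(out, p2) == hamming(p1, p2) + hamming(out, p1))
```

In this sketch every flipped bit adds one to the distance from the second parent, so the child lies strictly outside the parents whenever at least one bit is flipped.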
Multiobjectivization is an interesting idea for solving a difficult single-objective optimization problem through its reformulation as a multiobjective problem. The reformulation is performed by introducing an additional objective function or decomposing the original objective function into multiple ones. Evolutionary multiobjective optimization (EMO) algorithms are often used to solve the reformulated problem. This approach has been used to solve difficult single-objective problems in many studies. In this paper, we discuss the use of multiobjectivization to solve two-objective problems. That is, we discuss the idea of solving a two-objective optimization problem by reformulating it as a four-objective one. In general, increasing the number of objectives makes the problem more difficult for EMO algorithms. Thus, handling two-objective problems as four-objective ones may simply degrade the quality of the obtained non-dominated solutions. However, in this paper, we demonstrate through computational experiments that better results are obtained for some two-objective test problems by increasing the number of objectives from two to four.
Evolutionary algorithms have been actively applied to knowledge discovery, data mining and machine learning under the name of genetics-based machine learning (GBML). The main advantage of using evolutionary algorithms in those application areas is their flexibility: various knowledge extraction criteria such as accuracy and complexity can be easily utilized as fitness functions. On the other hand, the main disadvantage is their heavy computational load. It is not easy to apply evolutionary algorithms to large data sets. Improving scalability to large data sets is one of the main research issues in GBML. In our previous studies, we proposed the idea of a parallel distributed implementation of GBML and examined its effectiveness for genetic fuzzy rule selection. The point of our idea was to realize a quadratic speed-up by dividing not only the population but also the training data. Training data subsets were periodically rotated over sub-populations in order to prevent each sub-population from over-fitting to a specific training data subset. In this paper, we propose the use of the parallel distributed implementation for the design of ensemble classifiers. An ensemble classifier is designed by combining base classifiers, one obtained from each sub-population. Through computational experiments on parallel distributed genetic fuzzy rule selection, we examine the generalization ability of the designed ensemble classifiers under various settings with respect to the size of training data subsets and their rotation frequency.
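The rotation scheme and the source of the quadratic speed-up can be sketched as follows. This is an illustrative sketch only: the function names, the round-robin split, and the cyclic shift are assumptions, since the abstract does not specify how subsets are formed or rotated.

```python
def split_round_robin(items, n_parts):
    """Divide a list into n_parts subsets (assumed round-robin split)."""
    return [items[i::n_parts] for i in range(n_parts)]

def subset_for_island(island, generation, n_islands, rotation_interval):
    """Index of the training data subset used by a sub-population.
    Assumed scheme: every rotation_interval generations, each
    sub-population moves on to the next subset, cyclically."""
    shift = (generation // rotation_interval) % n_islands
    return (island + shift) % n_islands

def evaluations_per_island(pop_size, n_patterns, n_islands):
    """Rule-pattern evaluations per generation on one island: both the
    population and the data shrink by a factor of n_islands, hence 1/n^2."""
    return (pop_size // n_islands) * (n_patterns // n_islands)

population = list(range(210))   # placeholder individuals
patterns = list(range(7000))    # placeholder training patterns
n = 7
islands = split_round_robin(population, n)
subsets = split_round_robin(patterns, n)

# Sub-population 0 sees subset 0 first, then subset 1 after one rotation.
print(subset_for_island(0, 0, n, 100), subset_for_island(0, 100, n, 100))
# Cost on one island relative to a single non-parallel run.
print(evaluations_per_island(210, 7000, n) / (210 * 7000))
```

With seven sub-populations, each processor in this sketch performs 1/49 of the evaluations of the non-parallel run per generation, which is the quadratic speed-up the abstract refers to.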
A high frequency transformer is a critical component in a dual active bridge converter (DAB) used in a power electronics-based solid state transformer. Operation of a DAB converter requires its transformer to have a s...
This paper considers cluster validation for fuzzy clustering with noise rejection. Although noise rejection mechanisms such as noise fuzzy clustering or graded possibilistic noise rejection make it possible to remove the influence of noisy samples, they also create problems in applying conventional validity measures designed for fuzzy clustering with probabilistic constraints. In this paper, a PCA-guided validation approach is developed, in which a rotated optimal cluster indicator is derived in a fuzzy PCA-guided manner, considering responsibility weights for c-means clustering. The deviation between the current solution and the optimal solution is estimated through a Procrustean transformation. Several experimental results demonstrate that the proposed validation approach works well for selecting both the optimal initialization and the cluster number.
Tens of thousands of classifiers have been proposed so far. There is no best classifier among them. It is said that the performance of each classifier strongly depends on the data sets used for comparison. In recent years, a number of data complexity measures have been proposed to characterize each data set. The aim of this study is to develop a framework for selecting an appropriate classifier and/or its appropriate parameter specification among candidate classifiers based on data complexity measures. Such a framework would make it possible to clarify the domain of competence of classifiers. As a preliminary study, we propose an appropriate granularity specification method for fuzzy classifier design. First, we examine the relation between the performance of classifiers with different granularities and the data complexity of artificial data sets. Next, we extract if-then rule-based knowledge from the classification results on the artificial data sets.
Data mining is a very active and rapidly growing research area in the field of computer science. Its goal is to obtain useful knowledge for users from a database. Association rule mining from a database is one of the most well-known data mining techniques. In general, a large number of if-then rules are extracted by specifying minimum support and confidence levels. Such a large rule set, however, is too complicated for users to understand at one time. Multiobjective genetic fuzzy rule selection from Pareto-optimal and near Pareto-optimal rules is a promising approach which can obtain an accurate and simple rule set by considering accuracy maximization and complexity minimization. In this paper, we propose two extensions of multiobjective genetic fuzzy rule selection for designing more accurate fuzzy rule-based classifiers. One extension is to add rules compatible with misclassified patterns to the candidate rules for genetic fuzzy rule selection. The other is to tune membership functions after genetic fuzzy rule selection. We examine the effects of these extensions through computational experiments on imbalanced data sets.
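The role of the minimum support and confidence levels can be illustrated with the standard crisp definitions. The sketch below is an assumption-laden simplification: the abstract concerns fuzzy rules, whose support and confidence are computed from compatibility grades rather than set membership, and the function names and toy transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    s_a = support(transactions, antecedent)
    if s_a == 0:
        return 0.0  # rule never fires; treat confidence as zero
    return support(transactions, set(antecedent) | set(consequent)) / s_a

def passes_thresholds(transactions, antecedent, consequent,
                      min_support, min_confidence):
    """Keep a rule only if it meets both user-specified minimum levels."""
    both = set(antecedent) | set(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)

# Toy data: four transactions over the items a, b, c.
data = [{"a", "b"}, {"a"}, {"a", "b", "c"}, {"b"}]
print(support(data, {"a", "b"}))
print(confidence(data, {"a"}, {"b"}))
print(passes_thresholds(data, {"a"}, {"b"}, 0.4, 0.6))
```

Lowering either threshold admits more rules, which is why the extracted rule set tends to grow too large for users to inspect directly.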
Fuzzy c-means (FCM) clustering is a method for partitioning data into clusters by minimizing an objective function. Therefore, it is important to devise an objective function from which a simple clustering algorithm can be derived. An entropy term was introduced by S. Miyamoto in the FCM objective function. We proposed an objective function for the fuzzy counterpart of Gaussian mixture model (GMM) clustering. The objective function is based on the Kullback-Leibler (K-L) divergence instead of the entropy. S. Miyamoto derived a hard clustering algorithm by linearizing the K-L divergence term of the objective function. In the hard c-means (HCM) clustering approach, covariance matrices are decision variables. For quick and stable convergence of FCM-like clustering, this paper proposes a semi-hard clustering approach that constrains the membership to an interval [a, b]. The semi-hard clustering result is used for classifier design. The membership function suggested by the generalized FCM and the K-L based FCM is used for the classifier. The values of the hyperparameters are searched by particle swarm optimization (PSO). In terms of classification performance on UCI benchmark data, the classifier is comparable to the support vector machine (SVM) and surpasses the k-nearest neighbor (k-NN) classifier. The computation time of the FCM classifier for training on a large-scale data set is smaller than that of the SVM classifier with decomposition and working set selection algorithms.
In cluster file systems, metadata management is critical to the whole system. Past research has mainly focused on journaling, which alone is not enough to provide a highly available metadata service. Some others try to use...
Cyber-physical systems (CPSs) are deeply embedded infrastructures that have significant cyber and physical components that interact with each other in complex ways. These interactions can violate a system's security policy, leading to unintended information flow. The physical portion of such systems is inherently observable, and, as such, many methods of preserving confidentiality are not applicable. This fundamental property of CPSs presents new security challenges. To illustrate this, a model of a vehicle composed of an embedded computer system, its operator, and its environment shows how information is disclosed to an observer watching from the outside. The example is made up of a vehicle with an automated engine management system (smart cruise control) traveling across some terrain with an observer watching the vehicle. The information to be protected is the controller of the vehicle. This model is analyzed using formal models of information flow, namely nondeducibility and noninference. The vehicle's operation, in the context of the terrain of the road, discloses information to the observer. Context is important; the same information that is disclosed with one terrain type is hidden with a different terrain. The problem, its methodology, and the results uncover both problems and solutions, based on the theory of information flow, for quantifying security in these new types of systems.