This paper presents a new method for mining the hottest topics on Chinese webpages, based on an improved k-means partitioning algorithm. The dictionary used for word segmentation is reduced by deleting words that are useless for clustering, and a dictionary tree is built from it, which speeds up word segmentation. Words are coded so that each word corresponds to an integer; each title is then expressed as a set of integers, which greatly reduces the space and time cost of clustering. Having to determine the value of k in advance is a shortcoming of stream data mining based on k-means. In the new method, the value of k is adjusted during clustering, so both accuracy and speed are improved.
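The word coding and the adjustable k described above can be illustrated with a minimal sketch. It is not the authors' implementation: the Jaccard similarity, the `threshold` parameter, and the rule of opening a new cluster when no existing cluster is similar enough are assumptions standing in for the unspecified improved k-means, and the sample titles are invented.

```python
def build_codebook(tokenized_titles):
    """Assign each distinct word an integer code, in first-seen order."""
    codebook = {}
    for words in tokenized_titles:
        for word in words:
            if word not in codebook:
                codebook[word] = len(codebook)
    return codebook


def encode(words, codebook):
    """Express a segmented title as a set of integer word codes."""
    return frozenset(codebook[w] for w in words if w in codebook)


def jaccard(a, b):
    """Set similarity (an assumed measure; the abstract does not name one)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_titles(encoded_titles, threshold=0.3):
    """Grow the number of clusters during clustering instead of fixing k.

    Each cluster is a list of title indices; its first member acts as the
    representative. A title joins the most similar cluster, or opens a new
    one when no representative reaches the (hypothetical) threshold.
    """
    clusters = []
    for i, title in enumerate(encoded_titles):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = jaccard(title, encoded_titles[cluster[0]])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Invented, already-segmented titles for illustration only.
titles = [
    ["stock", "market", "falls"],
    ["stock", "market", "rises"],
    ["team", "wins", "final"],
]
codebook = build_codebook(titles)
encoded = [encode(t, codebook) for t in titles]
clusters = cluster_titles(encoded)
```

Comparing small integer sets avoids repeated string comparisons, which is the space and time saving the abstract refers to.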
Distribution network cabling planning is a very complex project. This paper proposes applying intelligent decision support technology to power systems. By adding a module library and the concept of a model management system, the Intelligent Power Service System provides intelligent decision support for distribution network cabling planning using dynamic programming, spatial data mining, and decision tree techniques, and it has some self-learning ability.
This paper presents a reasoning algorithm based on interaction with fuzzy rule matrix transformation and applies it to completing incomplete patterns. The completed patterns are then used in training and synthetic judgment. The investigation shows that the method is effective and may be widely used in reasoning with incomplete knowledge.
Decision tree induction is a useful approach for extracting classification knowledge from a set of feature-based instances. The most popular heuristic used in decision tree generation is minimum entropy. This heuristic has a serious disadvantage: poor generalization capability [3]. The support vector machine (SVM) is a machine learning classification technique based on statistical learning theory, and it generalizes well. Considering the relationship between the classification margin of an SVM and its generalization capability, the large margin of the SVM can be used as the heuristic for a decision tree in order to improve its generalization capability. This paper proposes a decision tree induction algorithm based on a large margin heuristic. Compared with a binary decision tree that uses minimum entropy as the heuristic, the experiments show that the new heuristic improves generalization capability.
In a Chinese-chess computer game (CCCG), a computer player can find the best move for a given board position using the alpha-beta search algorithm. Iterative deepening is an enhancement to alpha-beta search that helps reduce the size of the game tree. In this paper, we improve on the prototypical one-ply iterative deepening (OPID) and propose two-ply iterative deepening (TPID). In game tree searching, each iteration extends the search by two plies beyond the previous one, so an iterated series of 2-ply, 4-ply, 6-ply, … searches is carried out. The experiments validate that TPID is feasible and effective. Applying TPID to minimax search and to alpha-beta search, we found that the total numbers of nodes generated by TPID minimax search and TPID alpha-beta search are both reduced compared with OPID.
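The difference between the OPID and TPID depth schedules can be sketched as follows. This is only an illustration on a hypothetical toy game, not Chinese chess: `children` and `evaluate` are invented stand-ins, and nothing is shared between iterations, so the node savings shown here come purely from skipping the odd depths.

```python
import math


def children(state):
    # Hypothetical game: three legal moves in every position.
    return [state + (m,) for m in range(3)]


def evaluate(state):
    # Hypothetical static evaluation from the side to move's point of view:
    # a deterministic pseudo-score in [-10, 10] derived from the move path.
    return (sum((i + 1) * m for i, m in enumerate(state)) * 37) % 21 - 10


def alphabeta(state, depth, alpha, beta, counter):
    """Negamax alpha-beta search; counter[0] accumulates generated nodes."""
    counter[0] += 1
    if depth == 0:
        return evaluate(state)
    best = -math.inf
    for child in children(state):
        score = -alphabeta(child, depth - 1, -beta, -alpha, counter)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff: the opponent will avoid this line
    return best


def iterative_deepening(root, max_depth, step):
    """Search depths step, 2*step, ... up to max_depth.

    step=1 gives the one-ply schedule (OPID); step=2 gives the
    two-ply schedule (TPID) of 2-ply, 4-ply, 6-ply, ... searches.
    """
    counter = [0]
    value = None
    depths = list(range(step, max_depth + 1, step))
    for depth in depths:
        value = alphabeta(root, depth, -math.inf, math.inf, counter)
    return value, counter[0], depths


opid_value, opid_nodes, opid_depths = iterative_deepening((), 6, step=1)
tpid_value, tpid_nodes, tpid_depths = iterative_deepening((), 6, step=2)
```

Both schedules end with the same full-window 6-ply search, so they return the same value. A practical engine would also carry move ordering or a transposition table from one iteration to the next, which this sketch omits.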
Determining a fuzzy measure from data is an important topic in some practical applications. Computing techniques such as particle swarm optimization (PSO) and the gradient descent algorithm (GD) have been adopted to identify fuzzy measures, but each has limitations. In this paper, we design a hybrid algorithm called GDPSO by introducing GD into PSO for the first time. The algorithm combines the advantages of GD and PSO while avoiding their disadvantages. Theoretical analysis and experimental results verify this and show that GDPSO is effective and efficient.
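One way to combine the two optimizers is sketched below. The abstract does not say how GD is introduced into PSO, so the scheme here (a standard PSO update followed by one gradient step on the global best) is an assumption, and the quadratic `objective` is a toy stand-in for the fuzzy measure identification error. All parameter values are illustrative.

```python
import random


def objective(x):
    # Toy objective with its minimum at (1, ..., 1); not a fuzzy measure error.
    return sum((xi - 1.0) ** 2 for xi in x)


def gradient(x):
    return [2.0 * (xi - 1.0) for xi in x]


def gd_pso(dim=2, n_particles=10, iters=50, w=0.7, c1=1.5, c2=1.5,
           lr=0.1, seed=0):
    """PSO with a gradient descent refinement of the global best (assumed hybrid)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # PSO part: global exploration driven by personal and global bests.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

        # GD part: local refinement of the current global best.
        candidate = [x - lr * gx for x, gx in zip(gbest, gradient(gbest))]
        candidate_val = objective(candidate)
        if candidate_val < gbest_val:
            gbest, gbest_val = candidate, candidate_val

    return gbest, gbest_val


best_position, best_value = gd_pso()
```

In this arrangement PSO supplies the global search and the gradient step supplies fast local convergence, which matches the complementary roles the abstract attributes to the two methods.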
Classification based on association rules is a common and easily understood algorithm for text classification. To improve its classification accuracy, the key is to generate more effective rules. Sometimes, it will ov...
This paper discusses how to reduce computational complexity in decision tree generation for numerical-valued attributes. The proposed method is based on partition impurity. Partition impurity minimization is used to select the expanded attribute for generating sub-nodes during tree growth. After introducing the unstable cut-points of numerical attributes, it is proved analytically that the minimum partition impurity is always attained at an unstable cut-point. This implies that stable cut-points need not be evaluated during tree growth. Since stable cut-points far outnumber unstable cut-points, the experimental results show that the proposed method greatly reduces the computational cost.
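The saving can be illustrated with a small sketch under two assumptions that the abstract does not confirm: an unstable cut-point is taken to be a midpoint between adjacent attribute values whose examples do not all share one class, and partition impurity is taken to be the size-weighted class entropy of the two sides. The data are invented.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition_impurity(pairs, cut):
    """Size-weighted entropy of the two sides of a cut (assumed impurity)."""
    left = [y for x, y in pairs if x <= cut]
    right = [y for x, y in pairs if x > cut]
    n = len(pairs)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)


def cut_points(pairs):
    """Return (all_cuts, unstable_cuts) as midpoints of adjacent distinct values."""
    classes = {}
    for x, y in pairs:
        classes.setdefault(x, set()).add(y)
    values = sorted(classes)
    all_cuts, unstable = [], []
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2
        all_cuts.append(cut)
        # Assumed definition: the cut is unstable when the examples on its
        # two neighbouring values do not all belong to a single class.
        if len(classes[lo] | classes[hi]) > 1:
            unstable.append(cut)
    return all_cuts, unstable


# Invented (attribute value, class label) pairs for one numerical attribute.
data = [(1, "a"), (2, "a"), (3, "a"), (4, "b"),
        (5, "b"), (6, "a"), (7, "a"), (8, "a")]
all_cuts, unstable_cuts = cut_points(data)
best_over_all = min(partition_impurity(data, c) for c in all_cuts)
best_over_unstable = min(partition_impurity(data, c) for c in unstable_cuts)
```

On this sample only two of the seven candidate cut-points are unstable, and evaluating just those two yields the same minimum impurity as evaluating all seven.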
Feature selection is an essential technique in data mining and machine learning. Many feature selection methods have been studied for supervised problems, but feature selection for unsupervised learning is rarely studied. In this paper, we propose an approach to selecting features for unsupervised problems. First, the original features are clustered according to their degree of relevance, defined by mutual information. Then the most informative feature is selected from each cluster based on the contribution-information of each feature. The experimental results show that the proposed method can match some popular supervised feature selection methods, and that the selected features retain most of the information contained in the full set of original features.
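The two stages can be sketched for discrete features as follows. The greedy grouping rule, the `threshold` value, and the use of summed mutual information within a cluster as a stand-in for the paper's contribution-information are all assumptions made for illustration, and the feature columns are invented.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete feature columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def cluster_features(features, threshold=0.5):
    """Greedy grouping: a feature joins the first cluster whose first member
    shares at least `threshold` bits with it, else it opens a new cluster."""
    clusters = []
    for name in features:
        for cluster in clusters:
            if mutual_information(features[name], features[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def select_representatives(features, clusters):
    """Pick from each cluster the feature sharing the most information with
    the cluster's other members (assumed proxy for contribution-information)."""
    selected = []
    for cluster in clusters:
        def score(name):
            return sum(mutual_information(features[name], features[other])
                       for other in cluster if other != name)
        selected.append(max(cluster, key=score))
    return selected


# Invented discrete features: f2 is the inverse of f1, f3 is weakly related.
features = {
    "f1": [0, 0, 1, 1, 0, 1, 0, 1],
    "f2": [1, 1, 0, 0, 1, 0, 1, 0],
    "f3": [0, 1, 0, 1, 0, 1, 0, 1],
}
clusters = cluster_features(features)
selected = select_representatives(features, clusters)
```

No class labels are used at any point, which is what makes the selection unsupervised; redundant features such as `f1` and `f2` end up in one cluster and contribute a single representative.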
It has been shown that the fuzzy integral is an effective tool for the fusion of multiple classifiers. Of primary importance in developing such a system is the choice of the fuzzy measure, which embodies the importance of subsets of classifiers. In this paper, we propose a method for a dynamic fuzzy measure that changes with the pattern to be classified (that is, it is data dependent). The method uses a neural network, which has good learning ability. Our experimental results show that this method improves classification accuracy.