This paper presents a new method for mining the hottest topics on Chinese web pages, based on an improved k-means partitioning algorithm. The dictionary used for word segmentation is reduced by deleting words that are useless for clustering, and a dictionary tree is built from the reduced dictionary, which speeds up word segmentation. Words are coded as integers, so each title is represented as a set of integers, which greatly reduces the space and time cost of clustering. Having to determine the value of k is a shortcoming of stream data mining based on k-means; in the new method, the value of k is adjusted during clustering. As a result, both accuracy and speed are improved.
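The dictionary-tree segmentation and integer coding described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the forward maximum matching strategy, the tiny example dictionary, the helper names (`build_trie`, `segment`, `encode_title`), and the Jaccard similarity used to compare titles are all assumptions, since the abstract does not specify them.

```python
END = "$end"  # hypothetical marker key for "a dictionary word ends here"

def build_trie(words):
    """Build a dictionary tree (trie) as nested dicts, one level per character."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def segment(text, trie):
    """Forward maximum matching: take the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        node, j, longest = trie, i, 0
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                longest = j - i
        if longest:
            tokens.append(text[i:i + longest])
            i += longest
        else:
            i += 1  # character not covered by the reduced dictionary: skip it
    return tokens

def encode_title(title, trie, codes):
    """Represent a title as the set of integer codes of its dictionary words."""
    return frozenset(codes[token] for token in segment(title, trie))

def jaccard(a, b):
    """Set similarity between two coded titles (an assumed choice of measure)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Assumed toy dictionary; a real one would be the reduced segmentation dictionary.
DICTIONARY = ["北京", "奥运", "奥运会", "地震"]
CODES = {word: index for index, word in enumerate(DICTIONARY)}
TRIE = build_trie(DICTIONARY)

title_a = encode_title("北京奥运会开幕", TRIE, CODES)
title_b = encode_title("北京奥运", TRIE, CODES)
```

Because titles become small integer sets, comparing two titles needs only set intersection and union rather than string comparison.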
In this paper, we study set operations on type-2 fuzzy sets. We first discuss the join and meet operations on membership grades of type-2 fuzzy sets under left-continuous t-norms and derive the distributive laws for type-2 fuzzy sets. Then, some properties of compositions of fuzzy relations are discussed. We show that the distributive laws involving union and composition of type-2 fuzzy relations are valid. An example shows that the distributive laws involving intersection and composition fail.
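As a rough illustration of the join and meet operations mentioned above, the sketch below applies the usual extension-principle definitions to membership grades with finite support: the join combines primary grades with max, the meet with min, and secondary grades are combined with a t-norm and aggregated with a supremum. The finite dict representation, the function names, and the default choice of the minimum t-norm are assumptions made for the example.

```python
def extend(f, g, combine, tnorm=min):
    """Extension principle on finite supports:
    result(w) = max over all (u, v) with combine(u, v) == w of tnorm(f(u), g(v))."""
    result = {}
    for u, fu in f.items():
        for v, gv in g.items():
            w = combine(u, v)
            result[w] = max(result.get(w, 0.0), tnorm(fu, gv))
    return result

def join(f, g, tnorm=min):
    """Join of two membership grades: primary grades are combined with max."""
    return extend(f, g, max, tnorm)

def meet(f, g, tnorm=min):
    """Meet of two membership grades: primary grades are combined with min."""
    return extend(f, g, min, tnorm)

# Two assumed membership grades, each mapping a primary grade to a secondary grade.
f = {0.2: 1.0, 0.6: 0.5}
g = {0.4: 0.8}

joined = join(f, g)
met = meet(f, g)
```

Passing a different function as `tnorm` (for example the product `lambda a, b: a * b`) gives the same operations under another t-norm.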
Distribution network cabling planning is a very complex task. This paper proposes applying intelligent decision support technology to power systems. By adding a module library and the concept of a model management system, the Intelligent Power Service System provides intelligent decision support for distribution network cabling planning using dynamic programming, spatial data mining, and decision tree techniques, and has a certain amount of self-learning ability.
This paper presents a reasoning algorithm based on interaction with fuzzy rule matrix transformation and applies it to completing incomplete patterns. The completed patterns are then used in training and synthetic judgment. The investigation shows that the method is effective and may be widely used in reasoning with incomplete knowledge.
ISBN: 9781627480031 (print)
Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches and how that difference can affect their generalization performance. Unlike approaches based on random Fourier features, where the basis functions (i.e., cosine and sine functions) are sampled from a distribution independent of the training data, the basis functions used by the Nyström method are randomly sampled from the training examples and are therefore data dependent. By exploring this difference, we show that when there is a large gap in the eigen-spectrum of the kernel matrix, approaches based on the Nyström method can yield a substantially better generalization error bound than approaches based on random Fourier features. We empirically verify our theoretical findings on a wide range of large data sets.
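The contrast between data-independent and data-dependent basis functions can be made concrete with a small NumPy sketch. It assumes an RBF kernel exp(-gamma * ||x - y||^2); the function names, the value of gamma, and the toy data are illustrative choices, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Exact RBF kernel matrix with entries exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def random_fourier_features(X, n_frequencies, gamma, rng):
    """Data-independent basis: frequencies drawn from the kernel's spectral
    density, N(0, 2 * gamma * I) for this RBF kernel; cosine and sine features."""
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_frequencies))
    proj = X @ W
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(n_frequencies)

def nystrom_features(X, landmarks, gamma):
    """Data-dependent basis: features built from kernel evaluations against
    landmark points sampled from the training set."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    vals, vecs = np.linalg.eigh(K_mm)
    keep = vals > 1e-10  # drop numerically null directions
    return K_nm @ vecs[:, keep] / np.sqrt(vals[keep])

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))  # assumed toy data
gamma = 0.5
K = rbf_kernel(X, X, gamma)

Z_rff = random_fourier_features(X, 5000, gamma, rng)
landmarks = X[rng.choice(len(X), size=10, replace=False)]
Z_nys = nystrom_features(X, landmarks, gamma)

rff_error = np.abs(Z_rff @ Z_rff.T - K).max()
nys_error = np.abs(Z_nys @ Z_nys.T - K).max()
```

In both cases `Z @ Z.T` approximates the kernel matrix; the only structural difference is whether the basis is drawn without looking at `X` (random Fourier features) or from `X` itself (Nyström).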
In Chinese-chess computer games (CCCG), a computer player can find the best move for a given board position using the alpha-beta search algorithm. Iterative deepening is an enhancement to alpha-beta search that helps reduce the size of the game tree. In this paper, we improve the prototypical one-ply iterative deepening (OPID) and propose two-ply iterative deepening (TPID), in which each iteration extends the search by two plies from the previous one, so that an iterated series of 2-ply, 4-ply, 6-ply, … searches is carried out. Experiments validate that TPID is feasible and effective. Applying TPID to minimax search and to alpha-beta search, we found that the total number of nodes generated is reduced in both cases compared with OPID.
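The difference between OPID and TPID comes down to the depth increment of the iterative-deepening loop, which the following toy sketch makes explicit. It is only an illustration under stated assumptions: the game tree is a nested list of integer leaf scores rather than a Chinese-chess position, the static evaluation at the depth cutoff is a placeholder that returns 0, and no information (such as move ordering) is carried over between iterations.

```python
INF = float("inf")

def alphabeta(node, depth, alpha, beta, maximizing, counter):
    """Depth-limited alpha-beta search; counter[0] counts generated nodes."""
    counter[0] += 1
    if isinstance(node, int):
        return node          # terminal position: exact score
    if depth == 0:
        return 0             # placeholder static evaluation (assumption)
    if maximizing:
        value = -INF
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:
                break        # beta cutoff
        return value
    value = INF
    for child in node:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:
            break            # alpha cutoff
    return value

def iterative_deepening(root, max_depth, step):
    """step=1 corresponds to OPID (1, 2, 3, ... plies); step=2 to TPID (2, 4, 6, ...)."""
    counter = [0]
    value = None
    for depth in range(step, max_depth + 1, step):
        value = alphabeta(root, depth, -INF, INF, True, counter)
    return value, counter[0]

# Assumed 4-ply toy tree with integer leaf scores.
TREE = [[[[3, 5], [6, 9]], [[1, 2], [0, -1]]],
        [[[4, 4], [7, 2]], [[8, 1], [5, 6]]]]

opid_value, opid_nodes = iterative_deepening(TREE, 4, step=1)
tpid_value, tpid_nodes = iterative_deepening(TREE, 4, step=2)
```

In this simplified setting the TPID iterations are a subset of the OPID iterations, so TPID generates fewer nodes while reaching the same final-depth value; a real engine that reuses results between iterations would need the empirical comparison reported in the abstract.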
Decision tree induction is one of the useful approaches for extracting classification knowledge from a set of feature-based instances. The most popular heuristic used in decision tree generation is minimum entropy. This heuristic has a serious disadvantage: poor generalization capability [3]. The support vector machine (SVM) is a machine learning classification technique based on statistical learning theory, and it generalizes well. Considering the relationship between the classification margin of the SVM and generalization capability, the large margin of the SVM can be used as the heuristic for the decision tree in order to improve its generalization capability. This paper proposes a decision tree induction algorithm based on a large margin heuristic. Compared with the binary decision tree that uses minimum entropy as the heuristic, the experiments show that generalization capability is improved by the new heuristic.
Determining fuzzy measures from data is an important topic in some practical applications. Computing techniques such as particle swarm optimization (PSO) and the gradient descent algorithm (GD) have been adopted to identify fuzzy measures, but each has limitations. In this paper, we design a hybrid algorithm called GDPSO by introducing GD into PSO for the first time. The algorithm combines the advantages of GD and PSO and avoids their disadvantages. Theoretical analysis and experimental results verify this and show that GDPSO is effective and efficient.
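One plausible way to combine GD with PSO is to follow each particle's PSO move with a single gradient step, as in the sketch below. The abstract does not describe how the two are actually combined, so this scheme, the function name `gd_pso`, all parameter values, and the toy quadratic objective (standing in for a fuzzy-measure fitting error) are assumptions.

```python
import random

def objective(x):
    """Toy objective with its minimum at (1, ..., 1); a stand-in for a fitting error."""
    return sum((xi - 1.0) ** 2 for xi in x)

def gradient(x):
    """Gradient of the toy objective."""
    return [2.0 * (xi - 1.0) for xi in x]

def gd_pso(f, grad, dim, n_particles=10, iters=100,
           w=0.5, c1=1.5, c2=1.5, lr=0.1, seed=0):
    """PSO in which every particle move is followed by one gradient-descent step."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=f)[:]
    for _ in range(iters):
        for i in range(n_particles):
            # Standard PSO velocity and position update (global search).
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Gradient-descent refinement of the new position (local search).
            g = grad(pos[i])
            pos[i] = [p - lr * gi for p, gi in zip(pos[i], g)]
            # Update personal and global bests.
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest, f(gbest)

best_x, best_value = gd_pso(objective, gradient, dim=2)
```

The swarm supplies global exploration while the gradient step speeds up local convergence, which is the general motivation for hybrids of this kind.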
MCS (Minimal Consistent Set) is one of the classical algorithms for the minimal consistent subset selection problem. However, when noisy samples are present, classification accuracy can suffer. In addition, noise affects the size of the minimal consistent set. Removing noise is therefore an important step before sample selection. In this paper, an improved approach based on MCS is proposed for selecting representative samples. Unlike other algorithms, which remove noise by Wilson editing before selecting representative samples, this algorithm performs noise removal and sample selection simultaneously. With this method, most noise can be deleted and the most representative samples can be identified and retained. The experiments show that the proposed method removes a large share of redundant samples and noise and increases accuracy when used for classification tasks.
A new method for solving the convex hull problem in n-dimensional spaces is proposed in this paper. At each step, a new point is added to the convex hull if a linear programming model judges it to lie outside the current convex hull. For the linearly separable classification problem, each instance is regarded as a point in the instance space; if the convex hulls of different classes still do not overlap after a feature is deleted, then that feature can be deleted. Repeating this process yields an algorithm for feature selection. Experimental results show the effectiveness of the algorithm.