This paper presents an improved cluster-oriented decision tree algorithm, ICFDT for short. In this algorithm, the fuzzy C-means clustering algorithm (FCM), run without instance labels, is used to split the nodes and...
ProbSim-Annotation is an image annotation algorithm for heterogeneous data driven by a probability-based similarity assessment. Image annotation consists of associating an image with a description in terms of labels (or words) from a dictionary. This association rests on the premise that similar images have similar annotations. An annotation algorithm is evaluated by the relevance of retrieval from an image database when the query is described by labels from the dictionary. Previous studies have shown probability-based similarity to be very useful in assessing similarity between heterogeneous data by mapping heterogeneous distances into their probability distributions, which can then be estimated from the training set. In this paper, for practical use, the empirical CDF of distances is approximated by a polynomial series. Combining probability-based similarity across multiple attributes/dimensions leads to an overall similarity. This serves as an important cue for transferring annotations from the training set to a test set using a kNN algorithm. Experiments on the Corel5K benchmark dataset show that ProbSim-Annotation is a promising image annotation algorithm.
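The pipeline described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the raw empirical CDF instead of the polynomial approximation, and the product combination rule, the label-voting scheme and all function names (`empirical_cdf`, `fit_similarity`, `annotate`) are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter
from itertools import combinations


def empirical_cdf(samples):
    """Return F, where F(d) is the fraction of training distances <= d."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda d: bisect_right(ordered, d) / n


def fit_similarity(train_items, distance_fns):
    """Estimate one distance CDF per attribute from all training pairs.

    train_items: list of tuples, one entry per attribute.
    distance_fns: one distance function per attribute (may be heterogeneous).
    """
    cdfs = []
    for k, dist in enumerate(distance_fns):
        pair_d = [dist(a[k], b[k]) for a, b in combinations(train_items, 2)]
        cdfs.append(empirical_cdf(pair_d))

    def similarity(x, y):
        # Assumed combination rule: product over attributes of 1 - F(d),
        # i.e. the share of training pairs that are farther apart than d.
        s = 1.0
        for k, dist in enumerate(distance_fns):
            s *= 1.0 - cdfs[k](dist(x[k], y[k]))
        return s

    return similarity


def annotate(query, train_items, train_labels, similarity, k=3, n_labels=2):
    """Transfer the most frequent labels of the k most similar training items."""
    ranked = sorted(range(len(train_items)),
                    key=lambda i: similarity(query, train_items[i]),
                    reverse=True)
    votes = Counter(label for i in ranked[:k] for label in train_labels[i])
    return [label for label, _ in votes.most_common(n_labels)]
```

Mapping each distance through its own CDF puts attributes with different units and ranges on a common [0, 1] scale, which is what allows them to be combined into one overall similarity.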
ISBN: 9781605585161 (print)
Sutton, Szepesvári and Maei (2009) recently introduced the first temporal-difference learning algorithm compatible with both linear function approximation and off-policy training, and whose complexity scales only linearly in the size of the function approximator. Although their gradient temporal difference (GTD) algorithm converges reliably, it can be very slow compared to conventional linear TD (on on-policy problems where TD is convergent), calling into question its practical utility. In this paper we introduce two new related algorithms with better convergence rates. The first algorithm, GTD2, is derived and proved convergent just as GTD was, but uses a different objective function and converges significantly faster (but still not as fast as conventional TD). The second new algorithm, linear TD with gradient correction, or TDC, uses the same update rule as conventional TD except for an additional term which is initially zero. In our experiments on small test problems and in a computer Go application with a million features, the learning rate of this algorithm was comparable to that of conventional TD. This algorithm appears to extend linear TD to off-policy learning with no penalty in performance while only doubling computational requirements.
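The two update rules named in this abstract can be sketched in plain Python as they are usually stated for linear function approximation. The variable names (`theta` for the value weights, `w` for the secondary weights, step sizes `alpha` and `beta`) are this sketch's notation, and importance-sampling corrections for off-policy data are left out for brevity.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def tdc_update(theta, w, phi, reward, phi_next, alpha, beta, gamma):
    """One TDC step on the transition (phi, reward, phi_next)."""
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    phi_w = dot(phi, w)
    # Conventional TD term plus the gradient correction term.
    # The correction vanishes while w is still zero.
    new_theta = [t + alpha * (delta * f - gamma * fn * phi_w)
                 for t, f, fn in zip(theta, phi, phi_next)]
    new_w = [wi + beta * (delta - phi_w) * f for wi, f in zip(w, phi)]
    return new_theta, new_w


def gtd2_update(theta, w, phi, reward, phi_next, alpha, beta, gamma):
    """One GTD2 step; shares the w update with TDC."""
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    phi_w = dot(phi, w)
    new_theta = [t + alpha * (f - gamma * fn) * phi_w
                 for t, f, fn in zip(theta, phi, phi_next)]
    new_w = [wi + beta * (delta - phi_w) * f for wi, f in zip(w, phi)]
    return new_theta, new_w
```

Both rules keep a second weight vector of the same size as the first, which is where the doubling of computational requirements comes from. With `w` initialized to zero, the first TDC step coincides with a conventional TD(0) step.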
Ontology mapping is widely used in ontology applications, but similarity calculation is a thorny issue in the mapping process. In this paper, the different elements of ontology are considered...
ISBN: 9780769538877 (print)
The support vector machine (SVM) is sensitive to noise and outliers. To reduce their effect, we propose a novel SVM that suppresses the error function: the error function is limited to the interval [0, 1]. The separating hypersurface is simplified and its margin is widened. Experimental results show that the proposed method simultaneously increases the classification efficiency and the generalization ability of the SVM.
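The abstract does not give the form of the bounded error function. As an illustration only, the sketch below compares the standard hinge loss with the ramp (truncated hinge) loss, a well-known loss whose range is [0, 1]; treating it as a stand-in for the authors' error function is an assumption.

```python
def hinge_loss(y, score):
    """Standard SVM hinge loss for a label y in {-1, +1}; unbounded above."""
    return max(0.0, 1.0 - y * score)


def ramp_loss(y, score):
    """Hinge loss clipped to [0, 1] (assumed stand-in for a bounded error)."""
    return min(1.0, hinge_loss(y, score))


# A badly misclassified outlier dominates the hinge loss,
# but contributes at most 1 under the clipped loss.
outlier_hinge = hinge_loss(+1, -5.0)   # 6.0
outlier_ramp = ramp_loss(+1, -5.0)     # 1.0
```

Capping the per-sample error is one way to keep a few extreme points from pulling the decision boundary, which matches the motivation stated in the abstract.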
There are a large number of accessible deep Web sites on the Internet. However, even an identical entity has different representation formats on different Web sites, so entity identification plays a crucial role in deep Web data mining. This paper proposes an entity identification method for the domain of Chinese books. First, improved Jaccard coefficients are used to calculate the similarity of text attributes. Second, AHP (analytic hierarchy process) is used to obtain the attribute weights, and the weighted sum is used to calculate the entity similarity. Finally, duplicate entities are integrated to complete the entity identification. The experimental results demonstrate that the approach achieves higher accuracy with good feasibility.
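A minimal sketch of the first two steps follows. It uses the plain character-level Jaccard coefficient, since the abstract does not specify the improved variant, and approximates AHP weights with the row geometric mean method. The function names and the dictionary-based record layout are hypothetical.

```python
import math


def jaccard(a, b):
    """Plain Jaccard coefficient on the character sets of two strings."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def ahp_weights(pairwise):
    """Approximate AHP priorities from a pairwise comparison matrix.

    pairwise[i][j] states how much more important attribute i is than j.
    Uses the row geometric mean method, then normalizes to sum to 1.
    """
    n = len(pairwise)
    means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(means)
    return [m / total for m in means]


def entity_similarity(rec_a, rec_b, attrs, weights):
    """Weighted sum of per-attribute text similarities."""
    return sum(w * jaccard(rec_a[k], rec_b[k])
               for k, w in zip(attrs, weights))
```

Two records whose `entity_similarity` exceeds a chosen threshold would then be treated as the same book and merged; the threshold itself is not given in the abstract.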
This paper presents a new method for mining the hottest topics on Chinese webpages, based on an improved k-means partitioning algorithm. The dictionary used for word segmentation is reduced by deleting words that are useless for clustering, and a dictionary tree is built for word segmentation, which speeds up segmentation. Words are coded to create a correspondence between words and integers. Each title is then expressed as a set of integers, which greatly reduces the space and time cost of clustering. Determining the value of k is a shortcoming of stream data mining based on k-means. With the new method, the value of k is adjusted during clustering, so both accuracy and speed are improved.
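The word-coding step can be sketched as below, assuming titles have already been segmented into words. The helper names and the use of Jaccard distance between integer sets are assumptions; the abstract does not state which distance the clustering uses or how k is adjusted.

```python
def encode_titles(segmented_titles):
    """Map each distinct word to an integer and each title to an integer set."""
    codes = {}
    encoded = []
    for words in segmented_titles:
        ids = set()
        for word in words:
            # A new word receives the next unused integer code.
            ids.add(codes.setdefault(word, len(codes)))
        encoded.append(frozenset(ids))
    return codes, encoded


def set_distance(a, b):
    """Jaccard distance between two integer sets (assumed distance measure)."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)
```

Comparing small integer sets avoids repeated string comparisons during clustering, which is the space and time saving the abstract refers to.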
This paper presents a reasoning algorithm based on interaction with fuzzy rule matrix transformation and applies it to completing incomplete patterns. The completed patterns are then used in training and synthetic judgment. The investigation shows that the method is effective and may be widely used in reasoning with incomplete knowledge.
Distribution network cabling planning is a very complex project. This paper proposes applying intelligent decision support technology to power systems. By adding a module library and the concept of model management systems, the Intelligent Power Service System realizes intelligent decision support for distribution network cabling planning using dynamic programming, spatial data mining and decision tree techniques, and has a certain self-learning ability.
Decision tree induction is a useful approach for extracting classification knowledge from a set of feature-based instances. The most popular heuristic used in decision tree generation is minimum entropy. This heuristic has a serious disadvantage: poor generalization capability [3]. The support vector machine (SVM) is a machine learning classification technique based on statistical learning theory, and it generalizes well. Given the relationship between the classification margin of the SVM and its generalization capability, the large margin of the SVM can be used as the heuristic for a decision tree in order to improve its generalization. This paper proposes a decision tree induction algorithm based on a large margin heuristic. Compared with a binary decision tree that uses minimum entropy as the heuristic, the experiments show that the new heuristic improves generalization capability.
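For reference, the minimum-entropy baseline that the paper compares against can be sketched for a single numeric feature as follows. This shows only the conventional heuristic, not the proposed large margin heuristic, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_entropy(values, labels, threshold):
    """Size-weighted entropy of the two parts produced by x <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return sum(len(part) / n * entropy(part) for part in (left, right) if part)


def best_threshold(values, labels):
    """Pick the threshold with minimum split entropy.

    Assumes the feature takes at least two distinct values.
    """
    candidates = sorted(set(values))[:-1]
    return min(candidates, key=lambda t: split_entropy(values, labels, t))
```

A large margin heuristic would replace `split_entropy` with a score derived from the margin of an SVM trained at the node; how that score is computed is not described in the abstract.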