Patent classification is a large-scale hierarchical text classification (LSHTC) task. Although learning algorithms and feature selection strategies have been compared comprehensively in the text categorization field, little work has been done on LSHTC tasks because of their high computational cost and complicated label structure. For the first time, this paper compares two popular learning frameworks, hierarchical support vector machine (SVM) and k-nearest neighbor (k-NN), on an LSHTC task. Our experimental results show that k-NN outperforms hierarchical SVM on the LSHTC task, which differs markedly from existing results for ordinary text categorization tasks. In addition, this paper compares different similarity measures and ranking strategies within the k-NN framework for the LSHTC task. Our empirical study leads to three conclusions. First, k-NN is more appropriate for the LSHTC task than hierarchical SVM. Second, BM25 outperforms the other similarity measures, and ListWeak performs better than the other ranking strategies. Third, using all the labels of the retrieved neighbors remarkably improves classification performance over using only the first label of each retrieved neighbor.
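The k-NN setup described above can be illustrated with a minimal sketch: training documents are scored against a query with BM25, the top k neighbors are retrieved, and their labels are ranked. Everything here is an assumption for illustration only: the function names, the toy documents and labels, the BM25 parameters `k1` and `b`, and the score-sum vote used to rank labels. In particular, the vote is a simple stand-in and is not the ListWeak strategy, which the abstract does not define.

```python
import math
from collections import Counter, defaultdict


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Return one BM25 score per training document for the query.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def knn_predict(query_tokens, docs_tokens, docs_labels, k=3, use_all_labels=True):
    """Rank labels by the summed BM25 scores of the k nearest neighbors.

    The score-sum vote is an assumed ranking rule, not ListWeak.
    """
    scores = bm25_scores(query_tokens, docs_tokens)
    order = sorted(range(len(docs_tokens)), key=lambda i: scores[i], reverse=True)
    # Keep the k best neighbors that share at least one term with the query.
    top = [i for i in order[:k] if scores[i] > 0]
    votes = defaultdict(float)
    for i in top:
        # Either every label of the neighbor, or only its first label.
        labels = docs_labels[i] if use_all_labels else docs_labels[i][:1]
        for label in labels:
            votes[label] += scores[i]
    return sorted(votes, key=lambda label: votes[label], reverse=True)


# Toy corpus; the tokens and label codes are hypothetical examples.
docs_tokens = [
    ["engine", "fuel", "combustion"],
    ["engine", "piston", "valve"],
    ["protein", "sequence", "gene"],
]
docs_labels = [["F02B", "F02M"], ["F01L", "F02B"], ["C12N"]]
query = ["engine", "fuel"]

ranked_all = knn_predict(query, docs_tokens, docs_labels, k=2, use_all_labels=True)
ranked_first = knn_predict(query, docs_tokens, docs_labels, k=2, use_all_labels=False)
```

In this toy case, using all neighbor labels lets a label shared by both neighbors accumulate score from each, and it also surfaces secondary labels that the first-label-only variant never sees.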
In our previous work, an elliptical ultrasonic assisted grinding (EUAG) method was proposed and the experimental works including the establishment of experimental apparatus and the preliminary experimental results wer...
Hierarchies are very popular in organizing documents and web pages, hence automated hierarchical classification techniques are desired. However, the current dominant hierarchical approach of top-down method suffers ac...
An elliptical ultrasonic assisted grinding method is proposed for the machining of sapphire substrate in order to improve the machining efficiency and the work-surface quality. An elliptical ultrasonic vibrator is des...
It is not trivial to tune the swarm behavior just by parameter setting because of the randomness, complexity and dynamic involved in particle swarm optimizer (PSO). Hundreds of variants in the literature of last decad...
This paper presents a system which adopts a standard sequence labeling technique for hedge detection and scope finding. For the first task, hedge detection, we formulate it as a hedge labeling problem, while for the s...
This paper describes a statistical machine translation system for our participation for the WMT10 shared task. Based on MOSES, our system is capable of translating German, French and Spanish into English. Our main con...
Support Vector machine (SVM) is a classification technique of machine learning based on statistical learning theory. A quadratic optimization problem needs to be solved in the algorithm, and with the increase of the s...