Authors:
Weinberger, Nir; Feder, Meir
MIT, IDSS, 77 Massachusetts Ave, Cambridge, MA 02139, USA
MIT, LIDS, 77 Massachusetts Ave, Cambridge, MA 02139, USA
Tel Aviv Univ, Sch Elect Engn, IL-69978 Tel Aviv, Israel
Details
ISBN:
(Print) 9781728131511
The k-vectors algorithm for learning regression functions proposed here is akin to the well-known k-means algorithm. Both algorithms partition the feature space, but unlike the k-means algorithm, the k-vectors algorithm aims to reconstruct the response rather than the feature. The partitioning rule of the algorithm is based on maximizing the correlation (inner product) of the feature vector with a set of k vectors, and it generates polyhedral cells similar to those generated by the nearest-neighbor rule of the k-means algorithm. As in k-means, the learning algorithm alternates between two types of steps. In the first type of step, k labels are determined via a centroid-type rule (in the response space), which uses a hinge-type surrogate for the mean squared error loss. In the second type of step, the k vectors that determine the partition are updated according to a multiclass classification rule, in the spirit of support vector machines. It is proved that both steps of the algorithm require only solving convex optimization problems, and that the algorithm is empirically consistent: as the length of the training sequence increases to infinity, fixed points of the empirical version of the algorithm tend to fixed points of the population version. Learnability of the predictor class posited by the algorithm is also established.
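To make the alternation concrete, below is a minimal, hypothetical sketch of a k-vectors-style training loop. It replaces the hinge-type surrogate of the label step with a plain per-cell mean and stands in for the SVM-type vector update with scikit-learn's one-vs-rest LinearSVC; the function names, initialization, and iteration count are illustrative assumptions, not the authors' implementation.

```python
# Illustrative k-vectors-style alternation (hypothetical sketch, not the
# authors' exact procedure). Assumes k > 2 so LinearSVC.coef_ has one row
# per cell, and that every cell stays nonempty during training.
import numpy as np
from sklearn.svm import LinearSVC

def k_vectors_fit(X, y, k, n_iters=20, random_state=0):
    rng = np.random.default_rng(random_state)
    n, d = X.shape
    V = rng.standard_normal((k, d))   # the k partition vectors
    labels = np.zeros(k)              # one response label per cell
    for _ in range(n_iters):
        # Partitioning rule: assign each point to the cell whose vector
        # has the largest inner product with the feature vector.
        cells = np.argmax(X @ V.T, axis=1)
        # Label step: centroid-type rule in the response space (a cell
        # mean is used here purely for illustration; the paper optimizes
        # a hinge-type surrogate instead).
        for j in range(k):
            if np.any(cells == j):
                labels[j] = y[cells == j].mean()
        # Vector step: multiclass linear classification of the current
        # cell assignments; the learned weight vectors become the new
        # partition vectors.
        clf = LinearSVC(C=1.0).fit(X, cells)
        V = clf.coef_
    return V, labels

def k_vectors_predict(X, V, labels):
    # Predict the label attached to the inner-product-maximizing cell,
    # mirroring the partitioning rule used during training.
    return labels[np.argmax(X @ V.T, axis=1)]
```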
Details
ISBN:
(Print) 9783642212567; 9783642212574
In many applications, we deal with high-dimensional datasets containing different types of data. For instance, in text classification and information retrieval problems, we have large collections of documents. Each text is usually represented by a bag-of-words or similar representation, with a large number of features (terms). Many of these features may be irrelevant (or even detrimental) to the learning tasks. This excessive number of features also raises the memory cost of representing and processing these collections, clearly showing the need for adequate techniques for feature representation, reduction, and selection, to improve both classification accuracy and memory requirements. In this paper, we propose a combined unsupervised feature discretization and feature selection technique. Experimental results on standard datasets show the efficiency of the proposed techniques as well as improvements over previous similar techniques.
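As an illustration of the kind of preprocessing pipeline discussed above, the following sketch discretizes bag-of-words counts and then applies an unsupervised filter. The binning scheme (KBinsDiscretizer) and the variance-based filter are generic placeholders chosen for the example, not the specific combined technique proposed in the paper, and the tiny document list is made up for the demonstration.

```python
# Generic discretize-then-select pipeline for bag-of-words features
# (illustrative only; not the paper's method).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.feature_selection import VarianceThreshold

docs = [
    "the rocket engine ignites and the rocket lifts off",
    "the car engine stalls on the highway",
    "space probes send data back to earth",
    "new car models arrive at the dealership",
]

# Bag-of-words term counts (dense, for the discretizer).
X = CountVectorizer().fit_transform(docs).toarray()

# Unsupervised discretization: quantize each term count into a few bins.
X_disc = KBinsDiscretizer(n_bins=3, encode="ordinal",
                          strategy="uniform").fit_transform(X)

# Unsupervised selection: drop low-variance (near-constant) discretized features.
X_sel = VarianceThreshold(threshold=0.1).fit_transform(X_disc)
print(X.shape, "->", X_sel.shape)
```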