Given a graph G and a subset of vertices S = {w1, ..., wt} ⊆ V(G), the multiset representation of a vertex u ∈ V(G) with respect to S is the multiset m(u|S) = {| dG(u, w1), ..., dG(u, wt) |}. A subset of vertice...
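As an illustration of the definition above, the following minimal sketch computes m(u|S) on a small graph. It assumes a connected, unweighted graph stored as an adjacency dictionary; the function names `bfs_distances` and `multiset_representation` are hypothetical and do not come from the paper.

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for neighbor in adj[v]:
            if neighbor not in dist:
                dist[neighbor] = dist[v] + 1
                queue.append(neighbor)
    return dist


def multiset_representation(adj, u, S):
    """Return m(u|S) as a Counter, i.e. a multiset of distances d_G(u, w)."""
    dist_from_u = bfs_distances(adj, u)
    # Assumes every w in S is reachable from u (connected graph).
    return Counter(dist_from_u[w] for w in S)


# Path graph 0 - 1 - 2 - 3 with S = {0, 3}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
S = [0, 3]
reps = {u: multiset_representation(path, u, S) for u in path}
```

Because a multiset ignores order, vertices 0 and 3 receive the same representation here, as do vertices 1 and 2, so this particular S does not distinguish all vertices of the path.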
Gait feature representation is an important step in gait biometric recognition. It represents the content of the individual image and the pattern of human walking simultaneously. Discriminative features can provide the m...
The evaluation of clustering algorithms is a field of pattern recognition still open to extensive debate. Most quality measures found in the literature have been conceived to evaluate non-overlapping clusterings, even...
Document images are a difficult case for standard stereo correspondence approaches. One major problem is that document images are highly self-similar. Most algorithms try to tackle this problem by incorporating a global optimization scheme, which tends to be computationally expensive. In this paper, we show that incorporating layout information into the matching paradigm, as a grouping entity for features, leads to better robustness and efficiency, and ultimately to a better 3D model of the captured document that can be used in various document restoration systems. This can be seen as a divide-and-conquer approach that partitions the search space into portions given by each grouping entity and then solves each of them independently. Text-lines are preferred over individual character blobs as the grouping entity because correspondences between them are easier to establish. Text-line extraction works reasonably well on stereo image pairs in the presence of perspective distortions. The proposed approach is highly efficient, and the matches obtained are more reliable. The claims are backed up by experimental evaluations that demonstrate their practical applicability.
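The divide-and-conquer idea can be sketched roughly as follows. This is not the authors' implementation: it assumes text-lines have already been extracted and paired across the two images (`line_pairs`), represents each feature as a plain descriptor tuple, and uses greedy nearest-descriptor matching inside each group. All names are hypothetical.

```python
def match_within_group(left_feats, right_feats):
    """Greedy nearest-descriptor matching restricted to one text-line pair."""
    matches = []
    used = set()
    for i, left_desc in enumerate(left_feats):
        best_j, best_d = None, float("inf")
        for j, right_desc in enumerate(right_feats):
            if j in used:
                continue
            # Squared Euclidean distance between descriptors
            d = sum((a - b) ** 2 for a, b in zip(left_desc, right_desc))
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches


def match_by_text_line(left_lines, right_lines, line_pairs):
    """Solve each text-line pair independently (the partitioned search space)."""
    return {
        (l_id, r_id): match_within_group(left_lines[l_id], right_lines[r_id])
        for l_id, r_id in line_pairs
    }


# Two text-lines containing identical (self-similar) descriptors.
left_lines = {0: [(1.0, 0.0), (0.0, 1.0)], 1: [(1.0, 0.0), (0.0, 1.0)]}
right_lines = {0: [(0.0, 1.0), (1.0, 0.0)], 1: [(1.0, 0.0), (0.0, 1.0)]}
matches = match_by_text_line(left_lines, right_lines, [(0, 0), (1, 1)])
```

Restricting candidates to the paired text-line keeps a feature in line 0 from being matched to its look-alike in line 1, and each group costs only the product of its own feature counts rather than that of the whole image.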
In this paper, a fast k nearest neighbors (k-NN) classifier for documents is presented. Documents are usually represented in a high-dimensional feature space, where their terms are treated as features and the weight o...
In this paper, we introduce an incremental nested partition algorithm for finding the inner structure of dynamic datasets. We use three partition criteria that allow us to obtain a hierarchy of clusterings. The algorithm is based on some mathematical properties, which are introduced in the paper. The experimental results on the AFP and TDT2 news collections show the usefulness of our method for revealing different levels of the information hidden in the datasets.
In this paper, we present a text evaluation system for students to improve Basque or Spanish writing skills. The system uses Natural Language Processing techniques to evaluate essays by detecting specific measures. Th...
This paper describes the clustering-based approach to Word Sense Disambiguation that is followed by the TKB-UO system at SemEval-2007. The underlying disambiguation method only uses WordNet as an external resource, and d...
ISBN: 9789549174373 (print)
In this paper we describe a morphological tagger for Spanish based on Cuban corpora. The tagger combines Hidden Markov Models with heuristics and dictionaries to provide the appropriate part-of-speech tag for each word in a text document, according to the context in which it appears. In addition, a morphological analyser that provides all possible morphological interpretations of each word is used. It allows us to reduce the set of candidate grammatical tags and to obtain not only the appropriate part-of-speech tag, but also its morphological information. The proposed tagger achieves 97.76% accuracy on a legal corpus.
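To make the combination of a Hidden Markov Model with a tag-restricting analyser concrete, here is a minimal Viterbi sketch. It is not the tagger described above: the tag set, all probabilities, the `allowed` dictionary standing in for the morphological analyser, and the smoothing constant are invented for illustration, and the input sentence is assumed to be non-empty.

```python
def viterbi(words, tags, start_p, trans_p, emit_p, allowed=None, unseen=1e-6):
    """Most probable tag sequence; `allowed` restricts candidate tags per word."""
    allowed = allowed or {}

    def candidates(word):
        # Words unknown to the (hypothetical) analyser may take any tag.
        return allowed.get(word, tags)

    # best[i][tag] = (probability of best path ending in tag, previous tag)
    best = [{}]
    for t in candidates(words[0]):
        best[0][t] = (start_p[t] * emit_p[t].get(words[0], unseen), None)

    for i in range(1, len(words)):
        best.append({})
        for t in candidates(words[i]):
            emit = emit_p[t].get(words[i], unseen)
            best[i][t] = max(
                (best[i - 1][p][0] * trans_p[p][t] * emit, p) for p in best[i - 1]
            )

    # Backtrack from the most probable final tag.
    last = max(best[-1], key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
emit_p = {
    "DET": {"el": 0.9},
    "NOUN": {"juez": 0.5, "dicta": 0.05},
    "VERB": {"dicta": 0.6},
}
# Stand-in for the analyser: the interpretations it would permit per word.
allowed = {"el": ["DET"], "dicta": ["NOUN", "VERB"]}

tagged = viterbi(["el", "juez", "dicta"], tags, start_p, trans_p, emit_p, allowed)
```

Limiting each word to the tags permitted by `allowed` shrinks the search at every position, which is one plausible reading of how the analyser "reduces possible grammatical tags" before the HMM chooses among them.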