We estimate the speed of texture change by measuring the spread of texture vectors in their feature space. This method allows us to robustly detect even very slow-moving objects. By learning a normal amount of texture...
In this paper we address confidentiality issues in distributed data clustering, particularly the inference problem. We present a measure of inference risk as a function of reconstruction precision and number of collud...
Sequential pattern mining is an important data mining problem with broad applications. It is also of particular interest in virtual environments. In this paper, we propose a projection-based, sequential pa...
ISBN: 0889865280 (print)
There has been much recent work on classifying interstitial lung disease from CT scans using texture analysis. The process generally involves one or more radiologists labelling regions of the lung parenchyma as representative of a particular condition. These regions fall broadly into two groups: those which the radiologists present, and it is supposed all others, would agree are highly representative of a particular pattern; and those which, although deemed to belong to the class in question, are not necessarily perfect examples and need not possess a homogeneous texture throughout. In short, the data can be clean or contain noise. There are circumstances in which information from only one of these categories is available, and it may be necessary to train a machine learning algorithm on it and then classify regions which belong to the other type or are of unknown origin. Here we evaluate the decrease in accuracy associated with such incomplete information using a number of common classifiers.
ISBN: 3540305068 (print)
In the typical nonparametric approach to classification in instance-based learning and data mining, random data (the training set of patterns) are collected and used to design a decision rule (classifier). One of the best known such rules is the k-nearest neighbor decision rule (also known as lazy learning), in which an unknown pattern is classified into the majority class among its k nearest neighbors in the training set. This rule gives low error rates when the training set is large. In practice, however, it is desirable to store as little of the training data as possible without sacrificing performance. It is well known that thinning (condensing) the training set with the Gabriel proximity graph is a viable partial solution to the problem. However, this raises the problem of efficiently computing the Gabriel graph of large training data sets in high-dimensional spaces. In this paper we report on a new approach to the instance-based learning problem. The new approach combines five tools: first, editing the data using Wilson-Gabriel editing to smooth the decision boundary; second, applying Gabriel thinning to the edited set; third, filtering this output with the ICF algorithm of Brighton and Mellish; fourth, using the Gabriel-neighbor decision rule to classify new incoming queries; and fifth, using a new data structure that allows the efficient computation of approximate Gabriel graphs in high-dimensional spaces. Extensive experiments suggest that our approach is the best on the market.
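To make the Gabriel thinning step in the abstract above concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual definition of the Gabriel graph (two points are neighbors when no third point lies strictly inside the ball whose diameter is the segment joining them) and one common form of thinning that keeps only points with a Gabriel neighbor of a different class. The function names are hypothetical, and the brute-force O(n^3) search stands in for the approximate data structure the abstract mentions, which is not described here.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(points):
    """Brute-force Gabriel graph (assumed definition, O(n^3) time).

    Points i and j are joined when every other point k satisfies
    d(i,k)^2 + d(j,k)^2 >= d(i,j)^2, i.e. k is not strictly inside
    the ball whose diameter is the segment from i to j.
    """
    n = len(points)
    edges = set()
    for i, j in combinations(range(n), 2):
        d_ij = sq_dist(points[i], points[j])
        if all(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) >= d_ij
            for k in range(n)
            if k != i and k != j
        ):
            edges.add((i, j))
    return edges


def gabriel_thin(points, labels):
    """Keep indices of points that have a Gabriel neighbor of another class.

    This is one common formulation of Gabriel thinning, assumed here for
    illustration; points surrounded only by same-class neighbors are dropped.
    """
    keep = set()
    for i, j in gabriel_edges(points):
        if labels[i] != labels[j]:
            keep.update((i, j))
    return sorted(keep)


# Hypothetical toy data: four collinear points, two per class.
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
labels = ["a", "a", "b", "b"]
kept = gabriel_thin(points, labels)
```

In this toy example only the two points adjacent to the class boundary survive thinning, which is the intended effect: interior points that do not shape the decision boundary are discarded.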
Mining distributed data for global knowledge has been receiving more attention recently. The problem is especially challenging when data sharing is prohibited due to local constraints like limited bandwidth and data privacy. ...
ISBN: 9780898715934 (print)
Classification is a major problem in machine learning. Many classifiers have been developed recently. However, the performance of these classifiers is proportional to the knowledge obtained from the training data. As a result, traditional classifiers cannot perform very well when the training data space is very limited. In this paper, we propose a new approach to expand the training data space (ETDS) using emerging patterns (EPs) [4] and genetic methods (GMs) [7]. EPs are those itemsets whose supports in one class are significantly higher than their supports in the other classes. GMs are evolutionary methods that incorporate computational techniques inspired by biology [8]. We combine the power of EPs and GMs to expand the training data space before applying standard classifiers. The expansion process is performed by generating more training instances using four techniques. An extensive experimental evaluation carried out on a number of datasets shows that our approach has a great impact on the performance of many traditional classifiers.
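The definition of emerging patterns in the abstract above can be illustrated with a short sketch. It assumes a common formalization in which "significantly higher" support is measured by a growth rate (support in the target class divided by support in the other class) compared against a threshold. The threshold value, the function names, and the toy transactions are all hypothetical and are not taken from the paper.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    if not transactions:
        return 0.0
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def growth_rate(itemset, target, other):
    """Support in the target class divided by support in the other class.

    Returns infinity when the itemset occurs only in the target class,
    and 0.0 when it occurs in neither class.
    """
    s_target = support(itemset, target)
    s_other = support(itemset, other)
    if s_other == 0:
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_other


def is_emerging(itemset, target, other, min_growth=2.0):
    """Assumed criterion: growth rate at or above a chosen threshold."""
    return growth_rate(itemset, target, other) >= min_growth


# Hypothetical toy transactions for two classes.
target = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
other = [{"a"}, {"c"}, {"b", "c"}, {"c"}]

rate_a = growth_rate({"a"}, target, other)
rate_ab = growth_rate({"a", "b"}, target, other)
c_is_ep = is_emerging({"c"}, target, other)
```

Under these assumptions, `{"a", "b"}` never occurs in the other class, so its growth rate is infinite, while `{"c"}` is more frequent in the other class and is not an emerging pattern for the target class.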
The task of extracting knowledge from text is an important research problem for information processing and document understanding. Approaches to capture the semantics of picture objects in documents constitute subject...
Steering an autonomous vehicle requires the permanent adaptation of behavior in relation to the various situations the vehicle is in. This paper describes research that implements such adaptation and optimization b...