ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
Advances in wireless and mobile technology flood us with volumes of moving object data that preclude all means of manual data processing. The volume of data gathered from position sensors of mobile phones, PDAs, or vehicles defies human ability to analyze the stream of input data. On the other hand, vast amounts of gathered data hide interesting and valuable knowledge patterns describing the behavior of moving objects. Thus, new algorithms for mining moving object data are required to unearth this knowledge. An important function of a mobile object management system is the prediction of the unknown location of an object. In this paper we introduce a data mining approach to the problem of predicting the location of a moving object. We mine the database of moving object locations to discover frequent trajectories and movement rules. Then, we match the trajectory of a moving object with the database of movement rules to build a probabilistic model of object location. Experimental evaluation of the proposal reveals prediction accuracy close to 80%. Our original contribution includes the elaboration of the location prediction model, the design of an efficient mining algorithm, the introduction of movement rule matching strategies, and a thorough experimental evaluation of the proposed model.
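The abstract does not spell out the rule matching strategies, so the following is only a minimal sketch of the general idea, assuming a hypothetical rule format `(antecedent, consequent, confidence)` and a simple suffix match weighted by antecedent length. Neither the format nor the weighting is taken from the paper.

```python
from collections import defaultdict

def predict_next(trajectory, rules):
    """Return {cell: probability} for the next location of a moving object.

    rules: list of (antecedent, consequent, confidence), where antecedent is
    a tuple of cell ids. A rule matches when its antecedent is a suffix of the
    observed trajectory (an assumed matching strategy, for illustration only).
    """
    scores = defaultdict(float)
    for antecedent, consequent, confidence in rules:
        n = len(antecedent)
        if 0 < n <= len(trajectory) and tuple(trajectory[-n:]) == tuple(antecedent):
            # Assumed heuristic: longer matched antecedents carry more weight.
            scores[consequent] += confidence * n
    total = sum(scores.values())
    # Normalise the scores into a probability distribution over next cells.
    return {cell: s / total for cell, s in scores.items()} if total else {}

# Hypothetical movement rules over grid cells "A", "B", ...
rules = [
    (("A", "B"), "C", 0.8),
    (("B",), "D", 0.4),
    (("X",), "Y", 0.9),
]
prediction = predict_next(["A", "B"], rules)
```

Here the two matching rules compete, and the longer match dominates the resulting distribution; a trajectory that matches no rule yields an empty prediction.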
ISBN: 9781424409723 (print)
The triangle algorithm is widely used in star pattern recognition, but its recognition reliability decreases seriously in areas containing many stars with small angular separation. The character match algorithm solves this problem and needs only a small guide star database, but it cannot recognize a star image displaced by camera movement and rotation. To overcome this disadvantage, an effective star pattern recognition algorithm is proposed in this paper. While constructing the guide star database, the algorithm divides the entire celestial sphere into many square areas based on some bright stars; it then uses sub-areas selected from these square areas for star pattern recognition, according to the characteristics of the star image sensed by the star sensor. The simulation results show that the proposed algorithm not only inherits the advantages of the character match algorithm but is also strongly robust against star image displacement.
ISBN: 9781424409723 (print)
Subspace learning is crucial for feature extraction and dimensionality reduction, which play an important role in pattern recognition and machine learning. It is generally believed that many subspace learning algorithms can be considered linear cases of graph-based manifold learning with special edge weights. We develop a robust subspace learning method by designing reasonable edge weights that give rise to good generalization. The values of the edge weights reflect the distribution of the data in each class, and thus the resulting subspace may have a good generalization property. Experimental results on face recognition show the effectiveness of the proposed method.
ISBN: 9781424409723 (print)
Data dimensionality reduction (DDR) is an important preprocessing technique for data mining, pattern classification, and so on. DDR aims at obtaining a compact representation of the original data while removing unimportant or irrelevant data. In this paper we propose a new measure for determining the importance level of the attributes, based on the trained support vector regression (SVR) model and its derivative characteristics. Based on this new measure, a new approach to data dimensionality reduction is proposed. The performance of the new approach is demonstrated on several computing cases. The experimental results show that the proposed approach can improve efficiency and effectiveness significantly compared with other data dimensionality reduction approaches.
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
Data perturbation with random noise signals has been shown to be useful for data hiding in privacy-preserving data mining. Perturbation methods based on additive randomization allow accurate estimation of the Probability Density Function (PDF) via the Expectation-Maximization (EM) algorithm, but it has been shown that noise-filtering techniques can be used to reconstruct the original data in many cases, leading to security breaches. In this paper, we propose a generic PDF reconstruction algorithm that can be used on non-additive (and additive) randomization techniques for the purpose of privacy-preserving data mining. This two-step reconstruction algorithm is based on Parzen-Window reconstruction and Quadratic Programming over a convex set, the probability simplex. Our algorithm eliminates the usual need for the iterative EM algorithm and is generic for most randomization models. The simplicity of our two-step reconstruction algorithm, without iteration, also makes it attractive for use when dealing with large datasets.
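The abstract gives no details of the quadratic program itself. As a loosely related illustration only, the sketch below shows the standard sort-based Euclidean projection onto the probability simplex, which returns the closest non-negative vector that sums to one. It is a swapped-in, well-known building block for working over that constraint set, not the paper's reconstruction algorithm, and the function name is hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of a real vector onto the probability simplex.

    Returns the closest vector (in the least-squares sense) whose entries
    are non-negative and sum to 1, using the standard sort-based method.
    """
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        candidate = (cumulative - 1.0) / i
        # The condition holds for a prefix of the sorted entries;
        # the last index where it holds determines the shift theta.
        if ui - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]

# A rough, unnormalised histogram estimate with one negative bin (made-up values).
estimate = project_to_simplex([0.2, 0.9, -0.3])
```

Applied to a noisy density estimate over bins, such a projection restores a valid probability vector while moving the estimate as little as possible.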
ISBN: 9780769530697 (print)
A practical problem in data mining and machine learning is the limited availability of data. For example, in a binary classification problem it is often the case that examples of one class are abundant, while examples of the other class are in short supply. Examples from one class, typically the positive class, can be limited due to the financial cost or time required to collect these examples. This work presents a comprehensive empirical study of learning when examples from one class are extremely rare, but examples of the other class(es) are plentiful. Specifically, we address the issue of how many examples from the abundant class should be used when training a classifier on data where one class is very rare. Nearly one million classifiers were built and evaluated to generate the results presented in this work. Our results demonstrate that the often-used 'even distribution' is not optimal when dealing with such rare events.
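The question of how many abundant-class examples to use can be explored by undersampling that class at several ratios and training one classifier per ratio. The helper below is a minimal sketch of the sampling step only; the function name, the ratios, and the toy data are assumptions, and the study's actual protocol is not described in the abstract.

```python
import random

def undersample(majority, minority, ratio, seed=0):
    """Keep every minority example and about ratio * len(minority) majority ones.

    ratio=1 gives the 'even distribution'; larger ratios keep more of the
    abundant class. Sampling is without replacement and seeded for repeatability.
    """
    rng = random.Random(seed)
    k = min(len(majority), int(round(ratio * len(minority))))
    return rng.sample(majority, k), list(minority)

# Toy data: 1000 abundant-class examples and 10 rare-class examples.
majority = list(range(1000))
minority = list(range(10))

# One training set per candidate class ratio (values chosen for illustration).
training_sets = {r: undersample(majority, minority, r) for r in (1, 5, 20)}
```

Each entry of `training_sets` would then be passed to the learner of choice and evaluated on a held-out set that keeps the natural class distribution.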
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
Outlier detection has recently become an important problem in many industrial and financial applications. In this paper, a novel unsupervised algorithm for outlier detection with a solid statistical foundation is proposed. First we modify a nonparametric density estimate with a variable kernel to yield a robust local density estimation. Outliers are then detected by comparing the local density of each point to the local density of its neighbors. Our experiments performed on several simulated data sets have demonstrated that the proposed approach can outperform two widely used outlier detection algorithms (LOF and LOCI).
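The comparison of a point's local density with that of its neighbours can be sketched as follows. This is a simplified stand-in, assuming a Gaussian kernel whose bandwidth for each neighbour is that neighbour's k-th nearest-neighbour distance; the paper's modified estimator is not given in the abstract. The function names and the small constant `eps` are hypothetical, and the points are assumed to be distinct.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]

def local_density(points, i, k):
    """Variable-bandwidth Gaussian kernel density at points[i] over its k neighbours."""
    dim = len(points[i])
    total = 0.0
    for j in knn(points, i, k):
        # Bandwidth of neighbour j: distance to its own k-th nearest neighbour.
        h = math.dist(points[j], points[knn(points, j, k)[-1]])
        d = math.dist(points[i], points[j])
        total += math.exp(-0.5 * (d / h) ** 2) / ((math.sqrt(2 * math.pi) * h) ** dim)
    return total / k

def outlier_scores(points, k=3, eps=1e-12):
    """Ratio of mean neighbour density to own density; large values suggest outliers."""
    dens = [local_density(points, i, k) for i in range(len(points))]
    return [
        sum(dens[j] for j in knn(points, i, k)) / k / (dens[i] + eps)
        for i in range(len(points))
    ]

# A 3x3 grid of inliers plus one distant point (index 9).
points = [(float(x), float(y)) for x in range(3) for y in range(3)] + [(10.0, 10.0)]
scores = outlier_scores(points)
```

Points inside the grid get comparable scores, while the isolated point receives a far larger one because its own density is tiny compared with that of its neighbours.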
ISBN: 9781424409723 (print)
Data mining in CRM aims at learning useful knowledge about customer relationships, by machine learning or statistical methods, in order to guide strategic behavior and so obtain the most profit. In recent years, support vector machines (SVMs) have been proposed as a powerful tool in machine learning and data mining. This paper applies SVMs to solve a practical CRM problem in a company. The final results show the good general performance of SVMs on the CRM problem.
ISBN: 9781424409723 (print)
In a traditional flatness neural network, the topological configuration needs to be rebuilt whenever the width of the cold strip changes. As a result, a large learning workload, slow convergence, and local minima in the network are observed. Moreover, structuring the traditional neural network according to experience has been shown to make the model time-consuming and complex. In this paper, a new approach to flatness pattern recognition is proposed based on the CMAC neural network. The differences of fuzzy distances between samples and the basic patterns are introduced as the inputs of the CMAC network. At the same time, a momentum term is introduced to update the weights of this neural network. The new approach, with advantages such as fast learning speed, good generalization, and ease of implementation, is efficient and intelligent. The simulation results show that the speed and accuracy of the flatness pattern recognition model are clearly improved.
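To illustrate how a momentum term can enter a CMAC weight update, here is a minimal one-dimensional sketch. The class name, tiling layout, learning rate, and momentum value are all assumptions; the scalar input stands in for the fuzzy-distance features, whose exact form the abstract does not give.

```python
class TinyCMAC:
    """One-dimensional CMAC with overlapping tilings and a momentum term."""

    def __init__(self, n_tilings=4, n_cells=10, lo=0.0, hi=1.0, lr=0.2, momentum=0.5):
        self.n_tilings, self.n_cells = n_tilings, n_cells
        self.lo, self.hi = lo, hi
        self.lr, self.momentum = lr, momentum
        # One weight table (and one table of previous updates) per tiling.
        self.w = [[0.0] * (n_cells + 1) for _ in range(n_tilings)]
        self.prev = [[0.0] * (n_cells + 1) for _ in range(n_tilings)]

    def _active(self, x):
        """Index of the active cell in each tiling; tilings are shifted by a cell fraction."""
        scaled = (x - self.lo) / (self.hi - self.lo) * self.n_cells
        return [min(self.n_cells, int(scaled + t / self.n_tilings))
                for t in range(self.n_tilings)]

    def predict(self, x):
        """Network output: the sum of the active weights."""
        return sum(self.w[t][c] for t, c in enumerate(self._active(x)))

    def train(self, x, target):
        """Delta-rule update shared across active cells, plus momentum on each cell."""
        error = target - self.predict(x)
        for t, c in enumerate(self._active(x)):
            delta = self.lr * error / self.n_tilings + self.momentum * self.prev[t][c]
            self.w[t][c] += delta
            self.prev[t][c] = delta

net = TinyCMAC()
for _ in range(50):
    net.train(0.5, 1.0)
```

Because nearby inputs activate the same cells, the value learned at 0.5 carries over to close inputs such as 0.51, while distant inputs remain unaffected. This local generalization is the property usually credited for CMAC's fast learning.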
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
We present a method, called equivalence learning, which applies a two-class classification approach to object pairs defined within a multi-class scenario. The underlying idea is that instead of classifying objects into their respective classes, we classify object pairs either as equivalent (belonging to the same class) or non-equivalent (belonging to different classes). The method is based on a vectorisation of the similarity between the objects and the application of a machine learning algorithm (SVM, ANN, LogReg, Random Forests) to learn the differences between equivalent and non-equivalent object pairs, and to define a unique kernel function that can be obtained via equivalence learning. Using a small dataset of archaeal, bacterial and eukaryotic 3-phosphoglycerate-kinase sequences, we found that the classification performance of equivalence learning slightly exceeds that of several simple machine learning algorithms, at the price of a minimal increase in time and space requirements.
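The step from a multi-class dataset to a two-class dataset over object pairs can be sketched as below. The element-wise absolute difference is an assumed vectorisation chosen for illustration; the abstract does not state which similarity vectorisation the authors use, and the function name is hypothetical.

```python
from itertools import combinations

def pair_dataset(objects, labels):
    """Build a two-class dataset over object pairs from a multi-class dataset.

    Each unordered pair becomes one example: its feature vector describes how
    the two objects differ, and its label is 1 for equivalent (same class)
    and 0 for non-equivalent (different classes).
    """
    X, y = [], []
    for i, j in combinations(range(len(objects)), 2):
        # Assumed vectorisation: element-wise absolute difference of features.
        X.append([abs(a - b) for a, b in zip(objects[i], objects[j])])
        y.append(1 if labels[i] == labels[j] else 0)
    return X, y

# Toy feature vectors with made-up class labels.
objects = [[0.0, 1.0], [0.5, 1.0], [3.0, 4.0]]
labels = ["a", "a", "b"]
X, y = pair_dataset(objects, labels)
```

The resulting `X` and `y` can be fed to any binary classifier. Note that the number of pairs grows quadratically with the number of objects, which is one source of the added time and space cost.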