ISBN: 0769507506 (print)
An algorithm for data condensation using support vector machines (SVM) is presented. The algorithm extracts data points lying close to the class boundaries, which form a much reduced but critical set for classification. The problem of large memory requirements for training SVM in batch mode is circumvented by adopting an active incremental learning algorithm. The learning strategy is motivated by the condensed nearest neighbor classification technique. Experimental results show that such active incremental learning is superior to related methods in terms of computation time and condensation ratio.
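The condensed nearest neighbor idea that motivates the learning strategy can be sketched as follows. This is a minimal illustration of Hart-style condensation with a 1-NN rule, not the SVM-based algorithm of the paper; the function names `nearest_label` and `condense` and the data layout are assumptions made for the example.

```python
import math


def nearest_label(point, store):
    """Return the label of the stored example closest to `point` (1-NN rule)."""
    return min(store, key=lambda item: math.dist(point, item[0]))[1]


def condense(samples):
    """Keep only the examples that the current store misclassifies.

    `samples` is a list of (feature_tuple, label) pairs. Passes over the
    data are repeated until one full pass adds nothing to the store.
    """
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for point, label in samples:
            # Skip examples already stored so the loop always terminates.
            if (point, label) in store:
                continue
            if nearest_label(point, store) != label:
                store.append((point, label))
                changed = True
    return store


if __name__ == "__main__":
    data = [((float(x),), 0) for x in range(4)]
    data += [((float(x),), 1) for x in range(10, 14)]
    print(condense(data))
```

Examples far from the class boundary are classified correctly by the store early on and are never added, which is the sense in which the retained set is "reduced but critical".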
ISBN: 0769507506 (print)
In machine diagnostics it is difficult to collect training data covering all possible operating modes of a machine. Some of the operating modes are often missing. In these circumstances, it is important to know which modes (subclasses) are the most valuable for successful machine diagnosis. It is also of interest to investigate the usefulness of noise injection to cover the missing operating modes in the data. In this paper, we study the importance of selecting different operating modes of a water-pump and using them for learning in both 2-class and 4-class problems. We show that the operating modes representing different running speeds are more valuable than those representing machine loads. We also demonstrate that the 2-nearest neighbours directed noise injection is useful when filling in missing operating modes in the data.
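One possible reading of k-nearest-neighbours directed noise injection is sketched below: each synthetic point is obtained by moving a training point a random fraction of the way toward each of its k nearest neighbours. The abstract does not give the exact formula, so the step rule, the `scale` parameter and the function name `knn_directed_noise` are assumptions for illustration only. The points passed in are assumed to belong to a single class.

```python
import math
import random


def knn_directed_noise(points, k=2, copies=1, scale=1.0, rng=None):
    """Generate synthetic points directed toward each point's k nearest neighbours.

    `points` is a list of equal-length tuples from one class. For every
    point, `copies` new points are created by adding a random step
    (at most scale/k of the distance) toward each of its k neighbours.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    synthetic = []
    for i, x in enumerate(points):
        others = [p for j, p in enumerate(points) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(copies):
            new = list(x)
            for q in neighbours:
                step = scale * rng.random() / k  # random fraction toward q
                for d in range(len(new)):
                    new[d] += step * (q[d] - x[d])
            synthetic.append(tuple(new))
    return synthetic


if __name__ == "__main__":
    cloud = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
    for p in knn_directed_noise(cloud, k=2, copies=2):
        print(p)
```

With `scale` at most 1, every synthetic point is a convex combination of a training point and its neighbours, so the noise follows the local shape of the data instead of spreading equally in all directions.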
ISBN: 0769507506 (print)
Texture-based recognition for image segmentation and classification is very important in many domains, and different numerical features coming from a variety of approaches have been proposed. Texture segmentation using six features based on the fractal dimension has been used elsewhere. This paper studies properties of these features from the point of view of dimensionality reduction, mutual relation, differential relevance, discrete quantization, and classification ability. In an experimental framework, a set of statistical, soft computing, data mining and machine learning methods (multidimensional scaling, rough sets, factor analysis, cluster analysis and inductive classification) was applied to a set of different textures. It was found that fractal features do have texture recognition ability. Some of them are very relevant (the fractal dimension of smoothed versions of the original image and the multifractal dimension). Only a few quantization levels of the fractal dimension variables are required to achieve high recognition performance.
ISBN: 0769507506 (print)
Clustering very large databases is a challenge for traditional pattern recognition algorithms, e.g. the expectation-maximization (EM) algorithm for fitting mixture models, because of high memory and iteration requirements. Over large databases, the cost of the numerous scans required to converge and the large memory requirement of the algorithm become prohibitive. We present a decomposition of the EM algorithm requiring a small amount of memory by limiting iterations to small data subsets. The scalable EM approach requires at most one database scan and is based on identifying regions of the data that are discardable, regions that are compressible, and regions that must be maintained in memory. Data resolution is preserved to the extent possible based upon the size of the memory buffer and the fit of the current model to the data. Computational tests demonstrate that the scalable scheme outperforms similarly constrained EM approaches.
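The idea of a compressible region can be illustrated with sufficient statistics: a group of points is replaced by its count, sum and sum of squares, from which the mean and variance needed for Gaussian parameter updates can still be recovered. The sketch below is one-dimensional and the class name `SuffStats` is an assumption; it does not reproduce the paper's criteria for deciding which regions to compress or discard.

```python
from dataclasses import dataclass


@dataclass
class SuffStats:
    """Count, sum and sum of squares of a compressed group of 1-D points."""

    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, x):
        """Fold one data point into the summary; the point can then be dropped."""
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def merge(self, other):
        """Combine two summaries without revisiting the original points."""
        return SuffStats(self.n + other.n,
                         self.total + other.total,
                         self.total_sq + other.total_sq)

    @property
    def mean(self):
        return self.total / self.n

    @property
    def variance(self):
        """Population variance recovered from the summary alone."""
        return self.total_sq / self.n - self.mean ** 2


if __name__ == "__main__":
    block_a, block_b = SuffStats(), SuffStats()
    for x in (1.0, 2.0):
        block_a.add(x)
    for x in (4.0, 7.0):
        block_b.add(x)
    merged = block_a.merge(block_b)
    print(merged.mean, merged.variance)
```

Memory use is three numbers per compressed group regardless of how many points it absorbed, which is what makes a single database scan with a bounded buffer feasible.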
Data mining holds the promise of extracting unsuspected information from very large databases. One difficulty is that discovery techniques are often drawn from methods in which the amount of work increases geometrically with data quantity. Consequently, the use of these methods is problematic in very large databases. Categorically based association rules are a data mining methodology of linear complexity. Unfortunately, rules formed from categorical data are often numerous and fine grained. The concern is how fine-grained rules might be aggregated and what role non-categorical data might have. It appears that soft computing techniques may be useful.
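To make the linear cost concrete, the sketch below scores a candidate association rule by support and confidence, each computed in a single pass over the transactions. The function names and the toy transactions are assumptions for illustration; rule aggregation and soft computing techniques are not covered.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in `itemset` (one pass)."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transaction list."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    print(support(baskets, {"bread", "milk"}))
    print(confidence(baskets, {"bread"}, {"milk"}))
```

With many categorical values, many distinct itemsets pass a given support threshold, which is how large numbers of fine-grained rules arise.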
ISBN: 0769507506 (print)
We propose an approach that extracts patterns from a temporal signal sequence without prior knowledge about the lengths, positions and number of the patterns. Previous research (Hong et al., 1999) proposes a scheme for extracting recurrent patterns from a noise-free signal without temporal warping. To handle noise and nonlinear temporal warping, a threshold finite state machine (TFSM) is proposed to perform spatial-temporal data modeling. The TFSM is first roughly initialized. A variant of segmental K-means is used to train the TFSM. The training results give us both the patterns embedded in the signal sequence and the trained TFSM, which can be used to represent and detect the patterns.
ISBN: 0769507506 (print)
A method for modeling 3D shapes with a hierarchical structure is proposed. For representing 3D surface shapes, 4th order non-uniform rational B-spline functions with controllable knots are employed as the surface model. Consequently, a multiresolution representation technique based on the multiresolution wavelet transform can be implemented with the corresponding B-wavelets. The surface models at each resolution level are obtained by performing a decomposition algorithm, after the surface model at the highest level is estimated. In order to estimate the surface model as accurately as possible, a regularization problem is solved by an iterative algorithm. Through several experiments using real range-image data, the effectiveness of the proposed method was confirmed.
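For readers unfamiliar with the surface model, the sketch below evaluates a 4th order (cubic) non-uniform rational B-spline curve using the Cox-de Boor recursion. It is a curve, not a surface (a surface applies the same basis functions as a tensor product in two parameters), and it does not cover the wavelet decomposition or the regularized estimation. The function names, the knot vector and the control points are assumptions chosen for the example.

```python
def bspline_basis(i, degree, knots, t):
    """Cox-de Boor recursion for the i-th B-spline basis function at t."""
    if degree == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + degree] - knots[i]
    right_den = knots[i + degree + 1] - knots[i + 1]
    left = 0.0
    right = 0.0
    # Terms with a zero denominator (repeated knots) are defined as zero.
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, degree - 1, knots, t)
    if right_den != 0:
        right = ((knots[i + degree + 1] - t) / right_den
                 * bspline_basis(i + 1, degree - 1, knots, t))
    return left + right


def nurbs_point(control, weights, knots, t, degree=3):
    """Evaluate a rational B-spline curve at t (order 4 means degree 3).

    Valid for knots[degree] <= t < knots[-degree - 1], because the
    degree-0 basis functions use half-open intervals.
    """
    basis = [bspline_basis(i, degree, knots, t) for i in range(len(control))]
    denom = sum(b * w for b, w in zip(basis, weights))
    dims = len(control[0])
    return tuple(
        sum(b * w * p[d] for b, w, p in zip(basis, weights, control)) / denom
        for d in range(dims)
    )


if __name__ == "__main__":
    # Five control points need 5 + 3 + 1 = 9 knots; spacing is non-uniform.
    knots = [0.0, 0.0, 0.0, 0.0, 0.3, 1.0, 1.0, 1.0, 1.0]
    control = [(0.0, 0.0), (1.0, 2.0), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
    weights = [1.0, 2.0, 1.0, 2.0, 1.0]
    print(nurbs_point(control, weights, knots, 0.5))
```

Moving the interior knot (0.3 here) changes where the basis functions concentrate, which is the degree of freedom that "controllable knots" refers to.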
The goal of character recognition research is to simplify and automate the development of character recognition algorithms. We describe an approach based on applying preprocessing to data sets of Latin characters and ...
Feature subset selection refers to a data mining enhancement technique that aims to reduce the number of features to be used. This reduction is expected to improve the performance of the data mining algorithms in terms of speed, accuracy and simplicity. Although there has been some work on feature subset selection, no research has been found on the theoretical computational complexity of this problem or on the optimal selection of fuzzy-valued feature subsets. This paper focuses on a problem called Optimal Fuzzy-valued Feature Subset Selection (OFFSS), which is regarded as important but difficult in machine learning and pattern recognition. The quality of a set of features is measured by the overall overlapping degree between two classes of examples and by the size of the feature subset. The main contributions of this paper are: (1) the concept of a fuzzy extension matrix is introduced, (2) OFFSS is proved to be NP-hard, (3) a simple but powerful heuristic algorithm for OFFSS is given, and (4) the feasibility and simplicity of the proposed algorithm are demonstrated via applications of OFFSS to input selection of neuro-fuzzy systems and to fuzzy decision tree induction.