ISBN:
(print) 0769521282
"Learning with side-information" is attracting more and more attention in machine learning. In this paper, we propose a general iterative framework for relevant linear feature extraction. It efficiently utilizes both the side-information and unlabeled data to gradually enhance an algorithm's performance and robustness. Both good relevant feature extraction and reasonable similarity matrix estimation can be realized. Specifically, we adopt relevant component analysis (RCA) under this framework and derive the iterative self-enhanced relevant component analysis (ISERCA) algorithm. The experimental results on several data sets show that ISERCA outperforms RCA.
In this paper, we are interested in extracting the sender's name from fax cover pages through a machine learning scheme. For this purpose, two analysis methods are implemented to work in parallel. The first one is based on image document analysis (OCR recognition, physical block selection), the other on text analysis (word feature extraction, local grammar rules). Our main contribution consists in introducing a neural network to find an optimal combination of the two approaches. Tests carried out on real fax images show that the neural network improves performance compared to an empirical combination function and to each method used separately.
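As an illustration only, not the paper's actual architecture, the learned-combination step can be sketched with a small neural network trained on two confidence scores. The score distributions below are invented stand-ins for the outputs of the image-based and text-based analyses:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 400
# Hypothetical data: for each candidate text field, a label saying whether
# it is the sender's name, plus a confidence score from each analysis.
is_sender = rng.integers(0, 2, n)
score_image = np.clip(0.6 * is_sender + rng.normal(0.2, 0.15, n), 0, 1)
score_text = np.clip(0.5 * is_sender + rng.normal(0.25, 0.15, n), 0, 1)
X = np.column_stack([score_image, score_text])

# A small neural network learns the combination instead of a hand-tuned rule.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X, is_sender)
print(round(net.score(X, is_sender), 2))
```

The point of learning the combiner is that it can discover non-linear trade-offs between the two scores that an empirical weighting function would miss.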
We propose an unsupervised approach to select representative face samples (models) from raw videos and build an appearance-based face recognition system. The approach is based on representing the face manifold in a low-dimensional space using the locally linear embedding (LLE) algorithm and then performing K-means clustering. We define the face models as the cluster centers. Our strategy is motivated by the efficiency of LLE in recovering meaningful low-dimensional structures hidden in complex, high-dimensional data such as face images. Two other well-known unsupervised learning algorithms (Isomap and SOM) are also considered. We compare and assess the efficiency of these different schemes on the CMU MoBo database, which contains 96 face sequences of 24 subjects. The results clearly show significant performance enhancements over traditional methods such as the PCA-based one.
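The LLE-plus-K-means model selection described above can be sketched as follows. This is a minimal illustration on synthetic manifold data (a swiss roll standing in for face frames), not the paper's pipeline:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import KMeans

# Stand-in for video frames: high-dimensional points on a curved manifold.
X, _ = make_swiss_roll(n_samples=300, random_state=0)

# Step 1: embed the manifold in a low-dimensional space with LLE.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
Y = lle.fit_transform(X)

# Step 2: K-means in the embedded space; cluster centers act as "models".
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(Y)

# Step 3: for each cluster, keep the original sample closest to its center
# as the representative model.
model_indices = [
    int(np.argmin(np.linalg.norm(Y - c, axis=1))) for c in km.cluster_centers_
]
models = X[model_indices]
print(models.shape)  # (5, 3): five representative samples
```

Clustering in the embedded space rather than the raw pixel space is the key design choice: distances along the manifold, not Euclidean distances between images, decide which frames are representative.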
Research on cooperative, adaptive intelligent systems involves studying, developing and evaluating architectures and methods to solve complex problems using adaptive and cooperative systems. These systems may range from simple software modules (such as a clustering or a classification algorithm) to physical systems (such as autonomous robots, machines or sensors). The main characteristic of these systems is that they are adaptive and cooperative. By adaptive, it is meant that the systems have a learning ability that makes them adjust their behaviour or performance to cope with changing situations. The systems are willing to cooperate to solve complex problems or to achieve common goals.

In pattern recognition, there are notable contributions on the use of multiple classifiers. The most dominant decomposition model used is an ensemble of classifiers (identical structures) that are trained differently. Most of the innovations are in the combining methods. There are weighting and voting approaches, probabilistic approaches, and approximate and fuzzy logic approaches. In the area of sensor fusion, there have been some interesting ideas for fusing the data and decisions of the sensors. However, most of these combining schemes are usually applied as a post-processing step.

In this work we are concerned with investigating architectures and methods of aggregating decisions in a multi-classifier or multi-agent environment. New architectures that allow active cooperation will be developed. The classifiers (or agents) must have some knowledge about the others in the system. Different forms of cooperation will be reported. In order for these architectures to allow for dynamic decision fusion, the aggregation procedures must have the flexibility to adapt to changes in the input and output and adjust to improve the final output. Changes will be learned by means of feature extraction. Application of these architectures to problems in classification of data, distributed data mining a...
ISBN:
(print) 0769519571
The proceedings contain 80 papers. The topics discussed include: intelligent target recognition based on hybrid support vector machine; a maximum likelihood function approach to direction estimation based on evolutionary algorithms; improved decomposition method for support vector machines; face recognition using self-organizing feature maps and support vector machines; support vector machine classifications for microarray expression data set; intelligent encoding of concepts in web document retrieval; model of web oriented intelligent tutoring system for distance education; web mining research; web mining based on multi-agents; study on knowledge processing techniques in air defense operation intelligent aid decision; face detection and facial feature extraction in color image; similarity measure learning for image retrieval using feature subspace analysis; and image retrieval based on multiple features using wavelet.
ISBN:
(print) 0769519504
We present a model-based method for accurate extraction of pedestrian silhouettes from video sequences. Our approach is based on two assumptions: 1) there is a common appearance to all pedestrians, and 2) each individual looks like him/herself over a short amount of time. These assumptions allow us to learn pedestrian models that encompass both the pedestrian population appearance and the individual appearance variations. Using our models, we are able to produce pedestrian silhouettes that have fewer noise pixels and missing parts. We apply our silhouette extraction approach to the NIST gait data set and show that under the gait recognition task, our model-based silhouettes result in much higher recognition rates than silhouettes directly extracted from background subtraction, or any non-model-based smoothing schemes.
ISBN:
(print) 3540408134
The proceedings contain 56 papers. The special focus in this conference is on Machine Learning, Probability and Topology. The topics include: pruning for monotone classification trees; regularized learning with flexible constraints; learning to answer emails; a semi-supervised method for learning the structure of robot environment interactions; using domain specific knowledge for automated modeling; resolving rule conflicts with double induction; a novel partial-memory learning algorithm based on grey relational structure; constructing hierarchical rule systems; text categorization using hybrid multiple model schemes; learning dynamic Bayesian networks from multivariate time series with changing dependencies; topology and intelligent data analysis; coherent conditional probability as a measure of information of the relevant conditioning events; very predictive n-grams for space-limited probabilistic models; learning linear classifiers sensitive to example dependent and noisy costs; an effective associative memory for pattern recognition; similarity based classification; numerical attributes in decision trees; similarity-based neural networks for applications in computational molecular biology; combining pairwise classifiers with stacking; and adapting association rule learning to subgroup discovery.
ISBN:
(print) 3540140409
The Branch & Bound (B&B) algorithm is a globally optimal feature selection method. The high computational complexity of this algorithm is a well-known problem. The B&B algorithm constructs a search tree and then searches for the optimal feature subset in the tree. Previous work on the B&B algorithm focused on simplifying the search tree in order to reduce the search complexity, and several improvements already exist. A detailed analysis of the basic B&B algorithm and the existing improvements is given under a common framework in which all the algorithms are compared. Based on this analysis, an improved B&B algorithm, BBPP+, is proposed. Experimental comparison shows that BBPP+ performs best.
ISBN:
(print) 9781581137378
Support vector machines (SVMs) have been promising methods for classification and regression analysis because of their solid mathematical foundations, which convey several salient properties that other methods hardly provide. However, despite the prominent properties of SVMs, they are not as favored for large-scale data mining as for pattern recognition or machine learning, because the training complexity of SVMs is highly dependent on the size of the data set. Many real-world data mining applications involve millions or billions of data records, where even multiple scans of the entire data are too expensive to perform. This paper presents a new method, Clustering-Based SVM (CB-SVM), which is specifically designed for handling very large data sets. CB-SVM applies a hierarchical micro-clustering algorithm that scans the entire data set only once to provide the SVM with high-quality samples that carry statistical summaries of the data, such that the summaries maximize the benefit of learning the SVM. CB-SVM tries to generate the best SVM boundary for very large data sets given a limited amount of resources. Our experiments on synthetic and real data sets show that CB-SVM is highly scalable for very large data sets while also generating high classification accuracy. Copyright 2003 ACM.
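The cluster-then-train idea behind CB-SVM can be sketched as follows. This uses flat per-class k-means as a crude stand-in for the paper's one-scan hierarchical micro-clustering, and a synthetic data set as a stand-in for a truly large one:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# A stand-in "large" data set; CB-SVM targets millions of records.
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)

# Step 1: summarize each class by cluster centroids, so the SVM sees a
# small set of points that carry the statistical shape of the data.
centers, labels = [], []
for cls in np.unique(y):
    km = MiniBatchKMeans(n_clusters=50, n_init=3, random_state=0)
    km.fit(X[y == cls])
    centers.append(km.cluster_centers_)
    labels.append(np.full(50, cls))
X_sum, y_sum = np.vstack(centers), np.concatenate(labels)

# Step 2: train the SVM on the 100 summaries instead of 5000 raw points.
svm = SVC(kernel="rbf").fit(X_sum, y_sum)
print(round(svm.score(X, y), 2))  # accuracy on the full data
```

The actual CB-SVM method goes further: it declusters only the summaries near the learned boundary and retrains, so resolution is spent where the decision surface needs it.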
Recent times have seen an explosive growth in the availability of various kinds of data. It has resulted in an unprecedented opportunity to develop automated data-driven techniques of extracting useful knowledge. data...