ISBN: 9781457709661 (print)
In pattern recognition, principal component analysis (PCA) is one of the best-known feature extraction methods for reducing the dimensionality of high-dimensional datasets. Simple-PCA (SPCA), a faster version of PCA, is carried out effectively through iterative learning. However, when the input data are distributed in a complex way, SPCA may not be efficient, because it is trained without the class information of the dataset. SPCA therefore cannot be said to be optimal for classification. In this paper, we propose a new learning algorithm that is trained with the class information of the dataset. The eigenvectors spanning the eigenspace of the dataset are obtained by calculating the variation of the data belonging to each class. We present the derivation of the proposed algorithm and report experiments on UCI datasets that compare it with SPCA.
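As a rough illustration of what an iteratively learned principal direction can look like, the sketch below uses a generic power-iteration-style update on centered data. It is not the SPCA update or the class-aware algorithm from the abstract, neither of which is specified here; the function name, the start vector, and the toy data are assumptions made for the example.

```python
import math

def first_principal_direction(data, iters=100):
    """Estimate the leading principal direction by repeated re-projection."""
    n, dim = len(data), len(data[0])
    # Center the data so the iteration tracks variance rather than the mean.
    mean = [sum(row[k] for row in data) / n for k in range(dim)]
    centered = [[row[k] - mean[k] for k in range(dim)] for row in data]

    # Assumption: the start vector is not orthogonal to the leading direction.
    a = [1.0] + [0.0] * (dim - 1)
    for _ in range(iters):
        new_a = [0.0] * dim
        for x in centered:
            # Project the sample onto the current estimate ...
            proj = sum(ai * xi for ai, xi in zip(a, x))
            # ... and accumulate the sample weighted by that projection.
            for k in range(dim):
                new_a[k] += proj * x[k]
        norm = math.sqrt(sum(v * v for v in new_a))
        if norm == 0.0:
            break  # degenerate case: no variance along the current estimate
        a = [v / norm for v in new_a]
    return a

# Toy data lying exactly on the line y = 2x (hypothetical example).
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction = first_principal_direction(points)
```

Because the update never looks at class labels, it shares the limitation the abstract attributes to SPCA: the direction of largest variance need not be the direction that best separates the classes.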
ISBN: 0819418455 (print)
If M given training patterns are not extremely similar, the analog N-vectors representing them are generally separable in N-space. A one-layered binary perceptron containing P neurons (P ≥ log2 M) is then generally sufficient for the pattern recognition task. The connection matrix between the (linear) input layer and the neuron layer can be calculated noniteratively. Real-time pattern recognition experiments implementing this theoretical result were reported at this and other national conferences last year. Those experiments demonstrated that the noniterative training is very fast (it can be done in real time) and that the recognition of untrained patterns is very robust and very accurate. The present paper concentrates on the theoretical foundation of this noniteratively trained perceptron. The theory starts from an N-dimensional Euclidean-geometry approach, from which an optimally robust learning scheme is derived. The robustness and speed of this optimal learning scheme are compared with those of conventional iterative learning schemes.
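The neuron count can be made concrete with a small counting sketch. It reads the condition in the abstract as P ≥ log2 M, which is an assumption about the garbled notation, and it only shows why that many binary outputs suffice to give each of M patterns a distinct label. It does not reproduce the noniterative computation of the connection matrix, which the abstract does not spell out; the function names are hypothetical.

```python
def min_neurons(num_patterns):
    """Smallest P with 2**P >= num_patterns (at least one neuron)."""
    # (M - 1).bit_length() equals ceil(log2(M)) for M >= 1, without float error.
    return max(1, (num_patterns - 1).bit_length())

def binary_codes(num_patterns):
    """Assign each pattern index a distinct P-bit target output vector."""
    p = min_neurons(num_patterns)
    return [[(index >> bit) & 1 for bit in range(p)]
            for index in range(num_patterns)]

# Example: 5 training patterns need 3 binary output neurons.
codes = binary_codes(5)
```

Each row of `codes` is the target output of the P neurons for one training pattern; a training scheme then has to find a connection matrix that maps every input vector to its row.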
ISBN: 9781605588407 (print)
Video shot segmentation is a solid foundation for automatic video content analysis, because most content-based video retrieval tasks require accurate segmentation of video boundaries. In recent years, using data mining and machine learning tools to detect shot boundaries has become increasingly popular. In this paper, we propose an effective video segmentation approach based on a dominant-set clustering algorithm. The algorithm not only determines the number of video shots automatically, but also obtains accurate shot boundaries with low computational complexity. Experimental results demonstrate the effectiveness of the proposed shot segmentation approach. Copyright 2009 ACM.
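Dominant-set clustering is commonly computed with replicator dynamics on a pairwise similarity matrix, and the sketch below extracts a single dominant set that way. It is a minimal illustration under stated assumptions: the similarity values are invented, the rows are assumed to stand for video frames, and nothing here is taken from the paper's actual features or boundary detection.

```python
def dominant_set(similarity, iters=200, cutoff=1e-6):
    """Return the indices supporting one dominant set of a similarity matrix.

    `similarity` is a symmetric, non-negative matrix with a zero diagonal.
    """
    n = len(similarity)
    x = [1.0 / n] * n  # start from the barycenter of the simplex
    for _ in range(iters):
        ax = [sum(similarity[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x[i] * ax[i] for i in range(n))
        if total == 0.0:
            break  # no similarity left among the weighted items
        # Replicator update: items more similar to the current group gain weight.
        x = [x[i] * ax[i] / total for i in range(n)]
    return [i for i in range(n) if x[i] > cutoff]

# Hypothetical frame similarities: frames 0-2 are alike, frames 3-4 are alike.
frames = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.5],
    [0.1, 0.1, 0.1, 0.5, 0.0],
]
first_shot = dominant_set(frames)
```

Removing the extracted items and repeating on the remainder until nothing is left yields the clusters one by one, which is how this family of methods can arrive at the number of clusters without it being fixed in advance.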
ISBN: 9781665490627 (digital); 9781665490627 (print)
Monitoring plankton populations in situ is fundamental to preserving the aquatic ecosystem. Plankton microorganisms are susceptible to minor environmental perturbations, which can be reflected in morphological and dynamical modifications. The availability of advanced automatic or semi-automatic acquisition systems has allowed the production of an increasingly large amount of plankton image data. The adoption of machine learning algorithms to classify such data may be hindered by the significant cost of manual annotation, due to both the huge quantity of acquired data and the large number of plankton species. To address these challenges, we propose an efficient unsupervised learning pipeline that provides accurate classification of plankton microorganisms. We build a set of image descriptors using a two-step procedure. First, a Variational Autoencoder (VAE) is trained on features extracted by a pre-trained neural network. We then use the learnt latent space as the image descriptor for clustering. We compare our method with state-of-the-art unsupervised approaches in which a set of pre-defined hand-crafted features is used to cluster plankton images. The proposed pipeline outperforms the benchmark algorithms on all the plankton datasets included in our analysis, providing better image embedding properties.
ISBN: 9780738131771 (print)
The proceedings contain 54 papers. The topics discussed include: an intelligent approach for food recipe rating prediction using machine learning; hardware implementation of IP-enabled wireless sensor network using 6LoWPAN; an efficient machine learning-based approach for android v.11 ransomware detection; robotics to enhance the teaching and learning process; an efficient pattern recognition based method for drug-drug interaction diagnosis; enhancing the prediction of MERS-CoV survivability using stacking-based method; a new solution to the brain state permanency for brain-based authentication methods; towards efficient detection and crowd management for law enforcing agencies; and AI support marketing: understanding the customer journey towards the business development.
The proceedings contain 9 papers. The topics discussed include: extending FrameNet to machine learning domain; PageRank on Wikipedia: towards general importance scores for entities; learning semantic rules for intelligent transport scheduling in hospitals; the linked data mining challenge 2016; not-so-linked solution to the linked data mining challenge 2016; a hybrid method for rating prediction using linked data features and text reviews; can you judge a music album by its cover?; finding and avoiding bugs in enterprise ontologies; and a two-fold quality assurance approach for dynamic knowledge bases: the 3cixty use case.
ISBN: 9781479909353; 9781467361293 (print)
In the absence of medical diagnostic evidence, it is difficult for an expert to state the grade of a disease with confidence. Generally, many tests are performed that involve clustering or classification of large-scale data. However, a large number of tests can complicate the main diagnostic process and make it difficult to obtain the end results. This difficulty can be resolved with the aid of machine learning techniques. This paper surveys the diagnosis of three diseases: heart disease, breast cancer, and diabetes are analyzed and examined in light of existing work. The survey reviews various existing approaches that have been used to diagnose these diseases with data mining techniques.
ISBN: 9789898565143 (print)
Over the last decades, the amount of information on the web has increased drastically, but larger quantities of data do not per se provide added value for web visitors; there is a need for more efficient access to the required information and for adaptation to user preferences or needs. Using machine learning techniques to build user profiles makes it possible to take users' real preferences into account. In this work we present a preliminary system, based on the collaborative filtering approach, that identifies and generates interesting links for users while they are navigating. The system uses only the web navigation logs stored in any web server (in the Common Log Format) and extracts information from them by combining unsupervised and supervised classification techniques with frequent pattern mining techniques. It also includes a generalization procedure in the data preprocessing phase, and in this work we analyze its effect on the final performance of the whole system. We also analyze the effect of the cold start (0-day) problem on the proposed system. The experiments show that the proposed generalization option improves the results of the designed system, which performs efficiently with respect to a web-accessible database and is even able to deal with the cold start problem.
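To make the input data concrete, the following sketch parses Common Log Format lines and groups the requested paths by client host, which is one plausible first preprocessing step before any profiling. The regular expression, the function names, and the grouping by host are assumptions for illustration; the abstract does not describe how the system actually preprocesses its logs.

```python
import re
from collections import defaultdict

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was returned.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # The request field normally looks like: METHOD PATH PROTOCOL
    parts = record["request"].split()
    record["path"] = parts[1] if len(parts) >= 2 else None
    return record

def paths_by_host(lines):
    """Collect the requested paths of each client host, in log order."""
    visits = defaultdict(list)
    for line in lines:
        record = parse_clf_line(line)
        if record is not None and record["path"] is not None:
            visits[record["host"]].append(record["path"])
    return dict(visits)

sample = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '127.0.0.1 - frank [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 512',
]
visits = paths_by_host(sample)
```

Grouping by host is only a crude stand-in for user identification, since several users can share an address and one user can appear under several.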
ISBN: 9783030974541 (digital); 9783030974541, 9783030974534 (print)
This paper provides an empirical study of feature learning based on induction. We encode image data into first-order expressions and compute their least generalization. An interesting question is whether the least generalization can extract a common pattern from the input data. We introduce three different methods for feature extraction based on symbolic manipulation. We perform experiments using the MNIST datasets and show that the proposed methods successfully capture features from training data and classify test data with around 90% accuracy. The results of this paper show the potential of induction and symbolic reasoning for feature learning and pattern recognition from raw data.
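The least generalization of two first-order terms can be computed by anti-unification, and a minimal sketch follows. The term encoding (tuples for compound terms, plain strings for constants, generated names X0, X1, ... for variables) is an assumption of this example, and the paper's image encoding and its three feature extraction methods are not reproduced.

```python
def least_generalization(s, t, table=None):
    """Anti-unify two terms.

    A compound term is a tuple (functor, arg1, ..., argN); a constant is a
    string. Generated variables are named X0, X1, ... (assumed not to clash
    with constant names).
    """
    if table is None:
        table = {}
    if s == t:
        return s  # identical subterms generalize to themselves
    same_shape = (
        isinstance(s, tuple) and isinstance(t, tuple)
        and s[0] == t[0] and len(s) == len(t)
    )
    if same_shape:
        # Same functor and arity: generalize argument by argument.
        args = tuple(least_generalization(a, b, table)
                     for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    # Differing subterms become a variable; the same pair of subterms always
    # maps to the same variable, which keeps the result least general.
    if (s, t) not in table:
        table[(s, t)] = "X%d" % len(table)
    return table[(s, t)]

# p(a, f(a)) and p(b, f(b)) generalize to p(X0, f(X0)).
pattern = least_generalization(("p", "a", ("f", "a")), ("p", "b", ("f", "b")))
```

Reusing one variable for a repeated pair of differing subterms is what lets the result record that both positions vary together, which is the kind of shared structure a common pattern in the input data would correspond to.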
The utilization of machine learning (ML) and pattern recognition algorithms, which is used for analysis of complex datasets to detect similarities, differences, or trends, which can be particularly useful in medical d...