ISBN: (print) 3642030696
The proceedings contain 61 papers. The topics discussed include improved comprehensibility and reliability of explanations via restricted halfspace discretization; selection of subsets of ordered features in machine learning; combination of vector quantization and visualization; discretization of target attributes for subgroup discovery; preserving privacy in time series data classification by discretization; sequential EM for unsupervised adaptive Gaussian mixture model based classifier; optimal double-kernel combination for classification; a linear classification method in a very high dimensional space using distributed representation; PMCRI: a parallel modular classification rule induction framework; dynamic score combination: a supervised and unsupervised score combination method; and ODDboost.
ISBN: (print) 9783319089799; 9783319089782
A binary decision diagram (BDD) is a compact and efficient representation of Boolean functions, with extensions available for sets and finite-valued functions. The key feature of the BDD is its ability to exploit the internal structure (not necessarily known upfront) of the object being modelled in order to provide a compact in-memory representation. In this paper we propose applying BDDs to machine learning as a tool for fast general pattern recognition. Multiple BDDs are used to capture sets of training samples (patterns) and to estimate the similarity of a given test sample with the memorized training sets. Then, given multiple similarity estimates, further analysis is done using an additional layer of BDDs or common machine learning techniques. We describe training algorithms for BDDs (supervised, unsupervised and combined) and an approach for constructing multi-layered networks combining BDDs with traditional artificial neurons, and present experimental results for handwritten digit recognition on the MNIST dataset.
ISBN: (print) 9783319419206; 9783319419190
The proliferation of low-power, low-cost continuous sensing technology is enabling new and innovative applications in wearables and the Internet of Things (IoT). At the same time, these new applications make it challenging to maintain real-time response on a resource-constrained device while maintaining acceptable performance. In this paper, we describe an IMU (Inertial Measurement Unit) sensor-based generalized hand gesture recognition system, its applications, and the challenges involved in implementing the algorithm on a resource-constrained device. We have implemented a simple algorithm for gesture spotting that substantially reduces false positives. The gesture recognition model was built using data collected from 52 unique subjects. The model was mapped onto the Intel(R) Quark(TM) SE Pattern Matching Engine and field-tested using 8 additional subjects, achieving 92% performance.
ISBN: (digital) 9783319419206
ISBN: (print) 9783319419206; 9783319419190
The need to handle the huge number of anonymous documents abounding on the Internet is motivating the study of new methods for authorship recognition. The principal weakness of the methods used in this area is that they assess the similarity of text styles without any regard to their surroundings. This paper proposes a novel mathematical model of the writing process that strives to quantify this dependency. A text is divided into a series of sequential sub-documents, which are represented via term histograms. The proximity of the histograms is estimated through a simple probability distance. To characterize the writing style of a text, a new feature is introduced: the mean distance between the current sub-document and a number of earlier ones. The empirical distribution of this feature over the whole document specifies the writing style. Thus, dissimilarity of such distributions indicates a difference in writing styles, and their coincidence implies that the styles are identical. Numerical experiments demonstrate the high potential of the proposed approach.
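The feature described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the chunk size, the number of earlier sub-documents (`lookback`), the choice of total variation distance as the "simple probability distance", and all function names are assumptions.

```python
from collections import Counter


def term_histogram(tokens):
    """Normalized term-frequency histogram of one sub-document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two histograms (assumed distance)."""
    terms = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in terms)


def style_feature(tokens, chunk_size=4, lookback=2):
    """Mean distance from each sub-document to its `lookback` predecessors."""
    # Split the text into sequential sub-documents of `chunk_size` tokens.
    chunks = [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]
    hists = [term_histogram(c) for c in chunks if c]
    values = []
    for i in range(lookback, len(hists)):
        earlier = hists[i - lookback:i]
        values.append(sum(total_variation(hists[i], h) for h in earlier) / lookback)
    return values
```

The list returned by `style_feature` is a sample of the feature over one document. Under this sketch, comparing two documents would mean comparing the empirical distributions of their samples; which comparison to use is left open here.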
ISBN: (digital) 9783319419206
ISBN: (print) 9783319419206; 9783319419190
This work takes place in the context of conversion rate optimization by enhancing the user experience during navigation on e-commerce web sites. The requirement is to be able to segment visitors into meaningful clusters, which can then be targeted with specific calls to action in order to increase the web site turnover. This paper presents an original approach that equally combines global- and local-alignment techniques (Needleman-Wunsch and Smith-Waterman) in order to automatically segment visitors according to the sequence of visited pages. Experimental results on synthetic datasets show that our approach outperforms other typically used alignment metrics, such as hybrid approaches or Dynamic Time Warping.
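To make the idea of combining global and local alignment concrete, the sketch below scores two page sequences with Needleman-Wunsch and Smith-Waterman and averages the two scores. The scoring parameters (match 1, mismatch -1, linear gap -1), the equal-weight average, and the function names are assumptions for illustration; the abstract does not state how the paper combines or normalizes the scores.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score between two page sequences."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score between two page sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment never lets a cell drop below zero.
            score = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def combined_similarity(a, b):
    """Equal-weight average of global and local scores (assumed combination)."""
    return 0.5 * needleman_wunsch(a, b) + 0.5 * smith_waterman(a, b)
```

A pairwise matrix of `combined_similarity` values over all visitor sessions could then be passed to any clustering method that accepts precomputed similarities.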
In this paper we address the problem of using bet selections of a large number of mostly non-expert users to improve sports betting tips. A similarity based approach is used to describe individual users' strategie...
ISBN: (print) 9783642030697
Traditional kernelised classification methods sometimes perform poorly because they use a single, fixed kernel, especially on some complicated data sets. In this paper, a novel optimal double-kernel combination (ODKC) method is proposed for complicated classification tasks. First, the data sets are mapped by two basic kernels into two different feature spaces; then three kinds of optimal composite kernels are constructed by integrating the information of the two feature spaces. Comparative experiments demonstrate the effectiveness of our methods.
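The abstract does not say which two basic kernels are used or how the three composite kernels are built. As a hedged illustration of the general idea only, the sketch below combines a linear kernel and an RBF kernel in two standard ways, a convex weighted sum and a product, both of which yield a valid (positive semi-definite) kernel. The kernel choices, the `weight` and `gamma` parameters, and the function names are assumptions, not the ODKC construction.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel with width parameter gamma."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def weighted_sum_kernel(x, y, weight=0.5):
    """Convex combination of the two base kernels."""
    return weight * linear_kernel(x, y) + (1.0 - weight) * rbf_kernel(x, y)


def product_kernel(x, y):
    """Pointwise product of the two base kernels."""
    return linear_kernel(x, y) * rbf_kernel(x, y)


def gram_matrix(kernel, points):
    """Kernel matrix over a list of points, as a list of lists."""
    return [[kernel(p, q) for q in points] for p in points]
```

In a setting like this, the mixing weight would typically be tuned on validation data, and the resulting Gram matrix handed to a kernel classifier that accepts precomputed kernels.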
ISBN: (print) 9783319419206; 9783319419190
The FP-Growth algorithm has been studied extensively in the field of frequent pattern mining. The algorithm offers the advantage of avoiding costly database scans in comparison with Apriori-based algorithms. However, since it still requires two database scans, it cannot be used on streaming data. Also, the algorithm is designed for static datasets, where the input transactions are fixed, and thus cannot be used for incremental or interactive mining. Existing incremental mining algorithms are not easily adaptable for on-the-fly, fast, and memory-efficient FP-tree mining. In this paper we propose a novel SPFP-tree (single pass frequent pattern tree) algorithm that scans the database only once and provides the same tree as FP-Growth. Our algorithm changes the tree structure dynamically to create a highly compact frequency-ordered tree on the fly. With the insertion of each new transaction, our algorithm dynamically maintains a tree identical to an FP-tree. Experimental results show the efficiency of the SPFP-tree algorithm in both incremental and interactive mining of frequent patterns.
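For context, the two-scan constraint mentioned above comes from how a standard FP-tree is built: item frequencies must be known before any transaction can be inserted in frequency order. The sketch below shows that baseline construction only, not the SPFP-tree algorithm. The header table used for mining is omitted, ties in frequency are broken alphabetically as an arbitrary choice, and the class and function names are hypothetical.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, and its children by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    # Scan 1: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    keep = {item for item, n in freq.items() if n >= min_support}
    root = Node(None)
    # Scan 2: insert each transaction with its frequent items sorted by
    # descending global frequency, so common prefixes share tree paths.
    for t in transactions:
        items = sorted((i for i in set(t) if i in keep),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root
```

Because the sort key in the second scan depends on counts gathered over the entire database, a newly arriving transaction can change the item order, which is why a single-pass variant has to restructure the tree as it goes.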
ISBN: (print) 9783642030697
This work presents an image analysis framework driven by emerging evidence and constrained by the semantics expressed in an ontology. Human perception, apart from visual stimulus and pattern recognition, also relies on general knowledge and application context for understanding visual content in conceptual terms. Our work is an attempt to imitate this behavior by devising an evidence-driven probabilistic inference framework using ontologies and Bayesian networks. Experiments conducted for two different image analysis tasks showed improved performance compared to the case where computer vision techniques act in isolation from any type of knowledge or context.
ISBN: (print) 9783319419206; 9783319419190
Multivariate georeferenced data have become omnipresent in many scientific fields and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations within the same cluster are similar while clusters differ from each other, in terms of a given concept of dissimilarity. In this work, we develop an agglomerative hierarchical clustering approach that takes into account the spatial dependency between observations. It relies on a dissimilarity matrix built from a non-parametric kernel estimator of the multivariate spatial dependence structure of the data. It integrates existing methods to find the optimal number of clusters. The capability of the proposed approach to provide spatially compact, connected and meaningful clusters is illustrated on the National Geochemical Survey of Australia data.
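The requirement that clusters be spatially contiguous can be illustrated with a simpler stand-in technique: contiguity-constrained agglomerative clustering with average linkage, in which only spatially adjacent clusters may merge. This is not the paper's method, which builds its dissimilarity matrix from a kernel estimator of spatial dependence. Here the dissimilarity matrix and the neighbor graph are simply given as inputs, and the function name and arguments are hypothetical.

```python
def constrained_agglomerative(dissim, neighbors, n_clusters):
    """Merge only spatially adjacent clusters, closest pair first.

    dissim: symmetric matrix of pairwise dissimilarities (list of lists).
    neighbors: dict mapping each location index to a set of adjacent
        indices (assumed symmetric).
    """
    clusters = {i: {i} for i in range(len(dissim))}
    adjacent = {i: set(neighbors[i]) for i in clusters}

    def linkage(a, b):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dissim[i][j] for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        candidates = [(linkage(a, b), a, b)
                      for a in clusters for b in adjacent[a] if a < b]
        if not candidates:  # remaining clusters are not spatially connected
            break
        _, a, b = min(candidates)
        clusters[a] |= clusters.pop(b)
        # Cluster a inherits b's neighbors; references to b now point to a.
        adjacent[a] = (adjacent[a] | adjacent.pop(b)) - {a, b}
        for c in adjacent:
            if b in adjacent[c]:
                adjacent[c].discard(b)
                if c != a:
                    adjacent[c].add(a)
    return sorted(sorted(members) for members in clusters.values())
```

With three locations on a line whose values are 0.0, 5.0 and 0.1, the two end points are the most similar pair but are not adjacent, so they cannot be merged directly; the constraint is what keeps every cluster connected.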