ISBN: 9781450315470 (print)
The proceedings contain 16 papers. The topics discussed include: CluChunk: clustering large scale user-generated content incorporating chunklet information; parallel rough set based knowledge acquisition using MapReduce from big data; delta-SimRank computing on MapReduce; incrementally optimized decision tree for noisy big data; a density-based clustering structure mining algorithm for data streams; subscriber classification within telecom networks utilizing big data technologies and machine learning; accelerating minor allele frequency computation with graphics processors; online feature selection for mining big data; accelerating Bayesian network parameter learning using Hadoop and MapReduce; stream-dashboard: a framework for mining, tracking and validating clusters in a data stream; and a kernel fused perceptron for the online classification of large-scale data.
ISBN: 9789898565143 (print)
Over the last decades, the amount of information on the web has increased drastically, but larger quantities of data do not per se add value for web visitors; there is a need for more efficient access to the required information and for adaptation to user preferences or needs. Using machine learning techniques to build user profiles makes it possible to take users' real preferences into account. In this work we present a preliminary system, based on the collaborative filtering approach, that identifies and generates interesting links for users while they navigate. The system uses only web navigation logs stored in any web server (according to the Common Log Format) and extracts information from them by combining unsupervised and supervised classification techniques with frequent pattern mining techniques. It also includes a generalization procedure in the data preprocessing phase, and we analyze its effect on the final performance of the whole system. We also analyze the effect of the cold start (0-day) problem on the proposed system. The experiments show that the proposed generalization option improves the results of the designed system, which performs efficiently w.r.t. a web-accessible database and is even able to deal with the cold start problem.
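As an illustration of the kind of input this abstract describes, the following is a minimal sketch of reading Common Log Format lines and grouping requested paths by client host. It is not the authors' preprocessing pipeline: the names `CLF_PATTERN`, `parse_clf_line` and `paths_by_host` are hypothetical, and keeping only 2xx responses is an assumption made for the example.

```python
import re
from collections import defaultdict

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_clf_line(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # The request field looks like 'GET /path HTTP/1.0'; keep the path only.
    parts = record["request"].split()
    record["path"] = parts[1] if len(parts) >= 2 else None
    return record


def paths_by_host(lines):
    """Group successfully served paths by client host, in log order."""
    visits = defaultdict(list)
    for line in lines:
        record = parse_clf_line(line)
        # Assumption: only 2xx responses count as real page visits.
        if record and record["path"] and record["status"].startswith("2"):
            visits[record["host"]].append(record["path"])
    return dict(visits)
```

Per-host path sequences of this kind are a plausible starting point for the clustering and frequent pattern mining steps the abstract mentions; a real system would also need session splitting, which is omitted here.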
Ontology matching approaches have mostly worked on the schema level so far. With the advent of Linked Open Data and the availability of a massive amount of instance information, instance-based approaches have become possible. This position paper discusses approaches and challenges for using those instances as input for machine learning algorithms, with a focus on rule learning algorithms, as a means for ontology matching.
ISBN: 9789898425980 (print)
The proceedings contain 147 papers. The topics discussed include: estimation of the common oscillation for phase locked matrix factorization; a general algorithm for calculating force histograms using vector data; on the crossover operator for GA-based optimizers in sequential projection pursuit; a dynamic wrapper method for feature discretization and selection; generative embeddings based on Rician mixtures - application to kernel-based discriminative classification of magnetic resonance images; clustering complex multimedia objects using an ensemble approach; the stepwise response refinement screener (SRRS) and its applications to analysis of factorial experiments; handling imprecise labels in feature selection with graph Laplacian; the interplay of statistical and structural pattern recognition from a machine learning perspective; and context sensitive information: model validation by information theory.
ISBN: 9781450315968 (print)
In this position paper, we present efficient and practical integrity verification techniques that check whether the untrusted cloud has returned the correct result of outsourced data analytics computations. We consider computations in summation form, which are used in a large class of machine learning and data mining problems. We discuss our verification techniques for both non-collusive and collusive malicious workers in MapReduce. Copyright 2012 ACM.
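To make "summation form" concrete, here is a minimal sketch of a statistic computed as a sum of per-partition partial sums, which is the shape that maps naturally onto MapReduce. It illustrates only the computation pattern, not the paper's verification techniques, which the abstract does not describe. The names `map_partition`, `combine` and `mean_and_variance` are hypothetical, and plain Python lists stand in for distributed data partitions.

```python
from functools import reduce


def map_partition(partition):
    """Mapper: partial sums (count, sum of x, sum of x^2) for one partition."""
    count = len(partition)
    total = sum(partition)
    total_sq = sum(x * x for x in partition)
    return (count, total, total_sq)


def combine(a, b):
    """Reducer: add two tuples of partial sums element-wise."""
    return tuple(u + v for u, v in zip(a, b))


def mean_and_variance(partitions):
    """Population mean and variance from the combined partial sums."""
    count, total, total_sq = reduce(
        combine, map(map_partition, partitions), (0, 0.0, 0.0)
    )
    if count == 0:
        raise ValueError("no data points")
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance
```

Because the final result depends on the data only through these sums, each worker's contribution is a small tuple, which is the property that makes per-worker results a natural unit to check.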
In this paper, a partially supervised machine learning approach is proposed for the recognition of emotional user states in HCI from bio-physiological data. To do so, an unsupervised learning preprocessing step is int...
Data clustering is one of the important data mining tasks. It is the process of grouping objects into clusters such that objects in the same cluster are more similar to each other than to objects in different cluste...
ISBN: 9781450315555 (print)
Recently, cross-domain learning has become one of the most important research directions in data mining and machine learning. In multi-domain learning, one problem is that classification patterns and data distributions differ among domains, so knowledge (e.g. a classification hyperplane) cannot be directly transferred from one domain to another. This paper proposes a framework that combines class-separate objectives (maximize separability among classes) and domain-merge objectives (minimize separability among domains) to achieve cross-domain representation learning. Three specific methods built on this framework, called DMCS CSF, DMCS FDA and DMCS PCDML, are given, and the experimental results validate their effectiveness. Copyright 2012 ACM.
ISBN: 9783642282577 (print)
The proceedings contain 17 papers. The topics discussed include: unlabeled data and multiple views; studying self- and active-training methods for multi-feature set emotion recognition; semi-supervised linear discriminant analysis using moment constraints; manifold-regularized minimax probability machine; supervised and unsupervised co-training of adaptive activation functions in neural nets; semi-unsupervised weighted maximum-likelihood estimation of joint densities for the co-training of adaptive activation functions; semi-supervised kernel clustering with sample-to-cluster weights; homeokinetic reinforcement learning; iterative refinement of HMM and HCRF for sequence classification; on the utility of partially labeled data for classification of microarray data; multi-instance methods for partially supervised image segmentation; and semi-supervised training set adaption to unknown countries for traffic sign classifiers.