ISBN: (Print) 9781450304931
With the popularity of various social media applications, massive numbers of social images associated with high-quality tags have become available on many social media websites. Mining social images on the web has become an emerging and important research topic in web search and data mining. In this paper, we propose a machine learning framework for mining social images and investigate its application to automated image tagging. To effectively discover knowledge from social images that are often associated with multimodal content (including visual images and textual tags), we propose a novel Unified Distance Metric Learning (UDML) scheme, which not only exploits both the visual and textual content of social images, but also unifies inductive and transductive metric learning techniques in a systematic learning framework. We further develop an efficient stochastic gradient descent algorithm for solving the UDML optimization task and prove the convergence of the algorithm. By applying the proposed technique to the automated image tagging task in our experiments, we demonstrate that the technique is empirically effective and promising for mining social images in real applications. Copyright 2011 ACM.
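To make the idea of learning a distance metric by stochastic gradient descent concrete, here is a minimal generic sketch. It is not the UDML objective from the paper, which the abstract does not spell out. It assumes a diagonal metric, a triplet hinge loss, and hypothetical names (`sq_dist`, `learn_diagonal_metric`) and toy data chosen only for illustration.

```python
import random


def sq_dist(w, a, b):
    """Weighted squared distance: sum_d w[d] * (a[d] - b[d])**2."""
    return sum(wd * (x - y) ** 2 for wd, x, y in zip(w, a, b))


def learn_diagonal_metric(triplets, dim, lr=0.05, margin=1.0, epochs=50, seed=0):
    """Learn non-negative per-dimension weights from (anchor, similar, dissimilar) triplets.

    Assumed loss per triplet: max(0, margin + d(anchor, similar) - d(anchor, dissimilar)).
    """
    rng = random.Random(seed)
    w = [1.0] * dim  # start from the plain Euclidean metric
    triplets = list(triplets)
    for _ in range(epochs):
        rng.shuffle(triplets)  # stochastic: visit one triplet at a time in random order
        for a, s, n in triplets:
            violated = margin + sq_dist(w, a, s) - sq_dist(w, a, n) > 0
            if not violated:
                continue  # hinge loss is zero, so the gradient is zero
            for d in range(dim):
                grad = (a[d] - s[d]) ** 2 - (a[d] - n[d]) ** 2
                # Gradient step followed by projection onto w[d] >= 0
                w[d] = max(0.0, w[d] - lr * grad)
    return w


# Toy data (hypothetical): dimension 0 separates dissimilar items, dimension 1 is noise.
toy_triplets = [
    ((0.0, 0.0), (0.1, 2.0), (1.0, 0.1)),
    ((1.0, 1.0), (1.2, -1.0), (0.0, 1.1)),
]
weights = learn_diagonal_metric(toy_triplets, dim=2)
print("learned weights:", weights)
```

On this toy data the learned weights should emphasize the informative dimension and shrink the noisy one. A full matrix metric would also need a projection onto the positive semidefinite cone, which this sketch avoids by keeping the metric diagonal.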
The Support Vector Machine (SVM) demonstrates many unique advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems, and can promote to the application of the use o...
ISBN: (Print) 9783642226052
Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. Grid computing has emerged as an important new field of distributed computing that can support distributed knowledge discovery applications. In this paper, we propose a method to perform data mining on Grids. The Grid was set up using Foster and Kesselman's Globus Toolkit, which is the most widely used middleware in scientific and data-intensive grid applications. For the development of data mining applications on grids we used Weka4WS, an open source framework that extends the Weka toolkit for distributed data mining on the Grid and deploys many of the machine learning algorithms provided by Weka. To evaluate the efficiency of the proposed system, we carried out a performance analysis of Weka4WS by executing distributed data mining tasks, namely clustering and classification, in a grid scenario. Finally, we study the speedup obtained by performing data mining on grids.
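The speedup study mentioned at the end of this abstract can be illustrated with the standard definitions: speedup is the single-node run time divided by the run time on p nodes, and efficiency is speedup divided by p. The sketch below assumes these textbook definitions. The function names and timings are hypothetical placeholders, not measurements from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup = run time on one node / run time on p nodes."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, nodes: int) -> float:
    """Efficiency = speedup / number of nodes (1.0 means ideal scaling)."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_serial, t_parallel) / nodes


# Hypothetical wall-clock timings in seconds, keyed by number of grid nodes.
timings = {1: 120.0, 2: 66.0, 4: 40.0}
base = timings[1]
for nodes, t in timings.items():
    print(f"{nodes} nodes: speedup {speedup(base, t):.2f}, "
          f"efficiency {efficiency(base, t, nodes):.2f}")
```

Efficiency below 1.0 typically reflects communication and scheduling overhead, which is what a speedup study on a grid would be expected to expose.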
ISBN: (Print) 9783642202667
Incremental learning has recently received broad attention in many applications of pattern recognition and data mining. Because many real-world incremental learning situations require a fast response to changing data, a parallel implementation (on fast processing units) would have a great impact on many applications. Current research on incremental learning methods employs a modified version of a resource allocating network (RAN), which is one variation of a radial basis function network (RBFN). This paper evaluates the impact of a Graphics Processing Unit (GPU) based implementation of a RAN incorporating Long Term Memory (LTM) [4]. The incremental learning algorithm is compared with the batch RBF approach in terms of accuracy and computational cost, in both sequential and GPU implementations. The UCI machine learning benchmark datasets and a real-world problem of multimedia forgery detection were considered in the experiments. The preliminary evaluation shows that although model creation is faster with the RBF algorithm, RAN-LTM can be useful in environments that require fast-changing models and involve high-dimensional data.
The problem of job stress is generally recognized as one of the major factors leading to a spectrum of health problems. People with certain professions, like intensive care specialists or call-center operators, and pe...
Providing methods to support semantic interaction with growing volumes of video data is an increasingly important challenge for data mining. To this end, there has been some success in recognition of simple objects an...
Mining web traffic data has been addressed in the literature mostly using sequential pattern mining techniques. Recently, a more powerful pattern called partial order was introduced, with the hope of providing a more comp...
ISBN: (Print) 9783642245701
Affective computing (AC) is a unique discipline which includes modeling affect using one or multiple modalities by drawing on techniques from many different fields. AC often deals with problems that are known to be very complex and multi-dimensional, involving different kinds of data (numeric, symbolic, visual, etc.). However, with the advancement of machine learning techniques, many of those problems are now becoming more tractable.
ISBN: (Print) 9783642217869
In this paper, we propose a nearest-neighbor-based outlier detection algorithm, NDoT. We introduce a parameter termed the Nearest Neighbor Factor (NNF) to measure the degree of outlierness of a point with respect to its neighborhood. Unlike previous outlier detection methods, NDoT works by a voting mechanism, which yields a binary decision for each point, in contrast to the top-N style of algorithms. We evaluate our method experimentally and compare the results of NDoT with a classical outlier detection method, LOF, and a recently proposed method, LDOF. Experimental results demonstrate that NDoT outperforms LDOF and is comparable with LOF.
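The voting idea described in this abstract can be sketched as follows. The abstract does not give the definition of NNF, the threshold, or the vote rule, so everything here is an assumption made for illustration: the factor is taken as the distance from a point to a neighbor divided by that neighbor's average k-nearest-neighbor distance, and a point is flagged when at least a fraction `vote_fraction` of its neighbors vote for it. All function and parameter names are hypothetical.

```python
import math


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    dists = sorted(
        (math.dist(points[i], points[j]), j) for j in range(len(points)) if j != i
    )
    return [j for _, j in dists[:k]]


def avg_knn_dist(points, i, k):
    """Average distance from points[i] to its k nearest neighbors."""
    return sum(math.dist(points[i], points[j]) for j in knn(points, i, k)) / k


def detect_outliers(points, k=3, delta=1.5, vote_fraction=2 / 3):
    """Return indices of points flagged as outliers by neighbor voting.

    Assumed rule: neighbor p votes against q when
    dist(q, p) / avg_knn_dist(p) exceeds delta.
    """
    outliers = []
    for q in range(len(points)):
        neighbors = knn(points, q, k)
        votes = 0
        for p in neighbors:
            scale = avg_knn_dist(points, p, k)
            factor = math.dist(points[q], points[p]) / scale if scale > 0 else math.inf
            if factor > delta:
                votes += 1
        # Binary decision: flagged or not, rather than a ranked top-N list
        if votes >= vote_fraction * len(neighbors):
            outliers.append(q)
    return outliers


# Toy data (hypothetical): a tight cluster plus one distant point at index 5.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
print(detect_outliers(sample))
```

Because each point receives a yes/no decision, no ranking cutoff N has to be chosen, which is the contrast with top-N methods that the abstract draws.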
ISBN: (Print) 9789038625379
Over the past several years, several extensions to Bayesian knowledge tracing have been proposed in order to improve predictions of students' in-tutor and post-test performance. One such extension is Contextual Guess and Slip, which incorporates machine-learned models of students' guess and slip behaviors in order to enhance the overall model's predictive performance [Baker et al. 2008a]. Similar machine learning approaches have been introduced in order to detect specific problem-solving steps during which students most likely learned particular skills [Baker, Goldstein, and Heffernan in press]. However, one important class of features that has not been considered in the machine learning models used in these two techniques is metrics of item and skill difficulty, a key type of feature in other assessment frameworks [e.g., Hambleton, Swaminathan, & Rogers, 1991; Pavlik, Cen, & Koedinger 2009]. In this paper, a set of engineered features that quantify skill difficulty and related skill-level constructs is investigated in terms of its ability to improve models of guessing, slipping, and detecting moment-by-moment learning. Supervised machine learning models that have been trained using the new skill-difficulty features are compared to models from the original contextual guess and slip and moment-by-moment learning detector work. This includes performance comparisons for predicting students' in-tutor responses, as well as post-test responses, for a pair of Cognitive Tutor data sets.
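For readers unfamiliar with Bayesian knowledge tracing, the standard four-parameter update that these extensions build on can be sketched as below. This shows only the classic model with fixed guess and slip probabilities. In Contextual Guess and Slip, as the abstract describes, those two values would instead come from machine-learned models. The class name, function names, and parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BKTParams:
    p_init: float   # P(L0): probability the skill is known before practice
    p_learn: float  # P(T): probability of learning the skill at each opportunity
    p_guess: float  # P(G): probability of a correct answer when the skill is unknown
    p_slip: float   # P(S): probability of an incorrect answer when the skill is known


def predict_correct(p_known: float, params: BKTParams) -> float:
    """Probability that the next response is correct."""
    return p_known * (1 - params.p_slip) + (1 - p_known) * params.p_guess


def update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Bayesian update of P(known) after one response, then apply learning."""
    if correct:
        num = p_known * (1 - params.p_slip)
        den = num + (1 - p_known) * params.p_guess
    else:
        num = p_known * params.p_slip
        den = num + (1 - p_known) * (1 - params.p_guess)
    posterior = num / den
    # The student may also learn the skill at this opportunity
    return posterior + (1 - posterior) * params.p_learn


# Hypothetical parameter values and response sequence, for illustration only.
params = BKTParams(p_init=0.5, p_learn=0.1, p_guess=0.2, p_slip=0.1)
p = params.p_init
for response in [True, True, False, True]:
    print(f"P(correct) = {predict_correct(p, params):.3f}, observed = {response}")
    p = update(p, response, params)
print(f"final P(known) = {p:.3f}")
```

A contextual variant would pass per-step guess and slip estimates into `update` instead of reading constants from `params`.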