Accurate career path prediction can support many stakeholders, like job seekers, recruiters, HR, and project managers. However, publicly available data and tools for career path prediction are scarce. In this work, we...
We consider the task of finding frequent parallel episodes in parallel point processes (or event sequences), allowing for imprecise synchrony of the events constituting occurrences (temporal imprecision) as well as in...
Text rewriting with differential privacy (DP) provides concrete theoretical guarantees for protecting the privacy of individuals in textual documents. In practice, existing systems may lack the means to validate their...
The K2 metric is a well-known evaluation measure (or scoring function) for learning Bayesian networks from data [7]. It is derived by assuming uniform prior distributions on the values of an attribute for each possibl...
ISBN: 9781937284497 (print)
Our system combines text similarity measures with a textual entailment system. In the main task, we focused on the influence of lexicalized versus unlexicalized features, and how they affect performance on unseen questions and domains. We also participated in the pilot partial entailment task, where our system significantly outperforms a strong baseline. © 2013 Association for Computational Linguistics
In this paper we consider the problem of inducing causal relations from statistical data. Although it is well known that a correlation does not justify the claim of a causal relation between two measures, the question...
Textual entailment is an asymmetric relation between two text fragments that describes whether one fragment can be inferred from the other. It thus cannot capture the notion that the target fragment is "almost en...
Data analysis has become an integral part of many economic fields. In this paper, we present several real-world applications occurring in the fields of automobile development and manufacturing, finance, and online com...
ISBN: 1595932100 (print)
The FP-growth algorithm is currently one of the fastest approaches to frequent item set mining. In this paper I describe a C implementation of this algorithm, which contains two variants of the core operation of computing a projection of an FP-tree (the fundamental data structure of the FP-growth algorithm). In addition, projected FP-trees are (optionally) pruned by removing items that have become infrequent due to the projection (an approach that has been called FP-Bonsai). I report experimental results comparing this implementation of the FP-growth algorithm with three other frequent item set mining algorithms I implemented (Apriori, Eclat, and Relim). Copyright 2005 ACM.
Real life transaction data often miss some occurrences of items that are actually present. As a consequence some potentially interesting frequent patterns cannot be discovered, since with exact matching the number of ...