Measurement is a fundamental building block of numerous scientific models and their creation in data-driven science. Due to the high complexity and size of modern data sets, it is necessary to develop understandable a...
The 2014 edition of the Linked Data Mining Challenge, conducted in conjunction with Know@LOD 2014, was the third edition of this challenge. The underlying data came from two domains: public procurement and researcher collaboration. As in the previous year, when the challenge was held at the Data Mining on Linked Data workshop co-located with the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2013), the response to the challenge was lower than expected, with only one solution submitted for the predictive task this year. We tried to track the reasons for the continuously low participation in the challenge via a questionnaire survey, and we distilled principles that could help organizers of future similar challenges.
We propose a radial user interface that supports phrasing and interactive visual refinement of vague queries in order to search and explore large document sets. The core idea is to provide an integrated view of queries and related results, where both queries and results can be interactively manipulated and changes are immediately visualized. Furthermore, the relevance of queries and results can be changed gradually, so a user can explore the effects of even slight query changes. Besides the interface itself, we present the results of a first user study. The proposed interface can be applied in many interactive text retrieval scenarios, but it can also support decision-making processes where exploration and interpretation of complex data sets is required.
This paper describes the collaborative participation of Dublin City University and Trinity College Dublin in LogCLEF 2010. Two sets of experiments were conducted. First, different aspects of the TEL query logs were analysed after extracting user sessions of consecutive queries on a topic. The relation between the queries and their length (number of terms) and position (first query or further reformulations) was examined in a session with respect to query performance estimators such as query scope, IDF-based measures, simplified query clarity score, and average inverse document collection frequency. Results of this analysis suggest that only some estimator values show a correlation with query length or position in the TEL logs (e.g. similarity score between collection and query). Second, the relation between three attributes was investigated: the user's country (detected from IP address), the query language, and the interface language. The investigation aimed to explore the influence of the three attributes on the user's collection selection. Moreover, the investigation involved assigning different weights to the three attributes in a scoring function that was used to re-rank the collections displayed to the user according to the language and country. The results of the collection re-ranking show a significant improvement in Mean Average Precision (MAP) over the original collection ranking of TEL. The results also indicate that the query language and interface language have more influence than the user's country on the collections selected by the users.
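One of the pre-retrieval estimators named in the abstract, the simplified query clarity score, can be sketched as the KL divergence between the query's term distribution and the collection's term distribution. A minimal illustration follows; the collection statistics (`coll`, `N`) are hypothetical toy values, not taken from the TEL logs:

```python
import math
from collections import Counter

def simplified_clarity_score(query_terms, coll_term_freq, coll_size):
    """Simplified clarity score: KL divergence between the query's term
    distribution and the collection's term distribution. Higher values
    suggest a more specific, less ambiguous query."""
    q_tf = Counter(query_terms)
    q_len = len(query_terms)
    score = 0.0
    for term, tf in q_tf.items():
        p_q = tf / q_len                              # P(term | query)
        p_c = coll_term_freq.get(term, 0) / coll_size  # P(term | collection)
        if p_c > 0:
            score += p_q * math.log2(p_q / p_c)
    return score

# Hypothetical collection statistics: term -> collection frequency
coll = {"mozart": 3, "symphony": 40, "the": 5000, "music": 800}
N = 10000  # total tokens in the toy collection

print(simplified_clarity_score(["mozart", "symphony"], coll, N))
print(simplified_clarity_score(["the", "music"], coll, N))
```

A query built from rare, specific terms ("mozart symphony") scores higher than one built from frequent terms ("the music"), which is the intuition behind using such estimators to predict query performance.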
It is known that a (concept) lattice contains an n-dimensional Boolean suborder if and only if the context contains an n-dimensional contra-nominal scale as subcontext. In this work, we investigate more closely the in...
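The characterization quoted above can be checked on a small example: the concept lattice of an n-dimensional contra-nominal scale (objects and attributes {1, ..., n}, with object g incident to attribute m iff g ≠ m) is the Boolean lattice with 2^n elements. A brute-force sketch, not code from the paper, with hypothetical helper names:

```python
from itertools import combinations

def concepts(objects, attributes, incidence):
    """Enumerate the formal concepts of a context by closing
    every subset of objects (feasible only for tiny contexts)."""
    def intent(A):
        return frozenset(m for m in attributes
                         if all((g, m) in incidence for g in A))
    def extent(B):
        return frozenset(g for g in objects
                         if all((g, m) in incidence for m in B))
    found = set()
    for r in range(len(objects) + 1):
        for A in combinations(objects, r):
            B = intent(frozenset(A))
            found.add((extent(B), B))  # (A'', A') is a concept
    return found

# 3-dimensional contra-nominal scale: g incident to m  iff  g != m
n = 3
I = {(g, m) for g in range(n) for m in range(n) if g != m}
print(len(concepts(range(n), range(n), I)))  # 2**3 = 8 concepts
```

Here every subset of objects is already closed, so the 2^n concepts form a Boolean lattice of dimension n, matching the subcontext criterion stated in the abstract.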
ISBN (print): 9781643685434
When a Knowledge Discovery from Data (KDD) (Fayyad, Piatetsky-Shapiro, & Smyth, 1996) process is applied to obtain knowledge, several methods can be used (Gibert, et al., 2018). A simple and fast way to obtain preliminary insights from data before using KDD models is to generate a basic descriptive analysis. It is one of the most popular ways to describe experimental data and should be the beginning of every data project. Nevertheless, some of the main knowledge that can be extracted in a descriptive analysis is hidden by underlying multivariate structures, which could be elicited through multivariate analysis techniques. Moreover, the domain expert is key for a proper interpretation of descriptive results. At the same time, there is a lack of automatic reporting techniques that can report and help in the interpretation of complex patterns and the use of advanced multivariate techniques. This paper presents a tool developed to generate automatic interpretation of Multiple Correspondence Analysis (MCA) and Principal Components Analysis (PCA) by using RMarkdown. The tool generates a Word document containing the automatic interpretation of the results, built on the basis of regular expressions elaborating over the R analytical outputs (either numerical or graphical results). The proposal is being applied to real data, such as the INSESS database on social vulnerabilities of the Catalan population. In conclusion, the developed tool helps make the results of factorial methods accessible, avoiding misinterpretation of the results and the involuntary skipping of conclusions due to the large amount of knowledge that can be extracted from a complete factorial analysis. This software also enables non-expert users to read multivariate analysis results in a friendly way. Moreover, the tool saves time in the interpretation step and gives the expert a basis for starting the report with the results; even the output of the software could become the report or
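The idea of automatically turning factorial-analysis output into a plain-text interpretation can be illustrated with a minimal sketch. The authors' tool uses R and RMarkdown; the Python version below (function name and variable names hypothetical) only shows the principle: run PCA, then generate a sentence per component naming its explained variance and highest-loading variables:

```python
import numpy as np

def pca_report(X, var_names, n_components=2, top=2):
    """Minimal sketch of automatic PCA interpretation: for each principal
    component, report its share of variance and the variables with the
    largest absolute loadings as a plain-text summary."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    total = eigvals.sum()
    lines = []
    for k in range(n_components):
        loadings = eigvecs[:, k]
        idx = np.argsort(np.abs(loadings))[::-1][:top]
        dominant = ", ".join(f"{var_names[i]} ({loadings[i]:+.2f})" for i in idx)
        lines.append(f"PC{k + 1}: {100 * eigvals[k] / total:.1f}% of variance; "
                     f"dominated by {dominant}")
    return "\n".join(lines)

# Hypothetical data: two correlated variables plus an independent one
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, a + 0.1 * rng.normal(size=200),
                     rng.normal(size=200)])
print(pca_report(X, ["income", "spending", "age"]))
```

On such data the first component is dominated by the two correlated variables, and the generated sentences are the kind of building block an automatic report assembles before the domain expert reviews them.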
Despite the significant research over the last ten years, commercial ubiquitous computing environments and pervasive applications remain thin on the ground. This paper looks at the explosion in application creativity ...
Formal Concept Analysis (FCA) provides a method called attribute exploration which helps a domain expert discover structural dependencies in knowledge domains that can be represented by a formal context (a cross table...
Order diagrams allow human analysts to understand and analyze structural properties of ordered data. While an expert can create easily readable order diagrams, their automatic generation remains a hard task. In...