Background knowledge from Linked Open Data sources, such as DBpedia, Eurostat, and GADM, can be used to create both interpretations and advanced visualizations of statistical data. In this paper, we discuss methods of linking statistical data to Linked Open Data sources and the use of the Explain-a-LOD toolkit. The paper further shows exemplary findings and visualizations created by combining the statistics datasets with Linked Open Data.
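The abstract does not include code; as a rough illustration of the kind of linking step it describes, the sketch below looks up DBpedia resources by label for region names taken from a statistical table, using DBpedia's public SPARQL endpoint. The exact-label matching strategy, the example region names, and the function name are illustrative assumptions, not the approach taken by Explain-a-LOD itself.

```python
# A minimal sketch of linking statistical data to a Linked Open Data source
# (here: DBpedia), assuming plain region names as the linking key.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"

def link_region_to_dbpedia(region_name: str) -> list[str]:
    """Return DBpedia resources whose English label exactly matches the region name."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT DISTINCT ?resource WHERE {{
            ?resource rdfs:label "{region_name}"@en .
        }} LIMIT 5
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [b["resource"]["value"] for b in results["results"]["bindings"]]

# Hypothetical usage: region names from a statistics table.
for region in ["Mannheim", "Baden-Württemberg"]:
    print(region, "->", link_region_to_dbpedia(region))
```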
Understanding the semantics of table elements is a prerequisite for many data integration and data discovery tasks. Table annotation is the task of labeling table elements with terms from a given vocabulary. This pape...
Wikipedia is often used as a source of surface forms, i.e., alternative reference strings for an entity, which are required for entity linking, disambiguation, or coreference resolution tasks. Surface forms have been extracted in a number of works from Wikipedia labels, redirects, disambiguations, and anchor texts of internal Wikipedia links, which we complement with anchor texts of external links to Wikipedia from the Common Crawl web corpus. We tackle the problem of the quality of Wikipedia-based surface forms, which has not been raised before. We create a gold standard for evaluating the quality of the dataset, which reveals the surprisingly low precision of the Wikipedia-based surface forms. We propose filtering approaches that boost the precision from 75% to 85% for a random entity subset, and from 45% to more than 65% for the subset of popular entities. The filtered surface form dataset as well as the gold standard are made publicly available.
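As a hedged illustration of how anchor-text-based surface forms can be collected and then filtered, the sketch below counts (anchor text, entity) pairs and keeps only pairs above a frequency and link-share threshold. The thresholds and the example anchors are invented for illustration; the paper's own filtering approaches are not reproduced here.

```python
# Sketch: build a surface-form dictionary from anchor texts, then filter it.
from collections import defaultdict

def build_surface_forms(anchors):
    """anchors: iterable of (anchor_text, target_entity) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for text, entity in anchors:
        counts[text.lower()][entity] += 1
    return counts

def filter_surface_forms(counts, min_count=2, min_share=0.1):
    """Keep (surface form, entity) pairs that occur often enough and make up
    at least a minimum share of that surface form's links."""
    filtered = {}
    for text, targets in counts.items():
        total = sum(targets.values())
        kept = {e: c for e, c in targets.items()
                if c >= min_count and c / total >= min_share}
        if kept:
            filtered[text] = kept
    return filtered

# Hypothetical anchor pairs, e.g. from Wikipedia links or Common Crawl pages.
anchors = [("Big Blue", "IBM"), ("Big Blue", "IBM"),
           ("Big Blue", "Big_Blue_(crane)"), ("IBM", "IBM")]
print(filter_surface_forms(build_surface_forms(anchors)))
```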
Knowledge Graph completion deals with the addition of missing facts to knowledge graphs. While quite a few approaches exist for type and link prediction in knowledge graphs, the addition of literal values (also called...
Column type annotation is the task of annotating the columns of a relational table with the semantic type of the values contained in each column. Column type annotation is an important pre-processing step for data sea...
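The abstract is truncated here, but to illustrate the task itself, the following toy sketch annotates a column by majority vote over per-cell type lookups. The lookup table and type labels are invented for illustration; real column type annotation systems typically rely on knowledge-base lookups or learned models rather than a hard-coded dictionary.

```python
# Toy sketch of column type annotation via majority vote over cell-level types.
from collections import Counter

CELL_TYPES = {  # hypothetical value -> semantic type lookup
    "Berlin": "City", "Paris": "City", "Germany": "Country",
}

def annotate_column(cells):
    """Return the most frequent semantic type among recognized cell values."""
    votes = Counter(CELL_TYPES[c] for c in cells if c in CELL_TYPES)
    return votes.most_common(1)[0][0] if votes else None

print(annotate_column(["Berlin", "Paris", "Germany"]))  # -> "City"
```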
Determining the semantic relatedness (i.e., the strength of a relation) of two resources in DBpedia (or other Linked Data sources) is a problem addressed by quite a few approaches in the recent past. However, there are no large-scale benchmark datasets for comparing such approaches, and it is an open problem to determine which of the approaches work better than others. Furthermore, large-scale datasets for training machine-learning-based approaches are not available. DBpedia-NYD is a large-scale synthetic silver standard benchmark dataset which contains symmetric and asymmetric similarity values, obtained using a web search engine.
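The abstract does not specify how the search-engine-based scores are computed; as one plausible illustration, the sketch below derives a symmetric relatedness score in the spirit of the Normalized Web Distance, plus a simple asymmetric variant, from hit counts. The hit counts, the index size, and the scoring functions are assumptions, not the construction of DBpedia-NYD.

```python
# Sketch: symmetric and asymmetric relatedness scores from web hit counts.
import math

def normalized_web_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Symmetric distance from hit counts: fx, fy for each term,
    fxy for both terms together, n for the (estimated) index size."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

def asymmetric_relatedness(fx: int, fxy: int) -> float:
    """Asymmetric score: share of pages mentioning x that also mention y."""
    return fxy / fx

# Hypothetical hit counts for the labels of two DBpedia resources.
print(normalized_web_distance(fx=120_000, fy=80_000, fxy=15_000, n=10**10))
print(asymmetric_relatedness(fx=120_000, fxy=15_000))
```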
Millions of HTML tables containing structured data can be found on the web. With their wide coverage, these tables are potentially very useful for filling missing values and extending cross-domain knowledge bases suc...
Large-scale cross-domain knowledge graphs, such as DBpedia or Wikidata, are some of the most popular and widely used datasets of the Semantic Web. In this paper, we introduce some of the most popular knowledge graphs ...
In this paper, we show how to model the matching problem as a problem of joint inference. In contrast to existing approaches, we distinguish between the layer of labels and the layer of concepts and properties. Entiti...
Transformer-based models like BERT have pushed the state-of-the-art for a wide range of tasks in natural language processing. General-purpose pre-training on large corpora allows Transformers to yield good performance...