Dependency parses are an effective way to inject linguistic knowledge into many downstream tasks, and many practitioners wish to efficiently parse sentences at scale. Recent advances in GPU hardware have enabled neura...
The knowledge graph (KG), composed of entities with their descriptions and attributes and the relationships between entities, is finding more and more application scenarios in various natural language processing tasks. In a...
Very recently, some studies on neural dependency parsers have shown advantages over traditional ones on a wide variety of languages. However, graph-based neural dependency parsing systems either count on ...
This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences. Unlike existing pipelined approaches using syntactic parsers, ou...
ISBN: 9781509065431 (print)
This paper studies a novel approach to processing massive uncertain graph data. In this approach, we propose a new framework to simultaneously process a query on a set of randomly sampled possible worlds of an uncertain graph. Based on this framework, we develop a series of algorithms to analyze massive uncertain graphs, including breadth-first search, shortest distance queries, triangle counting, and core decomposition. We implement this approach based on GraphLab, one of the state-of-the-art graph-processing frameworks. By sharing fine-grained internal processing steps on common substructures of sampled possible worlds, the new approach achieves tens to hundreds of times speedup in execution time on a cluster of 20 servers.
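The idea of querying sampled possible worlds can be illustrated with a minimal Python sketch. It is a naive baseline under stated assumptions, not the paper's method: each world is sampled and searched independently, without the sharing of processing steps across worlds that the abstract describes. The edge format `(u, v, p)`, the assumption that edges exist independently, and the function names `sample_world`, `bfs_distances`, and `reachability_estimate` are all hypothetical.

```python
import random
from collections import deque


def sample_world(edges, rng):
    """Sample one possible world: keep each undirected edge (u, v, p)
    independently with probability p (independence is an assumption)."""
    adj = {}
    for u, v, p in edges:
        if rng.random() < p:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    return adj


def bfs_distances(adj, source):
    """Breadth-first search returning hop distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def reachability_estimate(edges, source, target, num_worlds=1000, seed=0):
    """Monte Carlo estimate of the probability that target is reachable
    from source, averaging over independently sampled worlds."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_worlds):
        world = sample_world(edges, rng)
        if target in bfs_distances(world, source):
            hits += 1
    return hits / num_worlds


if __name__ == "__main__":
    uncertain_edges = [("a", "b", 0.9), ("b", "c", 0.5), ("a", "c", 0.2)]
    print(reachability_estimate(uncertain_edges, "a", "c"))
```

Because every sampled world repeats the search from scratch, the cost grows linearly with the number of worlds; that redundancy on common substructures is what the described framework aims to remove.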
The task of answering natural language questions over RDF data has received wide interest in recent years, in particular in the context of the series of QALD benchmarks. The task consists of mapping a natural language...
This thesis investigates the role of linguistically-motivated generative models of syntax and semantic structure in natural language processing (NLP). Syntactic well-formedness is crucial in language generation, but most statistical models do not account for the hierarchical structure of sentences. Many applications exhibiting natural language understanding rely on structured semantic representations to enable querying, inference and reasoning. Yet most semantic parsers produce domain-specific or inadequately expressive representations. We propose a series of generative transition-based models for dependency syntax which can be applied as both parsers and language models while being amenable to supervised or unsupervised learning. Two models are based on Markov assumptions commonly made in NLP: the first is a Bayesian model with hierarchical smoothing, the second is parameterised by feed-forward neural networks. The Bayesian model enables careful analysis of the structure of the conditioning contexts required for generative parsers, but the neural network is more accurate. As a language model the syntactic neural model outperforms both the Bayesian model and n-gram neural networks, pointing to the complementary nature of distributed and structured representations for syntactic prediction. We propose approximate inference methods based on particle filtering. The third model is parameterised by recurrent neural networks (RNNs), dropping the Markov assumptions. Exact inference with dynamic programming is made tractable here by simplifying the structure of the conditioning contexts. We then shift the focus to semantics and propose models for parsing sentences to labelled semantic graphs. We introduce a transition-based parser which incrementally predicts graph nodes (predicates) and edges (arguments). This approach is contrasted against predicting top-down graph traversals. RNNs and pointer networks are key components in approaching graph parsing as an inc
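For readers unfamiliar with transition-based dependency parsing, the following Python sketch shows how a sequence of transitions builds a dependency tree. It assumes the standard arc-standard transition system, which the abstract does not name, and it covers only the deterministic replay of given actions, not the thesis's generative models or their learned scoring. The function name `apply_transitions` and the action labels are hypothetical.

```python
def apply_transitions(words, transitions):
    """Replay arc-standard transitions and return {dependent: head}.

    Tokens are numbered from 1; index 0 is an artificial root.
    """
    stack = [0]
    buffer = list(range(1, len(words) + 1))
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            # Move the next input token onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFT_ARC":
            # Second-from-top becomes a dependent of the top item.
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT_ARC":
            # Top item becomes a dependent of the item below it.
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {action}")
    return heads


if __name__ == "__main__":
    sentence = ["She", "reads", "books"]
    actions = ["SHIFT", "SHIFT", "LEFT_ARC", "SHIFT", "RIGHT_ARC", "RIGHT_ARC"]
    # "reads" (2) heads "She" (1) and "books" (3); the root (0) heads "reads".
    print(apply_transitions(sentence, actions))
```

A parser in this family replaces the fixed action list with a model that scores the next action given the current stack and buffer; a generative variant additionally assigns probabilities to the words themselves.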
In historical manuscripts, humans can detect handwritten words, lines, and decorations with ease even if they do not know the language or the script. Yet for automatic processing this task has proven elusive, especially in the case of handwritten documents with complex layouts, which is why semiautomatic methods that integrate the human user into the process are needed. In this paper, we introduce a user-centered segmentation method based on document graphs and scribbling interaction. The graphs capture a sparse representation of the document's structure that can then be edited by the user with a stylus on a touch-sensitive screen. We evaluate the proposed method on a newly introduced database of historical manuscripts with complex layout and demonstrate, first, that the document graphs are already close to the desired segmentation and, second, that scribbling allows a natural and efficient interaction.
We define a conceptualization of the research process, the Scholarly Ontology (SO), based on which we show how different aspects of scholarly research behaviour can be documented automatically as well as interactively and relevant resources, possibly from different fields, can be interconnected. We demonstrate the capacity of SO to support various research inquiries by formulating appropriate complex queries. We explore the role of type taxonomies, such as the Activity Type class, and explain how their instances can function as pivotal elements in modeling intentionality and functionality properties of various SO classes. We formulate semantic constraints that provide a framework for reasoning over SO. Furthermore, we investigate the use of semantic similarity in process classification via the Activity Type taxonomy. We review several graph-based similarity measures and we propose a new, high-performance one based on a novel definition of semantic specificity that leverages structural properties of graphs such as WordNet. We then turn to the creation of a knowledge base by information extraction from text, driven by SO. We do this using two approaches: (1) devising specialized NLP rules that fully exploit the ontology and structural and syntactic properties of scientific publications; and (2) using statistical machine learning methods. Along the first line we developed Research Spotlight, a system that leverages information from DBpedia, retrieves articles from various sources (e-repositories, Web pages), extracts and interrelates various kinds of named and non-named entities by exploiting article metadata, the structure of text as well as syntactic, lexical and semantic constraints, and populates a knowledge base in the form of RDF triples. Specialized rules were defined, implemented and embedded in the outputs provided by SPACY, an industrial-strength framework for NLP tasks such as tokenization, segmentation, POS tagging and dependency *** of the system is performed thro