A discourse containing one or more sentences describes daily issues and events for people to communicate their thoughts and opinions. As sentences normally consist of multiple text segments, correctly understanding the theme of a discourse requires taking into account the relations between text segments. Although a connective sometimes appears in raw text to convey a relation, more often no connective appears between two text segments even though an implicit relation holds between them. The task of implicit discourse relation recognition (IDRR) is to detect an implicit relation between two text segments without a connective and to classify its sense. The IDRR task is important to diverse downstream natural language processing tasks, such as text summarization and machine translation. This article provides a comprehensive and up-to-date survey of the IDRR task. We first summarize the task definition and the data sources widely used in the field. We then categorize the main solution approaches for the IDRR task from the viewpoint of its development history. In each solution category, we present and analyze the most representative methods, including their origins, ideas, strengths, and weaknesses. We also present performance comparisons for those solutions, evaluated on a public corpus with standard data processing procedures. Finally, we discuss future research directions for discourse relation analysis.
ISBN: (print) 9783030583231; 9783030583224
Descriptive approaches to discourse (text) structure and coherence typically proceed in either a bottom-up or a top-down analytic way. The former analyze how the smallest discourse units (clauses, sentences) are connected within their closest neighbourhood, locally and linearly. The latter postulate a hierarchical organization of smaller and larger units, sometimes also representing the whole text as a tree-like graph. In the present study, we mine a Czech corpus of 50k sentences annotated in the local coherence fashion (Penn Discourse Treebank style) for indices signalling higher discourse structure. We analyze patterns of overlapping discourse relations and look into the hierarchies they form. The types and distributions of the detected patterns correspond to the results for English local annotation, with patterns that do not comply with the tree-like interpretation occurring in very low numbers. We also detect hierarchical organization of local discourse relations of up to 5 levels in the Czech data.