In this paper, we give a summary of various dependency chart parsing algorithms in terms of the use of parsing histories for a new dependency arc decision. Some parsing histories are closely related to the target depe...
详细信息
ISBN:
(纸本)9781932432046
In this paper, we give a summary of various dependency chart parsing algorithms in terms of the use of parsing histories for a new dependency arc decision. Some parsing histories are closely related to the target dependency arc, and it is necessary for the parsing algorithm to take them into consideration. Each dependency treebank may have some unique characteristics, and it requires for the parser to model them by certain parsing histories. We show in experiments that proper selection of the parsing algorithm which reflect the dependency annotation of the coordinate structures improves the overall performance.
Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammar...
详细信息
Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammars (SLTAGs) and Extended SLTAGs (ESLTAGs). Performances of techniques that employ grammatical formalisms critically depend on the efficiency of the underlying parsing algorithms. In this paper, we present efficient algorithms for parsing SLTAGs and ESLTAGs. Our algorithm for SLTAGs parsing takes O(min{m, n(4)}) time and O(min{m, n(4)}) space, where m is the number of entries that will ever be made in the matrix M (that is normally used by TAG parsing algorithms). Our algorithm for ESLTAGs parsing takes O(nmin{m, n(4)}) time and O(min{m, n(4)}) space. We show that these algorithms perform better, in practice, than the algorithms of Uemura et al. [21].
We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars...
详细信息
We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results, we also derive negative results on so-called generalized LR parsing.
Earley's parsing algorithm is a general algorithm, able to handle any context-free grammar. As with most parsing algorithms, however, the presence of grammar rules having empty right-hand sides complicates matters...
详细信息
Earley's parsing algorithm is a general algorithm, able to handle any context-free grammar. As with most parsing algorithms, however, the presence of grammar rules having empty right-hand sides complicates matters. By analyzing why Earley's algorithm struggles with these grammar rules, we have devised a simple solution to the problem. Our empty-rule solution leads to a new type of finite automaton expressly suited for use in Earley parsers and to a new statement of Earley's algorithm. We show that this new form of Earley parser is much more time efficient in practice than the original.
We discuss weighted deductive parsing and consider the problem of finding the derivation with the lowest weight. We show that Knuth's generalization of Dijkstra's algorithm for the shortest-path problem offers...
详细信息
We discuss weighted deductive parsing and consider the problem of finding the derivation with the lowest weight. We show that Knuth's generalization of Dijkstra's algorithm for the shortest-path problem offers a general method to solve this problem. Our approach is modular in the sense that Knuth's algorithm is formulated independently from the weighted deduction system.
A Watson-Crick (WK) context-free grammar, a context-free grammar with productions whose right-hand sides contain nonterminals and double-stranded terminal strings, generates complete double-stranded strings under Wats...
详细信息
A Watson-Crick (WK) context-free grammar, a context-free grammar with productions whose right-hand sides contain nonterminals and double-stranded terminal strings, generates complete double-stranded strings under Watson-Crick complementarity. In this paper, we investigate the simplification processes of Watson-Crick context-free grammars, which lead to defining Chomskylike normal form for Watson-Crick context-free grammars. The main result of the paper is a modified CYK (Cocke-Younger-Kasami) algorithm for Watson-Crick context-free grammars in WK-Chomsky normal form, allowing to parse double-stranded strings in O (n(6)) time.
Flexible parsing algorithm, a two-steps-greedy parsing algorithm for text factorisation, has been proved to be an optimal parsing for LZ78-like compressors in the case of constant cost phrases [1,2]. Whilst in early i...
详细信息
Flexible parsing algorithm, a two-steps-greedy parsing algorithm for text factorisation, has been proved to be an optimal parsing for LZ78-like compressors in the case of constant cost phrases [1,2]. Whilst in early implementations of LZ78-like compressors the phrases have constant cost, in common modern implementations the cost of the k-th phrase is [log(2) k + C] where C is a real constant [3,4]. Indeed we show examples where Flexible parsing is not optimal under the above more realistic setting. In this paper we prove that, under the assumption that the cost of a phrase is block-wise constant and non-decreasing, the Flexible parsing is almost optimal. For almost optimal we mean that, for any text T, the difference between the sizes of the compressed text obtained by using a Flexible parsing and an optimal parsing is bounded by the maximal cost of a phrase in T, i.e. it is logarithmic in practical cases. Furthermore we investigate how an optimal parsing, and hence an almost optimal parsing, affects the rate of convergence to the entropy of LZ78-like compressors. We discuss some experimental results considering the ratio between the speed of convergence to the entropy of compressors with and without an optimal parsing. This ratio presents a kind of wave effect that increases as the entropy of a memoryless source decreases but it seems always to slowly converge to one. According to the theory, this wave can be a tsunami for some families of highly compressible strings and, although the optimal (and the almost optimal) parsing does not improve the asymptotical speed of convergence to the entropy, it can improve compression ratio, and hence the decoding speed, in many practical cases. (C) 2017 Elsevier B.V. All rights reserved.
K-depth grammars extend context-free grammars allowing k greater than or equal to 1 rewriting points for a single non-terminal at every step of a derivation. The family of languages generated by k-depth grammars is a ...
详细信息
K-depth grammars extend context-free grammars allowing k greater than or equal to 1 rewriting points for a single non-terminal at every step of a derivation. The family of languages generated by k-depth grammars is a proper extension of the family of context-free languages, while retaining many context-free properties, such as closure properties, a version of Chomsky-Schutzenberger theorem, the existence of an accepting device (the multi-pushdown automaton). Here a polynomial-time parsing algorithm for k-depth languages is defined, and its correctness is proved. (C) 1996 Academic Press, Inc.
Incremental integrated programming environments are a promising approach toward the goal of improved programmer productivity. Consistently rapid environment response is an obvious goal. Efficient use of storage has al...
详细信息
Incremental integrated programming environments are a promising approach toward the goal of improved programmer productivity. Consistently rapid environment response is an obvious goal. Efficient use of storage has also been determined to be an important issue. This paper presents the algorithms and techniques used for incremental scanning and parsing of the Galaxy language. The algorithms guarantee minimal rescanning and reparsing and are also space and time efficient. Notably, the methods are easily adapted to any language of equivalent class, i.e., RL(1) intersects LR(1), including such languages as C and Pascal.
This article outlines a proof-theoretic approach to developing correct and terminating monadic parsers. Using modified realizability, we extract formally verified and terminating programs from formal proofs. By extrac...
详细信息
This article outlines a proof-theoretic approach to developing correct and terminating monadic parsers. Using modified realizability, we extract formally verified and terminating programs from formal proofs. By extracting both primitive parsers and parser combinators, it is ensured that all complex parsers built from these are also correct, complete and terminating for any input. We demonstrate the viability of our approach by means of two case studies: we extract (i) a small arithmetic calculator and (ii) a non-deterministic natural language parser. The work is being carried out in the interactive proof system Minlog.
暂无评论