ISBN (print): 9781932432046
In this paper, we give a summary of various dependency chart parsing algorithms in terms of how they use parsing histories when deciding on a new dependency arc. Some parsing histories are closely related to the target dependency arc, and the parsing algorithm needs to take them into consideration. Each dependency treebank may have its own characteristics, which the parser must model through certain parsing histories. We show experimentally that properly selecting a parsing algorithm that reflects the treebank's annotation of coordinate structures improves overall performance.
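The following minimal Python sketch is our own illustration (not the paper's parser) of the general idea of conditioning a new dependency-arc decision on the parsing history, i.e. on arcs that have already been built; the feature names, weights and scoring function are hypothetical.

```python
# Illustrative only: score a candidate arc head -> dep using features
# drawn from the parsing history (arcs already added to the analysis).

def arc_features(head, dep, sentence, history):
    """history is the set of (head, dep) arcs built so far; the
    history-based features below are hypothetical examples."""
    return [
        f"head_word={sentence[head]}",
        f"dep_word={sentence[dep]}",
        f"direction={'right' if head < dep else 'left'}",
        # Parsing-history features: what has already been attached.
        f"head_num_children={sum(1 for h, _ in history if h == head)}",
        f"dep_has_children={any(h == dep for h, _ in history)}",
    ]

def score_arc(head, dep, sentence, history, weights):
    # Simple linear model: sum of the weights of the active features.
    return sum(weights.get(f, 0.0) for f in arc_features(head, dep, sentence, history))

sentence = ["ROOT", "dogs", "and", "cats", "sleep"]
history = {(4, 1)}                        # arcs already in the chart
weights = {"head_num_children=0": 0.3, "direction=left": 0.2}
print(score_arc(4, 3, sentence, history, weights))
```

Parsing algorithms differ in which such histories are already available at the moment an arc is scored, which is exactly the dimension along which the paper compares them.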
Formal grammars have been employed in biology to solve various important problems. In particular, grammars have been used to model and predict RNA structures. Two such grammars are Simple Linear Tree Adjoining Grammars (SLTAGs) and Extended SLTAGs (ESLTAGs). The performance of techniques that employ grammatical formalisms depends critically on the efficiency of the underlying parsing algorithms. In this paper, we present efficient algorithms for parsing SLTAGs and ESLTAGs. Our algorithm for SLTAG parsing takes O(min{m, n^4}) time and O(min{m, n^4}) space, where m is the number of entries that will ever be made in the matrix M (normally used by TAG parsing algorithms). Our algorithm for ESLTAG parsing takes O(n·min{m, n^4}) time and O(min{m, n^4}) space. We show that, in practice, these algorithms outperform the algorithms of Uemura et al. [21].
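As a rough illustration of the storage idea behind the O(min{m, n^4}) bounds (our own sketch, not the paper's SLTAG/ESLTAG recognition logic), the chart can be kept as a dictionary keyed by index tuples, so that space grows with the number m of entries actually made rather than with a dense n^4 matrix:

```python
from collections import defaultdict

class SparseChart:
    """Chart entries keyed by (i, j, k, l) index tuples.

    Space is proportional to the number of entries actually inserted
    (the m of the abstract), not to n^4 as with a dense 4-D matrix M.
    """

    def __init__(self):
        self._cells = defaultdict(set)

    def add(self, i, j, k, l, item):
        self._cells[(i, j, k, l)].add(item)

    def get(self, i, j, k, l):
        return self._cells.get((i, j, k, l), set())

    def __len__(self):          # number of non-empty cells, at most m
        return len(self._cells)

chart = SparseChart()
chart.add(0, 2, 5, 7, "A")      # hypothetical item over two string spans
print(chart.get(0, 2, 5, 7), len(chart))
```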
We present new results on the relation between purely symbolic context-free parsing strategies and their probabilistic counterparts. Such parsing strategies are seen as constructions of push-down devices from grammars. We show that preservation of probability distribution is possible under two conditions, viz. the correct-prefix property and the property of strong predictiveness. These results generalize existing results in the literature that were obtained by considering parsing strategies in isolation. From our general results, we also derive negative results on so-called generalized LR parsing.
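As a toy illustration of the kind of preservation involved (our own example, not from the paper): a top-down strategy has both the correct-prefix property and strong predictiveness, and attaching each rule's probability to the corresponding predict transition of the push-down device leaves every derivation's probability unchanged. The sketch below checks this for a hypothetical two-rule PCFG.

```python
# Toy PCFG (hypothetical): S -> 'a' S with prob 0.6, S -> 'b' with prob 0.4.
RULES = {"S": [(["a", "S"], 0.6), (["b"], 0.4)]}

def grammar_prob(word):
    # Words have the form a^n b, each with a unique derivation:
    # n uses of the first rule and one use of the second.
    return 0.6 ** (len(word) - 1) * 0.4

def pda_prob(word):
    """Simulate a top-down (predictive) push-down device in which every
    predict transition carries the probability of the grammar rule it
    mirrors; match transitions have probability 1."""
    stack, pos, prob = ["S"], 0, 1.0
    while stack:
        top = stack.pop()
        if top in RULES:                     # predict: pick the rule whose
            rhs, p = next((r, q) for r, q in RULES[top]   # rhs matches the input
                          if r[0] == word[pos])
            prob *= p
            stack.extend(reversed(rhs))
        else:                                # match: consume one input symbol
            assert top == word[pos]
            pos += 1
    return prob

print(grammar_prob("aaab"), pda_prob("aaab"))   # both approx. 0.0864
```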
A Watson-Crick (WK) context-free grammar, a context-free grammar with productions whose right-hand sides contain nonterminals and double-stranded terminal strings, generates complete double-stranded strings under Watson-Crick complementarity. In this paper, we investigate simplification processes for Watson-Crick context-free grammars, which lead to a Chomsky-like normal form for them. The main result of the paper is a modified CYK (Cocke-Younger-Kasami) algorithm for Watson-Crick context-free grammars in WK-Chomsky normal form, which allows double-stranded strings to be parsed in O(n^6) time.
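For reference, here is a minimal Python sketch of the classical CYK recurrence for an ordinary CFG in Chomsky normal form, which the paper's modified algorithm extends to double-stranded strings (raising the cost from O(n^3) to O(n^6)); the toy grammar is our own.

```python
def cyk(word, unary, binary, start="S"):
    """Classical CYK for a CFG in Chomsky normal form.

    unary  : dict terminal -> set of nonterminals A with A -> a
    binary : dict (B, C)   -> set of nonterminals A with A -> B C
    """
    n = len(word)
    table = [[set() for _ in range(n + 1)] for _ in range(n)]   # table[i][span]
    for i, a in enumerate(word):
        table[i][1] = set(unary.get(a, set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for B in table[i][split]:
                    for C in table[i + split][span - split]:
                        table[i][span] |= binary.get((B, C), set())
    return start in table[0][n]

# Toy CNF grammar for {a^n b^n : n >= 1} (our own example):
#   S -> A T | A B,  T -> S B,  A -> a,  B -> b
unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "T"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"T"}}
print(cyk("aabb", unary, binary), cyk("aab", unary, binary))   # True False
```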
The Flexible Parsing algorithm, a two-step greedy parsing algorithm for text factorisation, has been proved to be an optimal parsing for LZ78-like compressors in the case of constant-cost phrases [1,2]. While in early implementations of LZ78-like compressors the phrases have constant cost, in common modern implementations the cost of the k-th phrase is ⌈log₂ k + C⌉, where C is a real constant [3,4]. Indeed, we show examples where Flexible Parsing is not optimal under this more realistic setting. In this paper we prove that, under the assumption that the cost of a phrase is block-wise constant and non-decreasing, Flexible Parsing is almost optimal. By almost optimal we mean that, for any text T, the difference between the sizes of the compressed text obtained by using Flexible Parsing and an optimal parsing is bounded by the maximal cost of a phrase in T, i.e. it is logarithmic in practical cases. Furthermore, we investigate how an optimal parsing, and hence an almost optimal parsing, affects the rate of convergence to the entropy of LZ78-like compressors. We discuss experimental results on the ratio between the speed of convergence to the entropy of compressors with and without an optimal parsing. This ratio shows a kind of wave effect that increases as the entropy of a memoryless source decreases, yet it seems always to converge slowly to one. According to the theory, this wave can become a tsunami for some families of highly compressible strings and, although the optimal (and the almost optimal) parsing does not improve the asymptotic speed of convergence to the entropy, it can improve the compression ratio, and hence the decoding speed, in many practical cases.
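The Python sketch below (our own, purely illustrative) shows a plain greedy LZ78 parse together with the ⌈log₂ k + C⌉ per-phrase cost model mentioned above; Flexible Parsing replaces the single greedy step with a two-step greedy choice, which we do not reproduce here.

```python
import math

def lz78_greedy_parse(text):
    """Greedy LZ78 factorisation: each phrase is the longest previously
    seen phrase extended by one fresh character."""
    dictionary = {"": 0}                  # phrase -> dictionary index
    phrases, current = [], ""
    for ch in text:
        if current + ch in dictionary:
            current += ch
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:                           # leftover suffix, no fresh character
        phrases.append((dictionary[current], ""))
    return phrases

def encoded_cost(phrases, C=8.0):
    """Cost model from the abstract: the k-th phrase (k counted from 1)
    costs ceil(log2 k + C) bits; C is a real constant."""
    return sum(math.ceil(math.log2(k) + C) for k in range(1, len(phrases) + 1))

phrases = lz78_greedy_parse("abababababa")
print(phrases)
print(encoded_cost(phrases), "bits")
```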
Incremental integrated programming environments are a promising approach toward the goal of improved programmer productivity. Consistently rapid environment response is an obvious goal, and efficient use of storage has also been identified as an important issue. This paper presents the algorithms and techniques used for incremental scanning and parsing of the Galaxy language. The algorithms guarantee minimal rescanning and reparsing and are also space- and time-efficient. Notably, the methods are easily adapted to any language of the equivalent class, i.e., RL(1) ∩ LR(1), including languages such as C and Pascal.
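A rough Python sketch (our own, not the Galaxy implementation) of one ingredient of incremental scanning: after an edit, re-tokenise only from the token containing the edit point and stop as soon as the new token stream resynchronises with the old one. The toy tokenizer and the resynchronisation test are simplifying assumptions.

```python
import re

TOKEN = re.compile(r"\w+|\S")       # toy tokenizer: words or single symbols

def tokenize(text, start=0):
    """Yield (offset, lexeme) pairs from `start` to the end of `text`."""
    for m in TOKEN.finditer(text, start):
        yield (m.start(), m.group())

def incremental_rescan(old_tokens, new_text, edit_pos, shift):
    """old_tokens: (offset, lexeme) list for the pre-edit text;
    edit_pos: offset of the edit in the old text; shift: change in length."""
    # 1. Reuse the old tokens that end before the edit point.
    keep = [t for t in old_tokens if t[0] + len(t[1]) <= edit_pos]
    rescan_from = keep[-1][0] + len(keep[-1][1]) if keep else 0

    # 2. Rescan until a new token coincides with a (shifted) old token,
    #    then splice the remaining old tokens back in unchanged.
    reusable = {(off + shift, lex) for off, lex in old_tokens if off >= edit_pos}
    fresh = []
    for tok in tokenize(new_text, rescan_from):
        if tok in reusable and fresh:
            tail = [(off + shift, lex) for off, lex in old_tokens
                    if off + shift >= tok[0]]
            return keep + fresh + tail
        fresh.append(tok)
    return keep + fresh

old_text = "x = foo(1); y = bar(2);"
new_text = "x = foo(42); y = bar(2);"   # edit: '1' -> '42' at offset 8
old = list(tokenize(old_text))
assert incremental_rescan(old, new_text, edit_pos=8, shift=1) == list(tokenize(new_text))
```

A real incremental scanner would also track exactly which tokens were re-examined, which is what the minimal-rescanning guarantee is about.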
This paper presents a canonical form for context-sensitive derivations and a parsing algorithm which finds each context-sensitive analysis once and only once. The amount of memory required by the algorithm is essentially no more than that required to store a single complete derivation. In addition, a modified version of the basic algorithm is presented which blocks infinite analyses for grammars which contain loops. The algorithm is also compared with several previous parsers for context-sensitive grammars and general rewriting systems, and the difference between the two types of analyses is discussed. The algorithm appears to be complementary to an algorithm by S. Kuno in several respects, including the space-time trade-off and the degree of context dependence involved.
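As a point of reference for the search problem involved (this is a naive sketch of our own, not the paper's canonical-derivation algorithm, and unlike it, it keeps a visited set rather than only the memory needed for a single derivation), a non-contracting grammar can be recognised by exhaustively rewriting sentential forms of length at most |w|:

```python
def recognize(word, rules, start="S"):
    """Brute-force recognizer for a non-contracting (monotone) grammar:
    rules maps an lhs string to rhs strings with len(rhs) >= len(lhs),
    so sentential forms never shrink and the search over forms of
    length <= len(word) terminates."""
    seen, frontier = {start}, [start]
    while frontier:
        form = frontier.pop()
        if form == word:
            return True
        for lhs, rhss in rules.items():
            i = form.find(lhs)
            while i != -1:                       # rewrite every occurrence
                for rhs in rhss:
                    new = form[:i] + rhs + form[i + len(lhs):]
                    if len(new) <= len(word) and new not in seen:
                        seen.add(new)
                        frontier.append(new)
                i = form.find(lhs, i + 1)
    return False

# Toy monotone grammar for {a^n b^n c^n : n >= 1} (our own example):
rules = {"S": ["aSBC", "aBC"], "CB": ["BC"],
         "aB": ["ab"], "bB": ["bb"], "bC": ["bc"], "cC": ["cc"]}
print(recognize("aabbcc", rules), recognize("aabbc", rules))   # True False
```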
This article outlines a proof-theoretic approach to developing correct and terminating monadic parsers. Using modified realizability, we extract formally verified and terminating programs from formal proofs. By extracting both primitive parsers and parser combinators, we ensure that all complex parsers built from them are also correct, complete and terminating for any input. We demonstrate the viability of our approach by means of two case studies: we extract (i) a small arithmetic calculator and (ii) a non-deterministic natural language parser. The work is carried out in the interactive proof system Minlog.
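For readers unfamiliar with the style, here is a small Python sketch of monadic parser combinators and a tiny arithmetic parser built from them, loosely echoing case study (i); unlike the programs extracted in the paper, this illustration is neither formally verified nor proved terminating.

```python
# A parser is a function: input string -> list of (result, remaining input).

def pure(x):
    return lambda s: [(x, s)]

def bind(p, f):
    """Monadic bind: run p, then feed each result into f."""
    return lambda s: [pair for x, rest in p(s) for pair in f(x)(rest)]

def char(pred):
    return lambda s: [(s[0], s[1:])] if s and pred(s[0]) else []

def alt(p, q):
    return lambda s: p(s) or q(s)     # deterministic choice: first success wins

def many1(p):
    return bind(p, lambda x: alt(bind(many1(p), lambda xs: pure([x] + xs)),
                                 pure([x])))

number = bind(many1(char(str.isdigit)), lambda ds: pure(int("".join(ds))))

def expr(s):
    # expr ::= number '+' expr | number
    plus = bind(number, lambda x:
           bind(char(lambda c: c == "+"), lambda _:
           bind(expr, lambda y: pure(x + y))))
    return alt(plus, number)(s)

print(expr("1+2+30"))                 # [(33, '')]
```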
ISBN (print): 9781479951178
This paper presents a scalable scene parsing algorithm based on image retrieval and superpixel matching. We focus on rare object classes, which play an important role in achieving richer semantic understanding of visual scenes, compared to common background classes. Towards this end, we make two novel contributions: rare class expansion and semantic context description. First, considering the long-tailed nature of the label distribution, we expand the retrieval set by rare class exemplars and thus achieve more balanced superpixel classification results. Second, we incorporate both global and local semantic context information through a feedback based mechanism to refine image retrieval and superpixel matching. Results on the SIFTflow and LMSun datasets show the superior performance of our algorithm, especially on the rare classes, without sacrificing overall labeling accuracy.
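A schematic Python sketch of the rare-class-expansion step as we read it from the abstract (the descriptors, thresholds and exemplar-selection rule below are our own assumptions, not the authors' implementation): after the usual global-descriptor retrieval, images containing rare classes are appended so that superpixel matching sees a more balanced label distribution.

```python
import numpy as np

def retrieve_with_rare_expansion(query_desc, image_descs, image_labels,
                                 class_counts, k=50, rare_quantile=0.2,
                                 extra_per_class=5):
    """Return indices of the expanded retrieval set for a query image.

    query_desc   : (d,) global descriptor of the query image
    image_descs  : (N, d) descriptors of the annotated images
    image_labels : list of sets of class ids present in each image
    class_counts : class id -> number of labelled superpixels (long-tailed)
    """
    # 1. Standard retrieval: the k nearest images by descriptor distance.
    dists = np.linalg.norm(image_descs - query_desc, axis=1)
    retrieved = list(np.argsort(dists)[:k])

    # 2. Rare-class expansion: classes in the lowest quantile of the
    #    label distribution contribute extra exemplar images.
    threshold = np.quantile(list(class_counts.values()), rare_quantile)
    rare = {c for c, n in class_counts.items() if n <= threshold}
    for c in rare:
        exemplars = [i for i, labels in enumerate(image_labels) if c in labels]
        exemplars.sort(key=lambda i: dists[i])   # stay close to the query scene
        retrieved.extend(i for i in exemplars[:extra_per_class]
                         if i not in retrieved)
    return retrieved

rng = np.random.default_rng(0)
descs = rng.normal(size=(200, 16))
labels = [set(rng.choice(10, size=3, replace=False)) for _ in range(200)]
counts = {c: sum(c in lbls for lbls in labels) for c in range(10)}
print(len(retrieve_with_rare_expansion(descs[0], descs, labels, counts, k=20)))
```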