Existing educational tools for language processors have varying capabilities, and no single tool covers every aspect of language processors. As a result, educators and students may need multiple tools, switching between different specification notations and different ways of organizing and interpreting outputs, which could result in a steep learning curve. PAMOJA is a Java-based component framework with a broader scope: it supports construction of grammar-aware applications using the rapid application development style found in modern IDEs such as NetBeans. This paper investigated the possibility of using PAMOJA to develop educational tools for language processing courses. We conducted two major studies. The first study summarized six design considerations, identified in the literature, for language processor educational tools. Then, we identified three example applications and demonstrated how to construct tools for them using PAMOJA. These include building and visualizing lexical scanners and parsers, as well as constructing a front-end for a software language, using a subset of the Java language as an example. The second study evaluated the PAMOJA approach in relation to the identified design considerations and by analyzing student and educator perceptions. The results seem to demonstrate a positive reception and acceptance, suggesting that its application would facilitate both the design of language processor teaching tools and the learning of language processors. (c) 2023 Elsevier B.V. All rights reserved.
This article outlines a proof-theoretic approach to developing correct and terminating monadic parsers. Using modified realizability, we extract formally verified and terminating programs from formal proofs. By extracting both primitive parsers and parser combinators, it is ensured that all complex parsers built from these are also correct, complete and terminating for any input. We demonstrate the viability of our approach by means of two case studies: we extract (i) a small arithmetic calculator and (ii) a non-deterministic natural language parser. The work is being carried out in the interactive proof system Minlog.
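To make the idea of primitive parsers and parser combinators concrete, the following is a minimal Python sketch of non-deterministic monadic parsing. It is not the Minlog extraction described in the abstract and carries no formal guarantees; all names (`item`, `unit`, `bind`, `choice`, `sat`, `many`, `many1`, `number`) are illustrative assumptions. A parser is modelled as a function from an input string to a list of (value, remaining input) pairs, so an empty list means failure and several entries mean several parses.

```python
# Assumed representation: parser = function str -> list of (value, rest).

def item(s):
    """Primitive parser: consume one character, fail on empty input."""
    return [(s[0], s[1:])] if s else []

def unit(value):
    """Monadic return: succeed without consuming input."""
    return lambda s: [(value, s)]

def bind(parser, f):
    """Monadic bind: run `parser`, feed each result to `f`."""
    return lambda s: [r for (v, rest) in parser(s) for r in f(v)(rest)]

def choice(p, q):
    """Non-deterministic choice: collect the parses of both alternatives."""
    return lambda s: p(s) + q(s)

def fail(s):
    """Parser that always fails."""
    return []

def sat(pred):
    """Consume one character satisfying `pred`."""
    return bind(item, lambda c: unit(c) if pred(c) else fail)

def many1(p):
    """One or more repetitions; terminates only if `p` consumes input."""
    return bind(p, lambda x: bind(many(p), lambda xs: unit([x] + xs)))

def many(p):
    """Zero or more repetitions of `p`."""
    return choice(many1(p), unit([]))

digit = sat(str.isdigit)
# Parse a non-empty run of digits into an integer.
number = bind(many1(digit), lambda ds: unit(int("".join(ds))))

results = number("42+1")
```

Because `choice` keeps every alternative, `results` contains the longest parse first, followed by shorter prefixes. The termination of `many` here rests on the informal assumption that `p` always consumes input, which is the kind of property the abstract's approach establishes by proof.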
Parsing is a key task in computer science, with applications in compilers, natural language processing, syntactic pattern matching, and formal language theory. With the recent development of deep learning techniques, several artificial intelligence applications, especially in natural language processing, have combined traditional parsing methods with neural networks to drive the search in the parsing space, resulting in hybrid architectures using both symbolic and distributed representations. In this article, we show that existing symbolic parsing algorithms for context-free languages can cross the border and be entirely formulated over distributed representations. To this end, we introduce a version of the traditional Cocke-Younger-Kasami (CYK) algorithm, called distributed (D)-CYK, which is entirely defined over distributed representations. D-CYK uses matrix multiplication on real number matrices of a size independent of the length of the input string. These operations are compatible with recurrent neural networks. Preliminary experiments show that D-CYK approximates the original CYK algorithm. By showing that CYK can be entirely performed on distributed representations, we open the way to the definition of recurrent layer neural networks that can process general context-free languages.
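For reference, the traditional symbolic CYK algorithm that D-CYK approximates can be sketched as follows. This is the standard textbook recognizer for grammars in Chomsky normal form, not the distributed variant from the article; the function name `cyk`, the dictionary encoding of the rules, and the example grammar for the language a^n b^n are assumptions made for illustration.

```python
from itertools import product

def cyk(word, unary, binary, start="S"):
    """Recognize `word` with a grammar in Chomsky normal form.

    unary:  dict mapping a terminal a to the set of A with rule A -> a
    binary: dict mapping a pair (B, C) to the set of A with rule A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # this sketch ignores the empty-string rule
    # table[i][l - 1] holds the nonterminals deriving word[i:i + l]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = set(unary.get(ch, ()))
    for length in range(2, n + 1):           # span length
        for i in range(n - length + 1):      # span start
            for split in range(1, length):   # size of the left part
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for b, c in product(left, right):
                    table[i][length - 1] |= binary.get((b, c), set())
    return start in table[0][n - 1]

# Hypothetical CNF grammar for { a^n b^n : n >= 1 }:
#   S -> A B | A T,  T -> S B,  A -> a,  B -> b
unary = {"a": {"A"}, "b": {"B"}}
binary = {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}
```

The three nested loops over span length, start position, and split point give the usual cubic running time in the length of the input.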
Visual Programming Languages have been widely adopted in design and comprehension of sophisticated systems. Context-sensitive graph grammar formalisms are suitable tools for specifying these languages, since they are intuitive and possess sufficient expressive power and usability. Nevertheless, some of the formalisms whose contexts are implicitly or incompletely represented in productions, called implicit context-sensitive graph grammars, suffer inherent weakness in intuitiveness and limitations in parsing algorithms. To address these issues, this paper formally presents a notion of context on the underlying concepts of partial and total precedence relations, characterizes their fundamental properties, and establishes a connection between contexts and their instances (also called context graphs elsewhere), based on the Reserved Graph Grammar formalism, a representative of implicit graph grammars. Moreover, three typical applications of contexts are illustrated, which show that contexts can both facilitate the comprehension and design of implicit graph grammars so as to enhance their intuitiveness, and make the existent efficient parsing algorithms more widely applicable.
A Watson-Crick (WK) context-free grammar, a context-free grammar with productions whose right-hand sides contain nonterminals and double-stranded terminal strings, generates complete double-stranded strings under Watson-Crick complementarity. In this paper, we investigate the simplification processes of Watson-Crick context-free grammars, which lead to defining a Chomsky-like normal form for Watson-Crick context-free grammars. The main result of the paper is a modified CYK (Cocke-Younger-Kasami) algorithm for Watson-Crick context-free grammars in WK-Chomsky normal form, which parses double-stranded strings in O(n^6) time.
The Flexible parsing algorithm, a two-steps-greedy parsing algorithm for text factorisation, has been proved to be an optimal parsing for LZ78-like compressors in the case of constant cost phrases [1,2]. Whilst in early implementations of LZ78-like compressors the phrases have constant cost, in common modern implementations the cost of the k-th phrase is [log_2 k + C], where C is a real constant [3,4]. Indeed, we show examples where Flexible parsing is not optimal under this more realistic setting. In this paper we prove that, under the assumption that the cost of a phrase is block-wise constant and non-decreasing, the Flexible parsing is almost optimal. By almost optimal we mean that, for any text T, the difference between the sizes of the compressed text obtained by using a Flexible parsing and an optimal parsing is bounded by the maximal cost of a phrase in T, i.e. it is logarithmic in practical cases. Furthermore, we investigate how an optimal parsing, and hence an almost optimal parsing, affects the rate of convergence to the entropy of LZ78-like compressors. We discuss some experimental results considering the ratio between the speed of convergence to the entropy of compressors with and without an optimal parsing. This ratio presents a kind of wave effect that increases as the entropy of a memoryless source decreases, but it seems always to converge slowly to one. According to the theory, this wave can be a tsunami for some families of highly compressible strings and, although the optimal (and the almost optimal) parsing does not improve the asymptotic speed of convergence to the entropy, it can improve the compression ratio, and hence the decoding speed, in many practical cases. (C) 2017 Elsevier B.V. All rights reserved.
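As a point of comparison for the cost model discussed above, here is a minimal sketch of the plain greedy LZ78 factorisation together with a phrase-cost function. It is the classic one-step greedy parsing, not the Flexible (two-steps-greedy) parsing studied in the paper. The function names, the choice of rounding up log_2 k, and the default constant `c=8` are assumptions for illustration only; the abstract does not fix them.

```python
import math

def lz78_greedy(text):
    """Greedy LZ78 factorisation into (dictionary index, next symbol) pairs."""
    dictionary = {"": 0}   # phrase -> index; index 0 is the empty phrase
    phrases = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            current += ch  # keep extending the longest known phrase
        else:
            phrases.append((dictionary[current], ch))
            dictionary[current + ch] = len(dictionary)
            current = ""
    if current:
        # Input ended inside a known phrase: emit it with no extra symbol.
        phrases.append((dictionary[current], ""))
    return phrases

def total_cost(num_phrases, c=8):
    """Total cost when the k-th phrase costs ceil(log2 k) + c (assumed model)."""
    return sum(math.ceil(math.log2(k)) + c for k in range(1, num_phrases + 1))

phrases = lz78_greedy("abababa")
cost = total_cost(len(phrases))
```

Under this cost model later phrases are more expensive than earlier ones, which is why a parsing that minimises the number of phrases under constant costs need not minimise the total cost.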
The unified scheduling language is a Chinese domain-specific language with both programming and natural language features for space mission scheduling. To create a functional yet simple language for all space-mission-related staff, a hybrid framework for defining and interpreting the language is proposed. This framework combines script programming languages and natural language processing technologies, and it creates a language that is easy to learn yet still powerful enough to fulfill the automation and extension requirements of mission scheduling. A coordinated natural language processing approach is designed to parse human-oriented languages, and a general-purpose script engine is integrated for processing the script aspect of the unified scheduling language. The translating mechanism maintains logical consistency between the two parsers. Related logic models and algorithms are introduced to illustrate the parsing mechanism. This framework achieves sound practical results in the unified scheduling language, in that a valid domain-specific language system is built and parsed correctly. The framework can be generalized to other Chinese domain-specific languages in various application fields.
This paper outlines the development of a new open-source plugin for collaboration workflow within the parametric design paradigm. Processes of informational exchange support the integration approaching workflows for processing performative design between engineers and architects.
NASA Technical Reports Server (Ntrs) 20070027745: Blurring the Inputs: a Natural Language Approach to Sensitivity Analysis by NASA Technical Reports Server (Ntrs); published by