Context-sensitive graph grammars have been intuitive and rigorous formalisms for specifying visual programming languages, as they are sufficient expressive and equipped with parsing mechanisms. parsing has been a fund...
详细信息
Context-sensitive graph grammars have been intuitive and rigorous formalisms for specifying visual programming languages, as they are sufficient expressive and equipped with parsing mechanisms. parsing has been a fundamental issue in the research of context-sensitive graph grammars. However, the existent parsing algorithms are either inefficient or confined to a minority of graph grammars. This paper proposes a general parsing algorithm with two embedded strategies, context matching and production-set partitioning. The two strategies can greatly narrow down the search space of redexes and thus significantly improve parsing efficiency, even though the worst-case time complexity is not theoretically reduced. Moreover, a detailed case study and an experiment are provided accordingly to demonstrate the paring process and performance of the proposed algorithm.
We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive p...
详细信息
We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar;b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar;c) most likely (Viterbi) parse of the string;d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. Probabilities (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. If can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs.
This article aims to advance semantic analysis models, particularly in visualization, by proposing a novel semantic representation method utilizing the semantic Meta Network (MNet). MNet is a complex framework compris...
详细信息
This article aims to advance semantic analysis models, particularly in visualization, by proposing a novel semantic representation method utilizing the semantic Meta Network (MNet). MNet is a complex framework comprising semantic elements, internal and external relationships, and feature attributes, defined hierarchically through recursive processes, aiming to depict the comprehensive semantic space from phrase-level components to complete texts. The methodology involves the development of a general construction algorithm for MNet, encompassing meta relationships, tree structures, and network structures, and a parsing method for specific semantic analysis problems, including a bottom-up specification-based MNet semantic dependency tree construction algorithm and a network construction algorithm tailored for natural language interface parsing. Empirical experiments confirm the effectiveness of these algorithms, particularly in parsing natural language control interface instructions in Supervisory Control and Data Acquisition (SCADA) systems, bridging specific semantic analysis problems with the general construction and parsing processes of MNet, accounting for internal semantics concerning language unit structures and foreign language meanings in the linguistic context, thereby contributing significantly to the field of natural language semantic analysis.
In a companion paper [P.R.J. Asveld, Fuzzy context-free languages-Part 1: Generalized fuzzy context-free grammars, Theoret. Comput. Sci. (2005)] we used fuzzy context-free grammars in order to model grammatical errors...
详细信息
In a companion paper [P.R.J. Asveld, Fuzzy context-free languages-Part 1: Generalized fuzzy context-free grammars, Theoret. Comput. Sci. (2005)] we used fuzzy context-free grammars in order to model grammatical errors resulting in erroneous inputs for robust recognizing and parsing algorithms for fuzzy context-free languages. In particular, this approach enables us to distinguish between small errors ("tiny mistakes") and big errors ("capital blunders"). In this paper, we present some algorithms to recognize fuzzy context-free languages: particularly, a modification of Cocke-Younger-Kasami's algorithm and some recursive descent algorithms. Then we extend these recognition algorithms to corresponding parsing algorithms for fuzzy context-free languages. These parsing algorithms happen to be robust in some very elementary sense. (c) 2005 Elsevier B.V. All rights reserved.
A parsing algorithm visualizer is a tool that visualizes the construction of a parser for a given context-free grammar and then illustrates the use of that parser to parse a given string. parsing algorithm visualizers...
详细信息
A parsing algorithm visualizer is a tool that visualizes the construction of a parser for a given context-free grammar and then illustrates the use of that parser to parse a given string. parsing algorithm visualizers are used to teach the course on compiler construction which in invariably included in all undergraduate computer science curricula. This paper presents a new parsing algorithm visualizer that can visualize six parsing algorithms, viz. predictive parsing, simple LR parsing, canonical LR parsing, look-ahead LR parsing, Earley parsing and CYK parsing. The tool logically explains the process of parsing showing the calculations involved in each step. The output of the tool has been structured to maximize the learning outcomes and contains important constructs like FIRST and FOLLOW sets, item sets, parsing table, parse tree and leftmost or rightmost derivation depending on the algorithm being visualized. The tool has been used to teach the course on compiler construction at both undergraduate and graduate levels. An overall positive feedback was received from the students with 89% of them saying that the tool helped them in understanding the parsing algorithms. The tool is capable of visualizing multiple parsing algorithms and 88% students used it to compare the algorithms.
We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with a variety of parallel corpus analysis appli...
详细信息
We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with a variety of parallel corpus analysis applications. Aside from the bilingual orientation, three major features distinguish the formalism from the finite-state transducers more traditionally found in computational linguistics: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ordering flexibility, and its probabilistic formulation admits an efficient maximum-likelihood bilingual pausing algorithm. A convenient normal form is shown to exist. Analysis of the formalism's expressiveness suggests that it is particularly well suited to modeling ordering shifts between languages, balancing needed flexibility against complexity constraints. We discuss a number of examples of how stochastic inversion transduction grammars bring bilingual constraints to bear upon problematic corpus analysis tasks such as segmentation, bracketing, phrasal alignment, and parsing.
Plex structures specified by plex grammars are sets of symbols which interconnect in multidirection. We propose a new parsing scheme for plex grammars, which consists of two phases for symbols and their interconnectio...
详细信息
Plex structures specified by plex grammars are sets of symbols which interconnect in multidirection. We propose a new parsing scheme for plex grammars, which consists of two phases for symbols and their interconnections respectively in an input plex structure. The parsing of plex structures is simplified by our parsing scheme, because we are required to parse only a set of symbols instead of a plex structure. algorithms for the first phase are primarily discussed in this paper. Earley"s algorithm is extended to recognize a set of symbols. It is anticipated that the two-phase scheme of parsing may suggest a method for the parsing of other high-dimensional pattern grammars.
In this paper, we will investigate parsing algorithms for QTAGs and their extension called multifoot QTAGs (MFQTAGs). QTAG is a kind of tree adjoining grammars which generates the set of quadtrees. The complexity of o...
详细信息
In this paper, we will investigate parsing algorithms for QTAGs and their extension called multifoot QTAGs (MFQTAGs). QTAG is a kind of tree adjoining grammars which generates the set of quadtrees. The complexity of our parsing algorithms are O(N-2) by making use of good properties of quadtrees. In this case, the variable N is the diameter of the input image instead of the number of pixels. In other words, parsing for the languages of QTAGs has the linear time complexity. (C) 1999 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
Background: The Cancer Genome Anatomy Project (CGAP) xProfiler and cDNA Digital Gene Expression Displayer (DGED) have been made available to the scientific community over a decade ago and since then were used widely t...
详细信息
Background: The Cancer Genome Anatomy Project (CGAP) xProfiler and cDNA Digital Gene Expression Displayer (DGED) have been made available to the scientific community over a decade ago and since then were used widely to find genes which are differentially expressed between cancer and normal tissues. The tissue types are usually chosen according to the ontology hierarchy developed by NCBI. The xProfiler uses an internally available flat file database to determine the presence or absence of genes in the chosen libraries, while cDNA DGED uses the publicly available UniGene Expression and Gene relational databases to count the sequences found for each gene in the presented libraries. Results: We discovered that the CGAP approach often includes libraries from dependent or irrelevant tissues (one third of libraries were incorrect on average, with some tissue searches no correct libraries being selected at all). We also discovered that the CGAP approach reported genes from outside the selected libraries and may omit genes found within the libraries. Other errors include the incorrect estimation of the significance values and inaccurate settings for the library size cut-off values. We advocated a revised approach to finding libraries associated with tissues. In doing so, libraries from dependent or irrelevant tissues do not get included in the final library pool. We also revised the method for determining the presence or absence of a gene by searching the UniGene relational database, revised calculation of statistical significance and sorted the library cut-off filter. Conclusion: Our results justify re-evaluation of all previously reported results where NCBI CGAP expression data and tools were used.
An edNLC-graph grammar, introduced by Janssens,(4) is a strong formalism for generating scene representations. This grammar generates directed node- and edge-labelled graphs, EDG-graphs. A method of construction of un...
详细信息
An edNLC-graph grammar, introduced by Janssens,(4) is a strong formalism for generating scene representations. This grammar generates directed node- and edge-labelled graphs, EDG-graphs. A method of construction of unambiguous string EDG-graph representation is briefly described. The characteristics of edNLC-graph grammar for syntactic pattern recognition allows us to construct the parsing algorithm. The deterministic top-down syntax analyzer is constructed for the subfamily of an edNLC-graph grammar, called an ETL/1-graph grammar. An ETL/1-graph grammar is parallel to a finite state string grammar. The notions introduced in the paper are useful for researches in less restricted edNLC-graph grammars, for example grammars analogical to context-free string grammars.
暂无评论