ISBN: 9781538635865 (print)
This paper proposes a method for speeding up the parsing process in the recognition of online handwritten mathematical expressions (OHME). We prune infeasible partitions in the parsing table to reduce parsing time; low-score partitions are the candidates for pruning. The method can be applied to any parsing algorithm that uses a score function. In this paper, we use a stroke-order-free system as the baseline. The method is as follows. First, we analyze the scores of the partitions in each row of the parsing table. Then, we determine a threshold for each row. Finally, we apply these thresholds to prune low-score partitions in the baseline recognition system. Evaluations on the CROHME 2014 database show that the recognition process is sped up by factors of 3.46 and 4.97, while the recognition rate drops by only 0.31 and 0.71 points, respectively.
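The three steps in the abstract above (analyze per-row scores, fix a per-row threshold, prune) can be illustrated with a minimal Python sketch. The abstract does not say how the thresholds are chosen or how the parsing table is stored, so everything here is an assumption made for illustration: the table layout, the names `row_thresholds` and `prune`, and the `keep_fraction` rule that keeps roughly the top share of observed scores in each row.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical representation: a partition is (label, score), and the
# parsing table maps a row index to that row's candidate partitions.
Partition = Tuple[str, float]
ParsingTable = Dict[int, List[Partition]]


def row_thresholds(observed_scores: Dict[int, List[float]],
                   keep_fraction: float = 0.5) -> Dict[int, float]:
    """Pick one score threshold per row from previously observed scores.

    Assumed rule (not from the paper): the threshold is the lowest score
    among the top `keep_fraction` of the scores observed in that row.
    """
    thresholds: Dict[int, float] = {}
    for row, scores in observed_scores.items():
        if not scores:
            continue  # no evidence for this row, so no threshold
        ranked = sorted(scores, reverse=True)
        cutoff_index = max(0, math.ceil(keep_fraction * len(ranked)) - 1)
        thresholds[row] = ranked[cutoff_index]
    return thresholds


def prune(table: ParsingTable, thresholds: Dict[int, float]) -> ParsingTable:
    """Drop partitions scoring below their row's threshold.

    Rows without a threshold are left unpruned.
    """
    pruned: ParsingTable = {}
    for row, partitions in table.items():
        threshold = thresholds.get(row, float("-inf"))
        pruned[row] = [p for p in partitions if p[1] >= threshold]
    return pruned


# Toy data, invented purely to exercise the functions.
observed = {1: [0.9, 0.8, 0.1, 0.2], 2: [0.5, 0.4]}
table = {
    1: [("a", 0.95), ("b", 0.3)],
    2: [("c", 0.5), ("d", 0.45)],
    3: [("e", 0.01)],
}
thresholds = row_thresholds(observed, keep_fraction=0.5)
pruned_table = prune(table, thresholds)
```

A lower `keep_fraction` prunes more aggressively. In this sketch that single parameter is what would trade speed against recognition rate, the trade-off the reported results describe.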
ISBN: 9781450376266 (print)
Context-sensitive graph grammars are suitable formalisms for specifying visual programming languages, as they are intuitive, sufficiently expressive, and equipped with parsing mechanisms. Parsing has been a fundamental issue in research on context-sensitive graph grammars. However, the existing parsing algorithms are either inefficient or confined to a minority of graph grammars. This paper presents two strategies for general parsing algorithms: context matching and partitioning of productions. By narrowing the search space of potential redexes, the two strategies can considerably improve parsing performance.
In this paper, we propose a feature-based Korean grammar that uses learned constraint rules to improve parsing efficiency. The proposed grammar consists of feature structures, feature operations, and constraint rules, and it has the following characteristics. First, a feature structure includes several features that express linguistic information useful for Korean parsing. Second, a feature operation, which generates a new feature structure, is restricted to a binary-branching form that can handle properties of Korean such as variable word order and constituent ellipsis. Third, constraint rules improve efficiency by preventing feature operations from generating spurious feature structures. Moreover, these rules are learned from a Korean treebank by a decision tree learning algorithm. The experimental results show that the feature-based Korean grammar can reduce the number of candidates by at most a third, and that it runs 1.5 to 2 times faster than a CFG on a statistical parser.
A graph grammar is a formal tool that provides rigorous but intuitive ways to define visual languages. Based on an existing graph grammar, this paper proposes a new context-sensitive graph grammar formalism called the Extension of Edge-based Graph Grammar, or E-EGG. The E-EGG introduces new mechanisms into grammatical specifications, productions, operations and so on in order to conveniently handle the bidirectional transformation between the Business Process Modeling Notation (BPMN) and the Business Process Execution Language (BPEL). In addition to formal definitions of the E-EGG, the paper presents steps and algorithms for achieving the bidirectional transformation and for checking the structural correctness of BPMN models. Finally, a case study on the transformation from BPMN models to BPEL code shows how the parsing algorithm of the E-EGG works. (C) 2016 Elsevier Ltd. All rights reserved.
Background: Searching for members of characterized ncRNA families containing pseudoknots is an important component of genome-scale ncRNA annotation. However, state-of-the-art search for known ncRNAs is based on context-free grammar (CFG), which cannot effectively model pseudoknots. Thus, existing CFG-based ncRNA identification tools usually ignore pseudoknots during search. As a result, these tools report dozens of sequences that do not contain the native pseudoknots. When pseudoknot structures are vital to the functions of the ncRNAs, these sequences may not be true members. Results: In this work, we design a pseudoknot search tool using multiple simple sub-structures, which are derived from knot-free and bifurcation-free structural motifs in the underlying family. We test our tool on a contiguous 22-Mb region of the maize genome. The experimental results show that our work competes favorably with other pseudoknot search methods. Conclusions: Our sub-structure-based tool can conduct genome-scale search for pseudoknot-containing ncRNAs effectively and efficiently. It provides a pseudoknot search tool complementary to Infernal. The source code is available at http://***/~chengy/knotsearch.
This study proposes a practical method for constructing a more compact matrix structure for the precedence information used in a new weak precedence parsing algorithm. The parsing algorithm differs from the conventional weak precedence algorithm. The method can be used for any weak precedence grammar without degrading the good error detection capability of traditional weak precedence parsers. The empirical results demonstrate that the resulting matrices are of a very reasonable size and that the presented parsing algorithm is quite efficient.
As a useful formal tool, graph grammars provide a rigorous but intuitive way to specify visual languages. Building on the existing Edge-based Graph Grammar (EGG), this paper proposes a new context-sensitive graph grammar formalism called the Temporal Edge-based Graph Grammar, or TEGG. TEGG introduces temporal mechanisms into grammatical specifications, productions, operations and so on in order to tackle time-related issues. In the paper, formal definitions of TEGG are provided first. Then, a new parsing algorithm with a decidability proof is proposed to check the correctness of a given graph's structure, to analyze the timing of operations when needed, and to enable computer simulation of the temporal sequence in the graph. Next, the complexity of the parsing algorithm is analyzed. Finally, a case study on an application with temporal requirements shows how the parsing algorithm of TEGG works.
This paper describes two contributions: an object-oriented lexical representation language based on Unification Categorial Grammar (UCG) that encodes linguistic and semantic information uniformly as classes and objects, and an efficient bottom-up parsing method for UCG that uses the selection sets technique. The lexical representation language, implemented in the logic and object-oriented programming language LIFE, introduces several new information-sharing mechanisms to enable natural, declarative, modular and economical construction of large and complex computational lexicons. The selection sets are deduced from a transformation between UCG and Context-Free Grammar (CFG) and are used to reduce the search space for the table-driven algorithm. Experimental tests on a spoken English corpus show that the hierarchical lexicon achieves a dramatic reduction in redundant information and that selection sets significantly improve UCG parsing, which runs with polynomial time complexity.
ISBN: 9781538649916 (print)
Graph grammars are a rigorous but intuitive way to define and handle graph languages. To tackle time-related issues, this paper proposes a new temporal mechanism that extends the existing Edge-based Graph Grammar (EGG), covering grammatical specifications, productions, operations and so on. In the paper, formal definitions of the temporal mechanism are provided first. Then, a new parsing algorithm is presented to check the correctness of a given graph's structure and to analyze the timing of operations when needed.
ISBN: 9783540851097 (print)
In this paper, we present a Machine Translation (MT) system from English to Indonesian that applies the Link Grammar (LG) formalism. The Annotated Disjunct (ADJ) technique available in the LG formalism is used to map English sentences into equivalent Indonesian sentences. ADJ is a promising technique for target languages, such as Indonesian, that lack an available grammar formalism, parser, and corpus. An experimental evaluation shows that applying LG to Indonesian worked as expected. We also discuss some significant issues to be considered in future development.