ISBN: (print) 9781450342612
This paper presents a new approach for synthesizing transformations on tree-structured data, such as Unix directories and XML documents. We consider a general abstraction for such data, called hierarchical data trees (HDTs) and present a novel example-driven synthesis algorithm for HDT transformations. Our central insight is to reduce the problem of synthesizing tree transformers to the synthesis of list transformations that are applied to the paths of the tree. The synthesis problem over lists is solved using a new algorithm that combines SMT solving and decision tree learning. We have implemented our technique in a system called HADES and show that HADES can automatically synthesize a variety of interesting transformations collected from online forums.
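The reduction from tree transformers to list transformations on paths can be illustrated with a minimal sketch. It is not the HADES algorithm and performs no synthesis: the nested-dict tree encoding and the `group_by_extension` path function are assumptions made purely for illustration, and a real HDT also carries data at its nodes.

```python
from typing import Callable, Dict, List, Tuple

Path = Tuple[str, ...]
Tree = Dict[str, "Tree"]  # assumed encoding: nested dicts, an empty dict marks a leaf


def tree_to_paths(tree: Tree, prefix: Path = ()) -> List[Path]:
    """Enumerate the root-to-leaf paths of a nested-dict tree."""
    paths: List[Path] = []
    for name, child in tree.items():
        here = prefix + (name,)
        if child:
            paths.extend(tree_to_paths(child, here))
        else:
            paths.append(here)
    return paths


def paths_to_tree(paths: List[Path]) -> Tree:
    """Rebuild a tree by merging paths that share a prefix."""
    tree: Tree = {}
    for path in paths:
        node = tree
        for name in path:
            node = node.setdefault(name, {})
    return tree


def transform_tree(tree: Tree, path_fn: Callable[[Path], Path]) -> Tree:
    """Apply a list transformation to every path, then reassemble the tree."""
    return paths_to_tree([path_fn(p) for p in tree_to_paths(tree)])


def group_by_extension(path: Path) -> Path:
    """Hypothetical path function: file the leaf under a folder named after its extension."""
    filename = path[-1]
    ext = filename.rsplit(".", 1)[-1] if "." in filename else "other"
    return (ext, filename)


source = {"docs": {"a.txt": {}, "b.pdf": {}}, "misc": {"c.txt": {}}}
result = transform_tree(source, group_by_extension)
```

Under this encoding an empty directory is indistinguishable from a file, which is one reason the sketch should be read as an illustration of the path-based view rather than a faithful model of HDTs.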
ISBN: (print) 9783319165011; 9783319165004
The problem of extracting knowledge from large volumes of unstructured textual information has become increasingly important. We consider the problem of extracting text slices that adhere to a syntactic pattern and propose an approach capable of generating the desired pattern automatically from a few annotated examples. Our approach is based on genetic programming and generates extraction patterns in the form of regular expressions that may be input to existing engines without any post-processing. A key feature of our proposal is its ability to discover automatically whether the extraction task can be solved by a single pattern or requires a set of multiple patterns. We obtain this property by means of a separate-and-conquer strategy: once a candidate pattern provides adequate performance on a subset of the examples, the pattern is inserted into the set of final solutions and the evolutionary search continues on a smaller set of examples including only those not yet solved adequately. Our proposal outperforms an earlier state-of-the-art approach on three challenging datasets.
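The separate-and-conquer loop can be sketched as follows. This is a simplified illustration, not the authors' system: the genetic programming search is replaced by a hypothetical `learn_pattern` that picks from a fixed candidate pool, examples are whole strings rather than annotated text slices, and "adequate performance" is reduced to "matches at least one remaining example".

```python
import re
from typing import List, Sequence


def coverage(pattern: str, examples: Sequence[str]) -> List[str]:
    """Return the examples that the pattern matches in full."""
    return [e for e in examples if re.fullmatch(pattern, e)]


def learn_pattern(examples: Sequence[str], candidates: Sequence[str]) -> str:
    """Stand-in for the evolutionary search: pick the candidate covering the most examples."""
    return max(candidates, key=lambda p: len(coverage(p, examples)))


def separate_and_conquer(examples: Sequence[str], candidates: Sequence[str]) -> List[str]:
    """Collect patterns one at a time, each learned on the examples not yet solved."""
    remaining = list(examples)
    solution: List[str] = []
    while remaining:
        best = learn_pattern(remaining, candidates)
        solved = coverage(best, remaining)
        if not solved:  # no candidate helps any more, so stop
            break
        solution.append(best)  # "separate": keep the pattern
        remaining = [e for e in remaining if e not in solved]  # "conquer" the rest
    return solution


# Hypothetical candidate pool and examples, chosen only for illustration.
CANDIDATES = [r"\d{4}-\d{2}-\d{2}", r"\d{2}/\d{2}/\d{4}", r"[a-z]+"]
EXAMPLES = ["2015-04-08", "2016-01-31", "08/04/2015"]
patterns = separate_and_conquer(EXAMPLES, CANDIDATES)
```

Here no single candidate covers both date formats, so the loop ends with two patterns; had one pattern covered every example, the result would contain that pattern alone.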
ISBN: (print) 9781450334686
We present a method for example-guided synthesis of functional programs over recursive data structures. Given a set of input-output examples, our method synthesizes a program in a functional language with higher-order combinators like map and fold. The synthesized program is guaranteed to be the simplest program in the language to fit the examples. Our approach combines three technical ideas: inductive generalization, deduction, and enumerative search. First, we generalize the input-output examples into hypotheses about the structure of the target program. For each hypothesis, we use deduction to infer new input/output examples for the missing subexpressions. This leads to a new subproblem where the goal is to synthesize expressions within each hypothesis. Since not every hypothesis can be realized into a program that fits the examples, we use a combination of best-first enumeration and deduction to search for a hypothesis that meets our needs. We have implemented our method in a tool called λ², and we evaluate this tool on a large set of synthesis problems involving lists, trees, and nested data structures. The experiments demonstrate the scalability and broad scope of λ². A highlight is the synthesis of a program believed to be the world's earliest functional pearl.
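The interplay of hypothesis, deduction, and enumeration can be sketched for a single combinator. The code below is an illustrative assumption, not the tool's implementation: it considers only the hypothesis `map f xs`, deduces examples for the missing function `f`, and then enumerates a small hypothetical candidate pool in a fixed order.

```python
from typing import Callable, Dict, List, Optional, Tuple

Example = Tuple[List[int], List[int]]


def deduce_map_examples(examples: List[Example]) -> Optional[Dict[int, int]]:
    """For the hypothesis `map f xs`, infer input/output examples for f.

    Returns None when the hypothesis is refuted: map preserves length and
    must send equal inputs to equal outputs.
    """
    inferred: Dict[int, int] = {}
    for xs, ys in examples:
        if len(xs) != len(ys):
            return None
        for x, y in zip(xs, ys):
            if inferred.setdefault(x, y) != y:
                return None
    return inferred


# Hypothetical candidate pool for f, enumerated in the listed order.
CANDIDATES: List[Tuple[str, Callable[[int], int]]] = [
    ("x + 1", lambda x: x + 1),
    ("x * 2", lambda x: x * 2),
    ("x * x", lambda x: x * x),
]


def synthesize_map(examples: List[Example]) -> Optional[str]:
    """Deduce examples for f, then enumerate candidates until one fits them all."""
    sub_examples = deduce_map_examples(examples)
    if sub_examples is None:
        return None
    for text, f in CANDIDATES:
        if all(f(x) == y for x, y in sub_examples.items()):
            return f"map (lambda x: {text}) xs"
    return None


program = synthesize_map([([1, 2, 3], [2, 4, 6]), ([5], [10])])
```

Deduction pays off in two ways here: it rejects the `map` hypothesis outright when list lengths differ, and it turns a search over list programs into a much smaller search over scalar functions.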
ISBN: (print) 9781450336758
Sketch-based synthesis, epitomized by the SKETCH tool, lets developers synthesize software starting from a partial program, also called a sketch or template. This paper presents JSKETCH, a tool that brings sketch-based synthesis to Java. JSKETCH's input is a partial Java program that may include holes, which are unknown constants; expression generators, which range over sets of expressions; and class generators, which are partial classes. JSKETCH then translates the synthesis problem into a SKETCH problem; this translation is complex because SKETCH is not object-oriented. Finally, JSKETCH synthesizes an executable Java program by interpreting the output of SKETCH.
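The idea of holes and expression generators can be illustrated with a toy example. This is a sketch under loud assumptions: it is written in Python rather than Java, it uses neither JSKETCH nor SKETCH syntax, and it fills the unknowns by brute-force enumeration, whereas SKETCH relies on a solver-based search. The names `HOLE_RANGE`, `GENERATOR`, `sketch`, and `spec` are hypothetical.

```python
import itertools
import operator
from typing import Callable, Optional, Tuple

HOLE_RANGE = range(0, 8)  # assumed finite domain for the unknown constant
GENERATOR = {"+": operator.add, "*": operator.mul, "-": operator.sub}  # expression choices


def sketch(x: int, op: Callable[[int, int], int], c: int) -> int:
    """Partial program `x <op> <constant>`, with both parts left open."""
    return op(x, c)


def spec(f: Callable[[int], int]) -> bool:
    """Assertions the completed program must satisfy."""
    return f(1) == 3 and f(4) == 12


def solve() -> Optional[Tuple[str, int]]:
    """Try every (operator, constant) pair and return the first that meets the spec."""
    for (name, op), c in itertools.product(GENERATOR.items(), HOLE_RANGE):
        if spec(lambda x: sketch(x, op, c)):
            return name, c
    return None


completion = solve()
```

The completed program is the sketch with the returned operator and constant substituted for the unknowns; only multiplication by 3 satisfies both assertions within the assumed domain.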
ISBN: (print) 9781479919345
Motivated by applications in automating repetitive file manipulations, we present a tool called StriSynth, which allows end-users to perform transformations over data using examples. Based on the provided examples, our tool automatically generates scripts for non-trivial file manipulations. Although the current focus of StriSynth is file manipulation, it implements a more general string transformation framework. This framework builds on and further extends the functionality of Flash Fill, a Microsoft Excel extension for string transformations. An accompanying video to this paper is available at the following website: http://***/kkDZphqIdFM.
ISBN: (print) 9781450334686
Despite decades of research on parsing, the construction of parsers remains a painstaking, manual process prone to subtle bugs and pitfalls. We present a programming-by-example framework called Parsify that is able to synthesize a parser from input/output examples. The user does not write a single line of code. To achieve this, Parsify provides: (a) an iterative algorithm for synthesizing and refining a grammar one example at a time, (b) an interface that provides immediate visual feedback in response to changes in the grammar being refined, and (c) a graphical mechanism for specifying example parse trees using only textual selections. We empirically demonstrate the viability of our approach by using Parsify to construct parsers for source code drawn from Verilog, SQL, Apache, and Tiger.
ISBN: (print) 9781509000258
To programmatically interact with the user interface of a web application, element locators are used to select and retrieve elements from the Document Object Model (DOM). Element locators are used in JavaScript code, Cascading Style Sheets, and test cases to interact with the runtime DOM of the webpage. Constructing these element locators is, however, challenging due to the dynamic nature of the DOM. We find that locators written by web developers can be quite complex and can involve selecting multiple DOM elements. We present an automated technique for synthesizing DOM element locators using examples provided interactively by the developer. The main insight in our approach is that the problem of synthesizing complex multi-element locators can be expressed as a constraint solving problem over the domain of valid DOM states in a web application. We implemented our synthesis technique in a tool called LED, which provides interactive drag-and-drop support inside the browser for selecting positive and negative examples. We find that LED supports at least 86% of the locators used in the JavaScript code of deployed web applications, and that the locators synthesized by LED have a recall of 98% and a precision of 63%. LED is fast, taking only 0.23 seconds on average to synthesize a locator.
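Synthesis from positive and negative examples, and the meaning of the reported precision and recall, can be illustrated with a toy model. This is not LED's algorithm: the flat element list, the `(tag, class)` locator form, and the exhaustive enumeration in place of a constraint solver are all assumptions made for the sake of a short example.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Locator = Tuple[Optional[str], Optional[str]]  # (tag, class); None acts as a wildcard

# Hypothetical flattened DOM: each element has an id, a tag, and one class.
ELEMENTS: List[Dict] = [
    {"id": 1, "tag": "li", "cls": "item"},
    {"id": 2, "tag": "li", "cls": "item"},
    {"id": 3, "tag": "li", "cls": "ad"},
    {"id": 4, "tag": "div", "cls": "item"},
]


def select(locator: Locator, elements: List[Dict]) -> Set[int]:
    """Return the ids of the elements matched by a (tag, class) locator."""
    tag, cls = locator
    return {
        e["id"]
        for e in elements
        if (tag is None or e["tag"] == tag) and (cls is None or e["cls"] == cls)
    }


def synthesize_locator(elements: List[Dict], positive: Set[int], negative: Set[int]) -> Optional[Locator]:
    """Return the first locator selecting every positive and no negative example."""
    tags = [None] + sorted({e["tag"] for e in elements})
    classes = [None] + sorted({e["cls"] for e in elements})
    for locator in product(tags, classes):
        chosen = select(locator, elements)
        if positive <= chosen and not (negative & chosen):
            return locator
    return None


def precision_recall(chosen: Set[int], intended: Set[int]) -> Tuple[float, float]:
    """Precision and recall of a non-empty selection against the intended set."""
    hits = len(chosen & intended)
    return hits / len(chosen), hits / len(intended)


locator = synthesize_locator(ELEMENTS, positive={1}, negative={3})
chosen = select(locator, ELEMENTS)
precision, recall = precision_recall(chosen, intended={1, 2})
```

With only one positive and one negative example, the synthesized locator is consistent with the examples yet still selects an element the developer did not intend, which is the kind of situation in which recall stays high while precision drops.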
ISBN: (print) 9781450334723
There is increasing interest in the development of techniques for automatic relation extraction from unstructured text. The biomedical domain, in particular, is a sector that may greatly benefit from those techniques due to the huge and ever-increasing number of scientific publications describing observed phenomena of potential clinical interest. In this paper, we consider the problem of automatically identifying sentences that contain interactions between genes and proteins, based solely on a dictionary of genes and proteins and a small set of sample sentences in natural language. We propose an evolutionary technique for learning a classifier that is capable of detecting the desired sentences within scientific publications with high accuracy. The key feature of our proposal, which is internally based on genetic programming, is the construction of a model of the relevant syntax patterns in terms of standard part-of-speech annotations. The model consists of a set of regular expressions that are learned automatically despite the large alphabet size involved. We assess our approach on two realistic datasets and obtain 74% accuracy, a value sufficiently high to be of practical interest and that is in line with significant baseline methods.
ISBN: (print) 9781509000258
Web applications are growing fast in popularity and complexity. One of the major problems faced by web developers is writing JavaScript code that can retrieve Document Object Model (DOM) tree elements and that is consistent across multiple DOM states. We attempt to solve this problem by automatically synthesizing JavaScript code that interacts with the DOM. We present an automated tool called LED that analyzes the DOM elements and synthesizes code to select them based on the DOM hierarchy as well as the nature of the task that the user wants to perform. LED provides interactive drag-and-drop support inside the browser for selecting positive and negative examples of DOM elements. We find that LED supports at least 86% of the locators used in the JavaScript code of deployed web applications, and that the locators synthesized by LED have a recall of 98% and a precision of 63%. LED is fast, taking only 0.23 seconds on average to synthesize a locator.