FlashFill++: Scaling Programming by Example by Cutting to the Chase

14Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.

Abstract

Programming-by-Examples (PBE) involves synthesizing an "intended program"from a small set of user-provided input-output examples. A key PBE strategy has been to restrict the search to a carefully designed small domain-specific language (DSL) with "effectively-invertible"(EI) operators at the top and "effectively-enumerable"(EE) operators at the bottom. This facilitates an effective combination of top-down synthesis strategy (which backpropagates outputs over various paths in the DSL using inverse functions) with a bottom-up synthesis strategy (which propagates inputs over various paths in the DSL). We address the problem of scaling synthesis to large DSLs with several non-EI/EE operators. This is motivated by the need to support a richer class of transformations and the need for readable code generation. We propose a novel solution strategy that relies on propagating fewer values and over fewer paths. Our first key idea is that of "cut functions"that prune the set of values being propagated by using knowledge of the sub-DSL on the other side. Cuts can be designed to preserve completeness of synthesis; however, DSL designers may use incomplete cuts to have finer control over the kind of programs synthesized. In either case, cuts make search feasible for non-EI/EE operators and efficient for deep DSLs. Our second key idea is that of "guarded DSLs"that allow a precedence on DSL operators, which dynamically controls exploration of various paths in the DSL. This makes search efficient over grammars with large fanouts without losing recall. It also makes ranking simpler yet more effective in learning an intended program from very few examples. Both cuts and precedence provide a mechanism to the DSL designer to restrict search to a reasonable, and possibly incomplete, space of programs. Using cuts and gDSLs, we have built FlashFill++, an industrial-strength PBE engine for performing rich string transformations, including datetime and number manipulations. The FlashFill++ gDSL is designed to enable readable code generation in different target languages including Excel's formula language, PowerFx, and Python. We show FlashFill++ is more expressive, more performant, and generates better quality code than comparable existing PBE systems. FlashFill++ is being deployed in several mass-market products ranging from spreadsheet software to notebooks and business intelligence applications, each with millions of users.

Cite

CITATION STYLE

APA

Cambronero, J., Gulwani, S., Le, V., Perelman, D., Radhakrishna, A., Simon, C., & Tiwari, A. (2023). FlashFill++: Scaling Programming by Example by Cutting to the Chase. Proceedings of the ACM on Programming Languages, 7, 952–981. https://doi.org/10.1145/3571226

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free