Krzysztof Krawiec | Research / ProgramSynthesis

In program synthesis, the task is to automatically generate a computer program that meets a given set of requirements, provided either as a set of tests (examples), or constraints imposed on program input and output (a.k.a. contracts). I work with the generate-and-test approaches to program synthesis, of which the most common is nowadays genetic programming (GP), a bio-inspired methodology of program induction based on the metaheuristic of evolutionary computation.

I am particularly interested in:

The role of semantics in genetic programming

Together with Alberto Moraglio and Colin Johnson we co-authored this paper that introduced the new family of exact geometric semantic operators.

Using program behavior to guide synthesis process

Program execution leaves a trace, i.e., a list of states of execution environments (e.g., machine registers). Traces may reveal not only information about the actual characteristics of a given program, but also about its prospective performance. It is so because a program may feature a subprogram/procedure that is essential for solving a given synthesis task, but is not effective in that given program. By inspecting only program output, we would never know that a program contains such a piece of code. Traces, to the contrary, reveal that information. Behavioral programming is a new approach to generate-and-test program synthesis where execution traces are being exploited to make synthesis more efficient. In the papers prepared with Jerry Swan and Una-May O'Relly, we proposed Pattern-Guided Evolutionary Algorithm and its extension, where the latter one won us the best paper award at the GECCO'14 conference. In this paper, we provide bird's-eye view of this idea.

Using alternative measures (search drivers) to assess actual and potential program quality

Useful insight about program behavior can be gained by juxtaposing program traces obtained for different input data. In a nutshell, if a program should return the same value for two different inputs, the corresponding traces should meet at some point of execution. And reversibly: if for a given two inputs a program should respond with different outputs, the corresponding traces should never meet (because once they do, they cannot diverge anymore and are destined to end up with the same final output of entire program). Together with Armando Solar-Lezama, we proposed an extension of GP that takes these observations into account and guides program synthesis in a more thoughtful way.

Automatic exploitation of domain- and problem-specific information

The problems we approach in program synthesis are not any problems. Such a problem dwells in a specific domain, determined by a given programming language and semantics of instructions available in that language. This makes it possible to exploit the domain-specific information. This does not have to be hard and can be often automated. For an example, see this paper of using Groebner bases for synthesis of programs in the real-valued domain (so-called symbolic regression).

Modularity in program synthesis

Together with Bartosz Wieloch, we came up with the concept of functional semantics described in this paper.

Software

I'm currently in the process of making publicly available two software libraries related to the above research, both written in Scala:

FUEL, generic and minimalistic framework for evolutionary computation,
ScaPS, a library for stochastic program synthesis using genetic programming, built atop of ScEvo.

From Krzysztof Krawiec

Research: ProgramSynthesis