Software Engineering and Architecture Group (SEARCH) > CS > JBI > FWN > RUG

Name analysis by abstract interpretation of interpreters

An important part of developing a new language is 'name analysis': figuring out the definition site (e.g., a declaration) corresponding to the use of an identifier (e.g., a variable). The results of name analysis is used in compilation, but also to provide IDE support in the form of "jump to definition" editor services.

In a typical compiler pipe line the name analysis phase is separately implemented for a language. In this project the goal is to not have to write name analysis at all, but derive it from an interpreter. We'll start with an interpreter of the language, and then use abstract interpretation of that very interpeter to compute use-define relations.

Such an approach would be possible since the interpreter needs to perform look-ups of identifiers at runtime, for instance, in environments containing declarations of variables, classes or data types. The idea is thus to "run" the interpreter on some program, but with a "special" semantics. This special semantics does not compute the result of the program, but instead computes a representation of the use-define relation between the identifiers in the program.

The essence of the approach is to retarget the language the interpreter has been defined in, to obtain a static analysis result on the a program written in the language the interpreter is meant to execute. In this specific case, we'll use a (small) subset of the Rascal meta programming language as the "host" language that we'll retarget.

More abstractly, the goal can be expressed as follows:

eval-H(eval-L-in-H)(L-program, input): result of L-program
name-H(eval-L-in-H)(L_program): name analysis of L-program

An interpreter for a language L is implemented in host language H. Different interpretations of this language (eval-H for normal evaluation, and name-H for name analysis), evaluate programs in L to different kinds of results.

If successful, the language engineer would simply write an interpreter for her language, and get name analysis for free!

Steps:

  • Identify a small/minimal subset of Rascal as a vehicle for experimentation. This should include, e.g., basic pattern matching, conditionals, basic data types, AST data typess etc.
  • Define an interpreter for the language LM (see the paper on scope graphs) in the identified subset of Rascal
  • Define an alternative, abstract interpreter for the subset of Rascal (within Rascal!), to record name analysis information.
  • Evaluate by executing LM programs on the special semantics of the subset.
  • Perform a case-study using a different, more realistic language (e.g., Tiger, MiniML, or Caml Light).

Resources:

Contact: Tijs van der Storm.