Deriving Syntax Highlighters from Context-Free Grammars
Language workbenches provide an integrated experience for developing software languages. Language users are supported by editor services such as syntax highlighting, jump-to-definition, and outlines. The goal of this project is to leverage existing editors to provide such services. More specifically, the project is about generating syntax highlighting support from context-free grammars.
Many editors (e.g., VS Code, TextMate, Sublime Text, Atom, ACE, CodeMirror) and highlighters (e.g., Highlight.js, GitHub) accept state-based language definitions for syntax highlighting. In the Rascal language workbench, however, coloring is derived from context-free grammars. How can we derive state-based highlighters from Rascal's context-free grammars?
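To make "state-based language definition" concrete, here is a minimal sketch in Python. The rule format (a table from state names to lists of regex, scope, and next-state triples) and the names `RULES` and `highlight` are hypothetical and chosen for illustration only; they are not the definition format of any of the editors listed above, although the formats those tools accept follow the same general idea.

```python
import re

# Hypothetical definition format (an assumption for this sketch):
# state name -> ordered list of (regex, scope, next_state).
# scope None means "emit nothing"; next_state None means "stay in this state".
RULES = {
    "root": [
        (r"\s+", None, None),
        (r"/\*", "comment", "comment"),
        (r'"', "string", "string"),
        (r"\b(?:if|else|while)\b", "keyword", None),
        (r"[A-Za-z_]\w*", "identifier", None),
        (r"\d+", "number", None),
    ],
    "comment": [
        (r"\*/", "comment", "root"),
        (r"[^*]+|\*", "comment", None),
    ],
    "string": [
        (r'"', "string", "root"),
        (r'\\.|[^"\\]+', "string", None),
    ],
}

def highlight(text, rules=RULES, start="root"):
    """Return a list of (lexeme, scope) pairs for text."""
    state, pos, out = start, 0, []
    while pos < len(text):
        for pattern, scope, next_state in rules[state]:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() > pos:
                if scope is not None:
                    out.append((m.group(), scope))
                pos = m.end()
                state = next_state or state
                break
        else:
            # No rule applies: emit one unstyled character and move on,
            # so the highlighter never gets stuck on unexpected input.
            out.append((text[pos], "text"))
            pos += 1
    return out

if __name__ == "__main__":
    for lexeme, scope in highlight('/* if */ if x "s"'):
        print(repr(lexeme), scope)
```

The only memory such a highlighter has is the current state name, so the word `if` is colored as a keyword in the `root` state but not inside a comment. This finite-state restriction is why a context-free grammar has to be approximated before it can be expressed in this style.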
As a starting point, you'll take the approach detailed in this paper: Mohri and Nederhof, "Regular approximation of context-free grammars through transformation", in Robustness in Language and Speech Technology, Springer, 2001, [pdf].
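As a rough illustration of the kind of transformation this paper describes, the sketch below rewrites the rules of one set M of mutually recursive nonterminals so that nonterminals from M only occur at the right end of a rule. The result generates a regular superset of the original language. This is a simplified reading of the approach, not a reference implementation: the grammar encoding (pairs of a left-hand side and a tuple of symbols), the function name `approximate`, and the convention of naming the fresh nonterminals `A'` are assumptions of this sketch, and computing the sets M (and skipping those that need no rewriting) is left out.

```python
def approximate(rules, M):
    """Rewrite the rules whose left-hand side is in M.

    rules: list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    M: one set of mutually recursive nonterminals (supplied by the caller).
    Returns a new rule list in which symbols from M occur only rule-finally.
    """
    def prime(nonterminal):
        # Fresh nonterminal A' for each A in M (naming is an assumption).
        return nonterminal + "'"

    # Every fresh nonterminal may derive the empty string.
    out = [(prime(a), ()) for a in sorted(M)]

    for lhs, rhs in rules:
        if lhs not in M:
            out.append((lhs, tuple(rhs)))  # rules outside M stay as they are
            continue
        current, chunk = lhs, []
        for symbol in rhs:
            chunk.append(symbol)
            if symbol in M:
                # Cut the rule right after each nonterminal from M ...
                out.append((current, tuple(chunk)))
                # ... and continue the remainder from that nonterminal's prime.
                current, chunk = prime(symbol), []
        # The tail of the rule "returns" to the prime of the original lhs.
        out.append((current, tuple(chunk) + (prime(lhs),)))
    return out

if __name__ == "__main__":
    # S -> a S b | (empty), which generates a^n b^n
    grammar = [("S", ("a", "S", "b")), ("S", ())]
    for lhs, rhs in approximate(grammar, {"S"}):
        print(lhs, "->", " ".join(rhs) or "(empty)")
```

For the grammar S → a S b | ε the sketch produces S → a S, S' → b S', S → S', and S' → ε, which describes a\*b\*. The approximation no longer checks that the number of a's and b's match, which is the kind of precision loss a generated highlighter has to tolerate.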
Resources on state-based language tokenizers:
Deliverables:
- A prototype for transforming grammars into highlighters (in Rascal).
- A precise description of the algorithm, including limitations and trade-offs.
- Evaluation of the prototype on several Rascal-defined languages, including Rascal itself.
Contact: Tijs van der Storm.