Latte 3: an impressive leap

about a month ago by David Grudl  

The fanfare resounds through the hall and Latte 3 comes on the scene, with a completely rewritten compiler. The new version is the biggest evolutionary leap Latte has ever made.

Why Latte?

Latte has a funny history. It wasn't originally meant to be serious; it was meant to demonstrate that PHP needs no templating system. The twist came with the idea that a templating system could understand an HTML page. To be clear, for other templating systems, the text around the {{...}} tags is just noise without any meaning. Whether it is an HTML page, a CSS style sheet, or Markdown text, the templating engine sees only a clump of characters. Latte, on the other hand, understands the document, which brings major advantages, from conveniences such as n:attributes to ultimate security.

So Latte knows what escaping function to use (which most programmers don't know, but thanks to Latte it doesn't matter and they don't create a cross-site scripting security hole). It prevents outputting a string that would be unsafe at a certain point. It can even prevent misinterpretation of mustache brackets by the frontend framework. …And security experts will have nothing to eat 🙂
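To make the idea of context-dependent escaping concrete, here is a minimal sketch in Python. It is not Latte's implementation: the function names and the three contexts are hypothetical, the JavaScript escaper is deliberately simplified, and the context is passed in by hand, whereas Latte works it out itself by understanding the document.

```python
import html
import json


def escape_html_text(value: str) -> str:
    # Text between tags: only &, < and > need escaping here.
    return html.escape(value, quote=False)


def escape_html_attr(value: str) -> str:
    # Inside a quoted attribute the quote characters must be escaped as well.
    return html.escape(value, quote=True)


def escape_js(value) -> str:
    # Inside <script>: emit a JSON literal and break up "</" so the value
    # cannot terminate the surrounding script element. (Simplified.)
    return json.dumps(value).replace("</", "<\\/")


# Hypothetical context names; a document-aware engine would pick one itself.
ESCAPERS = {
    "text": escape_html_text,
    "attr": escape_html_attr,
    "js": escape_js,
}


def render(value, context: str) -> str:
    """Escape a value with the function that matches the output context."""
    return ESCAPERS[context](value)
```

The same input needs a different treatment in each place, which is why a single escaping function applied everywhere is not enough.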

I would not have expected this idea to put Latte at least 10 years ahead of other systems, yet to date only Latte and Google's Soy work this way. Latte and Soy are the only truly secure templating systems for the web. (Although Soy has only the escaping feature mentioned above.)

Latte's other key feature is that it uses PHP for expressions inside tags. That syntax is already familiar to the programmer, so developers don't have to learn yet another language or look up how to write this or that in Latte. They just write it the way they already know. The popular Twig templating system, on the other hand, uses a Python-like syntax, where even very basic constructs are written differently. For example, foreach ($people as $person) is written as for person in people in Python (and thus in Twig), which forces the brain to switch between two opposite conventions quite unnecessarily.

Latte steamrolls the competition with its features. And that's why it deserved a new compiler.

The original compiler

Latte and its syntax were created 14 years ago (2008), and the compiler used until now followed three years later. It already supported everything still in use today, including blocks, inheritance, snippets, etc.

The compiler was single-pass, which means that it parsed the template and directly transformed it into the resulting PHP code, which was then cached. The PHP language used in the tags was tokenized and then passed through several processes that modified the token stream. For example, they added syntactic tweaks that PHP didn't know then, or still doesn't know, and that are specific to Latte (the shortened ternary operator, filters such as ($var|upper|truncate), etc.).
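The kind of token-stream rewriting described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the tokenizer is a toy, the filter_ prefix on the generated calls is hypothetical and not what Latte emits, and filter arguments are not handled.

```python
import re

# Toy tokenizer: variables, bare names and the pipe character only.
TOKEN = re.compile(r"\$\w+|\w+|\|")


def tokenize(expr: str) -> list[str]:
    return TOKEN.findall(expr)


def rewrite_filters(tokens: list[str]) -> list[str]:
    """Rewrite `subject|f1|f2` into the token stream for `f2(f1(subject))`."""
    if "|" not in tokens:
        return tokens
    first_pipe = tokens.index("|")
    result = tokens[:first_pipe]  # the expression being filtered
    filter_names = [t for t in tokens[first_pipe:] if t != "|"]
    for name in filter_names:
        # Wrap what we have so far in a call to the next filter.
        # "filter_" is a made-up prefix used only for this sketch.
        result = [f"filter_{name}", "("] + result + [")"]
    return result


compiled = "".join(rewrite_filters(tokenize("$var|upper|truncate")))
```

Here compiled ends up as filter_truncate(filter_upper($var)). Working on a flat list of tokens like this is simple, but every such pass has to rediscover the structure of the expression on its own.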

Over those eleven years of Latte's development, there have been situations where a single-pass compiler was not sufficient. The ideal would be to move to two-step compilation, i.e., first parse the template into an intermediate representation, an abstract syntax tree (AST), and only then generate the class code from it.

Also, as the PHP-like language used in tags was gradually enhanced, its representation as tokens was no longer sufficient, and it would be ideal to parse it into an AST as well. For example, a sandbox built on top of an AST can better guarantee that it will be truly bulletproof.
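A two-step compilation with a sandbox check in between can be sketched roughly as follows. This is a toy in Python, not Latte's design: the node classes, the single supported tag form {$name}, and the generated lambda are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Text:
    content: str


@dataclass
class Print:
    variable: str


def parse(template: str) -> list:
    """Step 1: turn the template into a flat list of AST nodes."""
    nodes, pos = [], 0
    for m in re.finditer(r"\{\$(\w+)\}", template):
        if m.start() > pos:
            nodes.append(Text(template[pos:m.start()]))
        nodes.append(Print(m.group(1)))
        pos = m.end()
    if pos < len(template):
        nodes.append(Text(template[pos:]))
    return nodes


def check_sandbox(nodes: list, allowed: set) -> None:
    """Optional pass over the AST: reject variables that are not allowed."""
    for node in nodes:
        if isinstance(node, Print) and node.variable not in allowed:
            raise PermissionError(f"variable ${node.variable} is not allowed")


def generate(nodes: list) -> str:
    """Step 2: generate source code (here a Python lambda) from the AST."""
    parts = []
    for node in nodes:
        if isinstance(node, Text):
            parts.append(repr(node.content))
        else:
            parts.append(f"escape(params[{node.variable!r}])")
    return "lambda params, escape: ''.join([" + ", ".join(parts) + "])"
```

Because the sandbox inspects nodes rather than raw tokens, it sees exactly what the generator will later emit, which is the reason an AST makes such guarantees easier to give.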

The new compiler is rocket science 🙂

It took me five years to get around to rewriting the compiler because I knew it would be extremely difficult. Just tokenizing the template is a challenge, as it has to run in parallel with parsing. This is because the parser needs to be able to influence the tokenization when, for example, it encounters the n:syntax=off attribute.

Fibers, introduced in PHP 8.1, make it possible to run two pieces of code side by side, but Latte does not use them yet so that it can still run on PHP 8.0. Instead, it uses similar coroutines (you won't find anything about them in the PHP documentation, so at least there's an awe-inspiring presentation). So there is some magic going on under the hood of Latte.
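The interplay between lexer and parser can be sketched with Python generators, which are used here as a stand-in for the coroutines mentioned above. This is not Latte's code: the token shapes are invented, and a {syntax off} tag is used as a simplified trigger in place of the n:syntax=off attribute.

```python
import re

TAG = re.compile(r"\{([^{}]*)\}")


def lexer(source: str):
    """Yield (kind, value) tokens lazily.

    The consumer may send "off" back into the generator to disable tag
    recognition for the rest of the input.
    """
    pos = 0
    tags_enabled = True
    while pos < len(source):
        m = TAG.search(source, pos) if tags_enabled else None
        if m is None:
            command = yield ("text", source[pos:])
            pos = len(source)
        elif m.start() > pos:
            command = yield ("text", source[pos:m.start()])
            pos = m.start()
        else:
            command = yield ("tag", m.group(1))
            pos = m.end()
        if command == "off":
            tags_enabled = False


def parse(source: str) -> list:
    """Consume tokens one by one and steer the lexer while doing so."""
    tokens = []
    lex = lexer(source)
    try:
        token = next(lex)
        while True:
            tokens.append(token)
            command = "off" if token == ("tag", "syntax off") else None
            token = lex.send(command)  # resume the lexer, optionally with a command
    except StopIteration:
        pass
    return tokens
```

For the input a {$x} {syntax off} b {$y}, the trailing {$y} comes out as plain text, because the parser switched the lexer off before that part was tokenized. A lexer that tokenized the whole template up front could not be influenced this way.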

I expected it to be even more challenging to write a lexer and parser for a language as complex as the PHP dialect used in tags. Essentially, this meant creating something like nikic/PHP-Parser for Latte. It also meant formalizing the grammar of that language.

After a few months of hard work, it was done! I consider the new compiler a dream come true. Latte is entering a new era.

See how to migrate to Latte 3
