Quiz: Can you defend against XSS vulnerabilities?

2 years ago by David Grudl  

Put your security knowledge to the test with this quiz! Can you prevent an attacker from taking control of an HTML page?

In all tasks, you will address the same question: how to properly display the variable $str in an HTML page without creating an XSS vulnerability. The basis of defense is escaping, which means replacing characters that have a special meaning with the corresponding escape sequences. For example, when outputting a string to HTML text, where the character < has a special meaning (indicating the beginning of a tag), we replace it with the HTML entity &lt;, and the browser correctly displays the symbol <.

Be vigilant, as an XSS vulnerability is very serious. It can allow an attacker to take control of a page or even a user's account. Good luck, and may you succeed in keeping the HTML page secure!

The first trio of questions

Specify which characters need to be handled and how in the first, second, and third examples:

1) <p><?= $str ?></p>
2) <input value="<?= $str ?>">
3) <input value='<?= $str ?>'>

If the output were not escaped in any way, it would become part of the displayed page as is. If an attacker managed to insert the string 'foo" onclick="evilCode()' into the variable and the output was not escaped, their code would be executed when the element is clicked:

$str = 'foo" onclick="evilCode()'
❌ not treated: <input value="foo" onclick="evilCode()">
✅ treated:     <input value="foo&quot; onclick=&quot;evilCode()">

Solutions for each example:

  1. characters < and & represent the beginning of an HTML tag and entity; replace them with &lt; and &amp;
  2. characters " and & represent the end of an attribute value and the beginning of an HTML entity; replace them with &quot; and &amp;
  3. characters ' and & represent the end of an attribute value and the beginning of an HTML entity; replace them with &apos; and &amp;

You get a point for each correct answer. Of course, in all three cases, you can replace other characters with entities as well; it doesn't cause any harm, but it is not necessary.
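
In PHP, all three answers can be covered by one call to htmlspecialchars(). The following is a minimal sketch rather than a definitive implementation; the helper name escapeHtml is made up for illustration, and UTF-8 input is assumed:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: escapes a value for HTML text and for QUOTED attribute values.
function escapeHtml(string $s): string
{
    // ENT_QUOTES converts both " and '.
    // ENT_SUBSTITUTE replaces invalid UTF-8 with U+FFFD instead of returning ''.
    // ENT_HTML5 makes the single quote come out as &apos;.
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
}

$str = 'foo" onclick="evilCode()';
echo '<input value="' . escapeHtml($str) . '">', "\n";
// prints: <input value="foo&quot; onclick=&quot;evilCode()">
```

The function escapes more characters than each individual context strictly needs, which is harmless and lets one helper serve contexts 1 to 3.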

Question No. 4

Moving on, which characters need to be replaced when displaying a variable in this context?

<input value=<?= $str ?>>

Solution: As you can see, the quotes are missing here. The easiest approach is to simply add the quotes and then escape as in the previous question. There is also a second solution, which is to replace spaces and all characters that have special meaning inside a tag, such as >, /, =, and some others with HTML entities.
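
The second solution could be sketched in PHP as follows. This is only an illustration under stated assumptions: the helper name escapeUnquotedAttr is hypothetical, and instead of listing the dangerous characters one by one it conservatively turns every ASCII character that is not a letter or digit into a numeric entity:

```php
<?php
declare(strict_types=1);

// Hypothetical helper for an UNQUOTED attribute value.
// Every ASCII character that is not a letter or digit becomes a numeric entity,
// so spaces, >, /, =, quotes and backticks can no longer terminate the value.
// Bytes >= 0x80 (UTF-8 multibyte sequences) pass through untouched.
function escapeUnquotedAttr(string $s): string
{
    return preg_replace_callback(
        '/[\x00-\x2F\x3A-\x40\x5B-\x60\x7B-\x7F]/',
        static fn(array $m): string => '&#' . ord($m[0]) . ';',
        $s
    ) ?? '';
}

$str = 'foo onclick=evilCode()';
echo '<input value=' . escapeUnquotedAttr($str) . '>', "\n";
// prints: <input value=foo&#32;onclick&#61;evilCode&#40;&#41;>
```

Adding the quotes to the template remains the simpler and more readable fix.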

Question No. 5

Now it's getting more interesting. Which characters need to be treated in this context:

<script>
	let foo = '<?= $str ?>';
</script>

Solution: Inside the <script> tag, the escaping rules are determined by JavaScript. HTML entities are not used here, but there is one special rule. So which characters do we escape? Inside a JavaScript string, we naturally escape the ' character that delimits it by putting a backslash in front of it, which gives \'. The backslash itself must be escaped as well (as \\), otherwise a value ending in a backslash would swallow the closing quote. Since JavaScript doesn't support multi-line strings (except as template literals), we also need to escape newline characters. However, be aware that in addition to the usual \n and \r characters, JavaScript also considers the Unicode characters \u2028 and \u2029 as line terminators, which we must escape as well. Finally, the mentioned special rule: the string must not contain </script. This can be prevented, for example, by replacing it with <\/script.
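
A hedged PHP sketch of these rules for a single-quoted JavaScript string follows. The helper name escapeJsString is hypothetical. Besides the delimiter and the line terminators it also escapes the backslash itself (the escape sequences depend on it) and the double quote (unnecessary here, but harmless):

```php
<?php
declare(strict_types=1);

// Hypothetical helper: escapes a value placed between quotes in a <script> element.
function escapeJsString(string $s): string
{
    // strtr() with an array replaces all keys in a single pass and never
    // re-scans its own output, so the inserted backslashes are not doubled.
    return strtr($s, [
        '\\'       => '\\\\',    // backslash itself
        "'"        => "\\'",     // string delimiter
        '"'        => '\\"',     // not needed in '...' strings, but harmless
        "\n"       => '\\n',
        "\r"       => '\\r',
        "\u{2028}" => '\\u2028', // Unicode line separator
        "\u{2029}" => '\\u2029', // Unicode paragraph separator
        '</'       => '<\\/',    // prevents </script from closing the element
    ]);
}

$str = "it's\n</script>";
echo "let foo = '" . escapeJsString($str) . "';", "\n";
// prints: let foo = 'it\'s\n<\/script>';
```

In practice, json_encode() is a common alternative, because it generates the whole string literal including its quotes.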

If you knew all this, congratulations.

Question No. 6

The following context seems to be just a variation of the previous one. Do you think the treatment will be different?

<p onclick="foo('<?= $str ?>')"></p>

Solution: Again, the escaping rules for JavaScript strings apply here, but unlike the previous context where HTML entities were not used, here they are essential. So first, we escape the JavaScript string using backslashes and then replace the special characters (" and &) with HTML entities. Be careful, the correct order is important.
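
The two steps and their order might look like this in PHP. It is a sketch under assumptions: jsInAttribute is a made-up name, and json_encode() is used for the JavaScript step, so the helper emits the complete string literal including its own double quotes and the template must not add quotes around it:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: a JavaScript string literal that is safe inside
// a double-quoted HTML attribute such as onclick="...".
function jsInAttribute(string $s): string
{
    // Step 1: JavaScript escaping. json_encode() emits a double-quoted literal
    // and escapes quotes, backslashes, newlines and non-ASCII characters
    // (including U+2028 and U+2029).
    $js = json_encode($s, JSON_THROW_ON_ERROR);

    // Step 2: HTML escaping, so the " and & inside the literal
    // cannot terminate the attribute or start an entity.
    return htmlspecialchars($js, ENT_QUOTES, 'UTF-8');
}

$str = "'\"\n";
echo '<p onclick="foo(' . jsInAttribute($str) . ')"></p>', "\n";
// prints: <p onclick="foo(&quot;&#039;\&quot;\n&quot;)"></p>
```

Swapping the two steps would let the browser decode the entities back into raw quotes before JavaScript ever sees the string, which reopens the hole.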

As you can see, the same JavaScript literal may be encoded differently in a <script> element compared to when it appears in an attribute!

Question No. 7

Let's return from JavaScript back to HTML. Which characters do we need to replace inside the comment and how?

<!-- <?= $str ?> -->

Solution: Inside an HTML (and XML) comment, all traditional special characters, such as <, &, " and ', can appear without issues. What is forbidden, and this may surprise you, is the pair of characters --. Escaping this sequence is not specified in standards, so it is up to you how to replace it. You can intersperse them with spaces. Or, for example, replace them with ==.
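
One way to intersperse the hyphens with spaces in PHP is sketched below. The helper name escapeHtmlComment is hypothetical, and the sketch assumes the template keeps a space on both sides of the value, as the quiz example does. Note that a plain str_replace('--', '- -', ...) is not enough, because it turns --- into - --:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: makes a value safe inside <!-- ... -->.
// Assumes the template puts a space before and after the value.
function escapeHtmlComment(string $s): string
{
    // The lookahead matches every hyphen that is followed by another hyphen,
    // so overlapping runs such as "---" are fully separated.
    return preg_replace('/-(?=-)/', '- ', $s) ?? '';
}

$str = '-- ---';
echo '<!-- ' . escapeHtmlComment($str) . ' -->', "\n";
// prints: <!-- - - - - - -->
```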

Question No. 8

We are approaching the end, so let's try a different angle. Consider what you need to be careful about when outputting a variable in this context:

<a href="<?= $str ?>">...</a>

Solution: In addition to escaping, it is crucial to also verify that the URL does not contain a dangerous scheme like javascript:, because a URL composed in this way would execute the attacker's code when clicked.
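
A minimal allow-list check could look like this in PHP. The helper name safeUrl and the chosen set of schemes (http, https, mailto, tel) are assumptions for illustration; a real application may need a different list:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the URL if it looks harmless, otherwise ''.
function safeUrl(string $url): string
{
    // Accepted: an allowed scheme, a URL starting with /, ? or #,
    // or a relative URL that contains no colon at all.
    // Everything else, including "javascript:" and obfuscated variants
    // such as "java\tscript:", is rejected.
    $ok = preg_match('~^(?:https?://|mailto:|tel:|[/?#]|[^:]*$)~Di', $url) === 1;
    return $ok ? $url : '';
}

$str = 'javascript:evilCode()';
// The checked URL still has to be escaped for the attribute context.
echo '<a href="' . htmlspecialchars(safeUrl($str), ENT_QUOTES, 'UTF-8') . '">...</a>', "\n";
// prints: <a href="">...</a>
```

The sketch uses an allow-list rather than a block-list of dangerous schemes, because browsers tolerate many spellings of javascript: that a block-list can easily miss.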

Question No. 9

Finally, a treat for real connoisseurs. This is an example of an application using a modern JavaScript framework, specifically Vue. Let's see if you can figure out what to be careful about when outputting a variable inside the #app element:

<div id="app">
    <?= $str ?>
</div>

<script src="https://cdn.jsdelivr.net/npm/vue/dist/vue.js"></script>
<script>
const app = new Vue({
    el: '#app',
    ...
})
</script>

This code creates a Vue application that will be rendered into the #app element. Vue interprets the content of this element as its template. And within the template, it interprets double curly braces, which represent variable output or JavaScript code execution (e.g., {{ foo }}).

So, within the #app element, besides the characters < and &, the pair {{ also has a special meaning, and we need to replace it with another appropriate sequence to prevent Vue from interpreting it as its own template syntax. Replacing it with HTML entities doesn't help in this case. How to handle this? There's a clever trick: insert an empty HTML comment between the braces, {<!-- -->{, and Vue will ignore this sequence.
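
The trick can be sketched in PHP as follows. The helper name escapeForVue is hypothetical; the order matters here too, because the comment has to be inserted after the HTML escaping, otherwise its own < and > would be turned into entities:

```php
<?php
declare(strict_types=1);

// Hypothetical helper: escapes a value for HTML text inside a Vue template.
function escapeForVue(string $s): string
{
    // Step 1: ordinary HTML escaping; braces are left untouched by it.
    $html = htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // Step 2: put an empty comment after every { that is followed by another {.
    // The lookahead also handles longer runs such as "{{{".
    return preg_replace('/\{(?=\{)/', '{<!-- -->', $html) ?? '';
}

$str = '{{ foo }}';
echo '<div id="app"> ' . escapeForVue($str) . ' </div>', "\n";
// prints: <div id="app"> {<!-- -->{ foo }} </div>
```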

Quiz Results

How well did you perform on the quiz? How many correct answers did you get? If you answered at least 4 questions correctly, you're among the top 8% of participants – congratulations!

However, ensuring the security of your website requires properly handling output in all situations.

If you were surprised by how many different contexts can appear on a typical HTML page, know that we are far from having covered them all. That would make the quiz much longer. Nevertheless, you don't have to be an expert in escaping in every context if your templating system can handle it competently.

So, let's put them to the test.

How do templating systems perform?

All modern templating systems boast an autoescaping feature that automatically escapes all outputted variables. If they do it correctly, your website is secure. If they do it poorly, the site is exposed to the risk of XSS vulnerability with all its serious consequences.

We will test popular templating systems against the scenarios from this quiz to determine the effectiveness of their auto-escaping. Let the assessment of PHP templating systems begin.

Twig ❌

First up is the Twig templating system (version 3.5), most commonly used in conjunction with the Symfony framework. We'll task it with handling all the quiz scenarios. The variable $str will always be filled with a potentially dangerous string, and we'll see how it manages the output. The results are listed below the template, in the same order. You can also explore its responses and behavior on the playground.

   {% set str = "<'\"&" %}
1) <p>{{ str }}</p>
2) <input value="{{ str }}">
3) <input value='{{ str }}'>

   {% set str = "foo onclick=evilCode()" %}
4) <input value={{ str }}>

   {% set str = "'\"\n\u{2028}" %}
5) <script>	let foo = '{{ str }}'; </script>
6) <p onclick="foo('{{ str }}')"></p>

   {% set str = "-- ---" %}
7) <!-- {{ str }} -->

   {% set str = "javascript:evilCode()" %}
8) <a href="{{ str }}">...</a>

   {% set str = "{{ foo }}" %}
9) <div id="app"> {{ str }} </div>

✅ <p>&lt;&#039;&quot;&amp;</p>
✅ <input value="&lt;&#039;&quot;&amp;">
✅ <input value='&lt;&#039;&quot;&amp;'>


❌ <input value=foo onclick=evilCode()>


❌ <script> let foo = &#039;&quot;u{2028}; </script>
❌ <p onclick="foo(&#039;&quot;u{2028})"></p>


❌ <!-- -- --- -->


❌ <a href="javascript:evilCode()">...</a>


❌ <div id="app"> {{ foo }} </div>

Twig failed in six out of nine tests!

Unfortunately, Twig's automatic escaping works only in HTML text and attributes, and even then only when they are enclosed in quotes. As soon as the quotes are missing, Twig does not report any error and creates an XSS security hole.

This is particularly problematic because attribute values are routinely written without quotes in popular libraries like React or Svelte. A programmer who uses both Twig and React might quite naturally forget about the quotes.

Twig's autoescaping also fails in all other examples. In contexts (5) and (6), manual escaping is needed using {{ str|escape('js') }}, while for other contexts, Twig does not even offer an escaping function. It also lacks protection against outputting a malicious link (8) or support for Vue templates (9).

Blade ❌❌

The second contestant is the Blade templating system (version 10.9), which is tightly integrated with Laravel and its ecosystem. Again, we will test its capabilities on our quiz scenarios. You can also explore its responses on the playground.

   @php($str = "<'\"&")
1) <p>{{ $str }}</p>
2) <input value="{{ $str }}">
3) <input value='{{ $str }}'>

   @php($str = "foo onclick=evilCode()")
4) <input value={{ $str }}>

   @php($str = "'\"\n\u{2028}")
5) <script> let foo = {{ $str }}; </script>
6) <p onclick="foo({{ $str }})"></p>

   @php($str = "-- ---")
7) <!-- {{ $str }} -->

   @php($str = "javascript:evilCode()")
8) <a href="{{ $str }}">...</a>

   @php($str = "{{ foo }}")
9) <div id="app"> {{ $str }} </div>

✅ <p>&lt;&#039;&quot;&amp;</p>
✅ <input value="&lt;&#039;&quot;&amp;">
✅ <input value='&lt;&#039;&quot;&amp;'>


❌ <input value=foo onclick=evilCode()>


❌ <script>	let foo = &#039;&quot; ; </script>
❌ <p onclick="foo(&#039;&quot; )"></p>


❌ <!-- -- --- -->


❌ <a href="javascript:evilCode()">...</a>


❌❌ <div id="app"> &lt;?php echo e(foo); ?&gt; </div>

Blade failed in six out of nine tests!

The result is similar to Twig. Again, automatic escaping works only in HTML text and attributes and only if they are enclosed in quotes. Blade's autoescaping also fails in all other examples. In contexts (5) and (6), manual escaping is needed using {{ Js::from($str) }}. For other contexts, Blade does not even offer an escaping function. It also lacks protection against outputting a malicious link (8) or support for Vue templates (9).

What's particularly surprising is the failure of the @php directive in Blade, which causes its own PHP code to be directly rendered in the output, as seen in the last line.

Smarty ❌❌❌

Now, let's test the oldest templating system for PHP, which is Smarty (version 4.3). Surprisingly, this system does not have automatic escaping enabled by default. Thus, when outputting variables, you either have to specify the filter {$var|escape} every time, or activate automatic HTML escaping. Information about this is somewhat buried in the documentation.

   {$str = "<'\"&"}
1) <p>{$str}</p>
2) <input value="{$str}">
3) <input value='{$str}'>

   {$str = "foo onclick=evilCode()"}
4) <input value={$str}>

   {$str = "'\"\n\u{2028}"}
5) <script>	let foo = {$str}; </script>
6) <p onclick="foo({$str})"></p>

   {$str = "-- ---"}
7) <!-- {$str} -->

   {$str = "javascript:evilCode()"}
8) <a href="{$str}">...</a>

   {$str = "{{ foo }}"}
9) <div id="app"> {$str} </div>

✅ <p>&lt;&#039;&quot;&amp;</p>
✅ <input value="&lt;&#039;&quot;&amp;">
✅ <input value='&lt;&#039;&quot;&amp;'>


❌ <input value=foo onclick=evilCode()>


❌ <script> let foo = &#039;&quot;\u2028; </script>
❌ <p onclick="foo(&#039;&quot;\u2028)"></p>


❌ <!-- -- --- -->


❌ <a href="javascript:evilCode()">...</a>


❌ <div id="app"> {{ foo }} </div>

Smarty failed in six out of nine tests!

At first glance, the result is similar to the previous libraries. Smarty can only automatically escape in HTML text and attributes, and only when the values are enclosed in quotes. It fails everywhere else. In contexts (5) and (6), you need to manually escape using {$str|escape:javascript}. However, this is only possible when automatic HTML escaping is not active, as these escape methods conflict with each other. From a security perspective, Smarty is a complete failure in this test.

Latte ✅

The last contestant is the Latte templating system (version 3.0). We will test its autoescaping capabilities. You can also explore its responses and behavior on the playground.

   {var $str = "<'\"&"}
1) <p>{$str}</p>
2) <input value="{$str}">
3) <input value='{$str}'>

   {var $str = "foo onclick=evilCode()"}
4) <input value={$str}>

   {var $str = "'\"\n\u{2028}"}
5) <script>	let foo = {$str}; </script>
6) <p onclick="foo({$str})"></p>

   {var $str = "-- ---"}
7) <!-- {$str} -->

   {var $str = "javascript:evilCode()"}
8) <a href="{$str}">...</a>

   {var $str = "{{ foo }}"}
9) <div id="app"> {$str} </div>

✅ <p>&lt;'"&amp;</p>
✅ <input value="&lt;&apos;&quot;&amp;">
✅ <input value='&lt;&apos;&quot;&amp;'>


✅ <input value="foo onclick=evilCode()">


✅ <script> let foo = "'\"\n\u2028"; </script>
✅ <p onclick="foo(&quot;&apos;\&quot;\n\u2028&quot;)"></p>


✅ <!--  - -  - - -  -->


✅ <a href="">...</a>


✅ <div id="app"> {<!-- -->{ foo }} </div>

Latte excelled in all nine tasks!

It successfully handled missing quotes in HTML attributes, processed JavaScript both in the <script> element and in attributes, and properly managed the forbidden sequence in HTML comments.

What's more, it prevented a situation where clicking on a malicious link provided by an attacker could execute their code. And it skillfully handled the escaping of tags for Vue.

Bonus test

One of the essential capabilities of all templating systems is working with blocks and the related template inheritance. Therefore, let's give all tested templating systems one more challenge. We will create a description block, which we will output in an HTML attribute. In the real world, the block definition would, of course, be located in the child template and its output in the parent template, such as the layout. This is just a simplified version, but it's sufficient to test the autoescaping when outputting blocks. How did they perform?

Twig: failed ❌ when outputting blocks, characters are not properly escaped

{% block description %}
	rock n' roll
{% endblock %}

<meta name='description'
	content='{{ block('description') }}'>




<meta name='description'
	content=' rock n' roll '> ❌

Blade: failed ❌ when outputting blocks, characters are not properly escaped

@section('description')
	rock n' roll
@endsection

<meta name='description'
	content='@yield('description')'>




<meta name='description'
	content=' rock n' roll '> ❌

Latte: passed ✅ when outputting blocks, it correctly handled problematic characters

{block description}
	rock n' roll
{/block}

<meta name='description'
	content='{include description}'>




<meta name='description'
	content=' rock n&apos; roll '> ✅

Why are so many websites vulnerable?

Autoescaping in systems like Twig, Blade, or Smarty works by simply replacing five characters <>"'& with HTML entities and does not distinguish between contexts. Therefore, it only works in some situations and fails in all others. Naive autoescaping is a dangerous feature because it creates a false sense of security.
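
The limits of this approach are easy to reproduce in plain PHP. In the sketch below, naiveEscape is a hypothetical stand-in for such context-blind autoescaping; the payloads are the ones used in the tests above, and none of them contains any of the five characters:

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for context-blind autoescaping:
// the same five-character replacement regardless of where the value ends up.
function naiveEscape(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES, 'UTF-8');
}

$payloads = [
    'unquoted attribute' => 'foo onclick=evilCode()',
    'href attribute'     => 'javascript:evilCode()',
    'HTML comment'       => '-- ---',
    'Vue template'       => '{{ foo }}',
];

foreach ($payloads as $context => $payload) {
    // Each payload passes through untouched, so it stays dangerous in its context.
    $status = naiveEscape($payload) === $payload ? 'unchanged' : 'changed';
    echo $context, ': ', $status, "\n";
}
// prints "unchanged" for all four contexts
```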

It is not surprising, then, that currently more than 27% of websites have critical vulnerabilities, mainly XSS (source: Acunetix Web Vulnerability Report). How can we address this problem? By using a templating system that distinguishes between different contexts.

Latte is the only PHP templating system that does not perceive a template as just a string of characters but truly understands HTML. It recognizes tags, attributes, and other elements. It distinguishes between different contexts. And therefore, it correctly escapes in HTML text, differently inside HTML tags, differently inside JavaScript, and so on.

Latte therefore stands out as the only templating system in our test that provides comprehensive security.


Moreover, thanks to its understanding of HTML, it offers wonderful n:attributes, which users love:

<ul n:if="$menu">
	<li n:foreach="$menu->getItems() as $item">{$item->title}</li>
</ul>