PHP 8.0: New Functions, Classes and JIT (4/4)

3 years ago by David Grudl translated by Pavel Linhart  

PHP version 8.0 has been released. It's full of new features like no other version before. Their introduction deserved four separate articles. In the last part we'll take a look at new functions and classes and introduce the Just in Time Compiler.

New Functions

The standard PHP library has hundreds of functions, and six new ones have appeared in version 8.0. That doesn't seem like much, but most of them remedy weak points of the language, which lines up nicely with the whole concept of version 8.0: it tightens and consolidates PHP like no version before. An overview of all the new functions and methods can be found in the migration guide.

str_contains() str_starts_with() str_ends_with()

Functions to determine whether a string begins, ends, or contains a substring.

if (str_contains('Nette', 'te')) {
	...
}

With the advent of this trinity, PHP defines how an empty string is handled while searching, and all the related functions now follow the same rule, namely that an empty string is found everywhere:

str_contains('Nette', '')     // true
str_starts_with('Nette', '')  // true
strpos('Nette', '')           // 0 (previously false)

Thanks to this, the behavior of the trinity is completely identical to the Nette analogues:

str_contains()      # Nette\Utils\Strings::contains()
str_starts_with()   # Nette\Utils\Strings::startsWith()
str_ends_with()     # Nette\Utils\Strings::endsWith()

Why are these functions so important? The standard library of every language is burdened by its historical development; inconsistencies and missteps can't be avoided. But at the same time, it's a testimonial of the respective language. Surprisingly, the 25-year-old PHP lacks functions for such basic operations as returning the first or last element of an array, escaping HTML without nasty surprises (htmlspecialchars does not escape an apostrophe), or just searching for a string in a string. Saying that this can be worked around somehow doesn't hold up, because the result is code that is neither legible nor understandable. This is a lesson for all API authors. When you see that much of a function's documentation is taken up by explanations of pitfalls (such as the return values of strpos), it's a clear sign that the library should be modified and str_contains added.
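To make the strpos pitfall concrete, here is a minimal sketch (the sample string is only illustrative) comparing the classic idioms with the new functions. A match at position 0 is falsy, so the naive check silently fails:

```php
<?php
$haystack = 'Nette Framework';

// Naive check: strpos() returns 0 for a match at the very start, and 0 is falsy.
$naive = (bool) strpos($haystack, 'Nette');        // false, although the substring is there

// Correct pre-8.0 idiom: compare strictly against false.
$strict = strpos($haystack, 'Nette') !== false;    // true

// PHP 8.0: the intent is readable and there is no trap.
$modern = str_contains($haystack, 'Nette');        // true
$starts = str_starts_with($haystack, 'Nette');     // true
$ends   = str_ends_with($haystack, 'Framework');   // true
```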

get_debug_type()

Replaces the now obsolete gettype(). Instead of long type names like integer, it returns the int used today, and for objects it directly returns the class name:

Value                         gettype()           get_debug_type()
'abc'                         string              string
[1, 2]                        array               array
231                           integer             int
3.14                          double              float
true                          boolean             bool
null                          NULL                null
new stdClass                  object              stdClass
new Foo\Bar                   object              Foo\Bar
function() {}                 object              Closure
new class {}                  object              class@anonymous
new class extends Foo {}      object              Foo@anonymous
fopen('php://memory', 'r')    resource            resource (stream)
$fp after fclose($fp)         resource (closed)   resource (closed)
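A typical use is in error messages. The following is only a sketch: expectInt() is a hypothetical helper invented for this illustration, not part of PHP or Nette.

```php
<?php
// Hypothetical helper: checks that a value is an int and reports the actual type otherwise.
function expectInt(mixed $value): int
{
    if (!is_int($value)) {
        throw new InvalidArgumentException('Expected int, got ' . get_debug_type($value));
    }
    return $value;
}

try {
    expectInt(3.14);
} catch (InvalidArgumentException $e) {
    $message = $e->getMessage();   // 'Expected int, got float'
}

// For objects, the class name is returned directly.
$objectType = get_debug_type(new ArrayObject);   // 'ArrayObject'
```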

Resource to Object Migration

Values of the resource type come from a time when PHP didn't yet have objects, but actually needed them. That's how resources were born. Today we have objects and, compared to resources, they work much better with the garbage collector, so the plan is to gradually replace all resources with objects.

As of PHP 8.0, image resources, curl connections, openssl, xml, etc. are changed to objects. In PHP 8.1, FTP connections, etc. will follow.

$res = imagecreatefromjpeg('image.jpg');
$res instanceof GdImage  // true
is_resource($res)        // false - BC break

These objects don’t yet have any methods, nor can you instantiate them directly. So far, it's really just a matter of getting rid of obsolete resources from PHP without changing the API. And that's good, because creating a good API is a separate and challenging task. No one wishes for the creation of new PHP classes such as SplFileObject with methods named fgetc() or fgets().
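Code that has to run on both sides of this BC break can accept either form. The sketch below assumes a helper named isGdImage(), which is made up for this illustration; it deliberately calls no GD function, so it runs even when the extension is not loaded.

```php
<?php
// Hypothetical helper: recognizes a GD image on PHP 8.0+ (GdImage object)
// as well as on older versions (resource of type "gd").
function isGdImage(mixed $value): bool
{
    // instanceof is safe even if the GdImage class does not exist.
    if ($value instanceof \GdImage) {
        return true;
    }
    return is_resource($value) && get_resource_type($value) === 'gd';
}

$notAnImage = isGdImage('image.jpg');    // false: a file name is not an image handle
$notAnObject = isGdImage(new stdClass);  // false: some other object
```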

PhpToken

The tokenizer and the functions around token_get_all are also migrated to objects. This time it's not about getting rid of resources; instead, we get a full-fledged object representing a single PHP token.

<?php
$tokens = PhpToken::tokenize('<?php $a = 10;');
$token = $tokens[0];         // instance of PhpToken

echo $token->id;             // T_OPEN_TAG
echo $token->text;           // '<?php ' (including the trailing space)
echo $token->line;           // 1
echo $token->getTokenName(); // 'T_OPEN_TAG'
echo $token->is(T_STRING);   // false
echo $token->isIgnorable();  // true

The isIgnorable() method returns true for the tokens T_WHITESPACE, T_COMMENT, T_DOC_COMMENT, and T_OPEN_TAG.
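As a small sketch of how isIgnorable() can be used, the following collects only the tokens that matter to the parser. The sample source string is just an illustration, and the tokenizer extension is assumed to be available:

```php
<?php
$code = '<?php $a = 10; // ten';

$significant = [];
foreach (PhpToken::tokenize($code) as $token) {
    // Skip the open tag, whitespace and comments.
    if (!$token->isIgnorable()) {
        $significant[] = $token->text;
    }
}
// $significant now holds: '$a', '=', '10', ';'
```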

Weak Maps

Weak maps are related to the garbage collector, which frees from memory all objects and values that are no longer used (i.e. no variable or property holds them). Because a typical PHP request is short-lived and our servers have plenty of memory, we usually don't concern ourselves with freeing memory efficiently at all. But for longer-running scripts, it is essential.

The WeakMap object is similar to SplObjectStorage. Both use objects as keys and allow arbitrary values to be stored under them. The difference is that WeakMap doesn't prevent the object from being released by the garbage collector. That is, if the only place where the object still exists is a key in the weak map, it will be removed from both the map and memory.

$map = new WeakMap;
$obj = new stdClass;
$map[$obj]  = 'data for $obj';

dump(count($map));  // 1
unset($obj);
dump(count($map));  // 0

What is it good for? For example, for caching. Let's have a loadComments() method that takes a blog article and returns all its comments. Since the method is called repeatedly for the same article, we will create another method, getComments(), which caches the result of the first one:

class Comments
{
	private WeakMap $cache;

	public function __construct()
	{
		$this->cache = new WeakMap;
	}

	public function getComments(Article $article): ?array
	{
		$this->cache[$article] ??= $this->loadComments($article);
		return $this->cache[$article];
	}

	...
}

The point is that when the $article object is released (for example, the application starts working with another article), its entry is also released from the cache.
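The behavior can be tried out with a self-contained variant of the class above. This is only a sketch under stated assumptions: the Article class, the $loads counter, the cachedCount() method and the body of loadComments() are invented for the demonstration.

```php
<?php
// Hypothetical article entity, just enough to serve as a WeakMap key.
final class Article
{
    public function __construct(public int $id) {}
}

final class Comments
{
    private WeakMap $cache;
    public int $loads = 0;   // counts how often the "expensive" load really ran

    public function __construct()
    {
        $this->cache = new WeakMap;
    }

    public function getComments(Article $article): array
    {
        $this->cache[$article] ??= $this->loadComments($article);
        return $this->cache[$article];
    }

    public function cachedCount(): int
    {
        return count($this->cache);
    }

    // Stand-in for a database query.
    private function loadComments(Article $article): array
    {
        $this->loads++;
        return ["comment for article {$article->id}"];
    }
}

$comments = new Comments;
$article = new Article(1);
$comments->getComments($article);
$comments->getComments($article);       // served from the cache
$loads = $comments->loads;              // 1
$before = $comments->cachedCount();     // 1
unset($article);                        // last reference is gone
$after = $comments->cachedCount();      // 0
```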

PHP JIT (Just in Time Compiler)

You may know that PHP is compiled into so-called opcodes, low-level instructions (you can see them here, for example) that are executed by the PHP virtual machine. And what is a JIT? A JIT can transparently compile PHP directly into machine code, which is executed directly by the processor, so the slower execution by the virtual machine is bypassed.

JIT is therefore intended to speed up PHP.

The effort to implement a JIT in PHP dates back to 2011 and is driven by Dmitry Stogov. Since then, he has tried three different implementations, but none of them made it into a final PHP release, for three reasons: the result was never a significant performance increase for typical web applications; it complicates PHP maintenance (i.e. no one but Dmitry understands it 😉); and there were other ways to improve performance without having to use a JIT.

The leap in performance seen in PHP 7 was a by-product of the work on the JIT, although, paradoxically, the JIT itself was not deployed then. That is only happening now, in PHP 8. But let me temper any exaggerated expectations: you probably won't see any speed-up.

So why is JIT entering PHP? First, other ways to improve performance are slowly running out, and JIT is simply the next step. In common web applications, it doesn’t bring any speed improvements, but it significantly speeds up, for example, mathematical calculations. This opens up the possibility of starting to write these things in PHP. In fact, it would be possible to implement some functions directly in PHP that previously required a direct C implementation due to speed.
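The kind of code meant here is a tight, CPU-bound loop over typed values. The following sketch can be run with and without JIT to compare timings on your own machine; countPrimes() and the chosen limit are assumptions made for this illustration, and no particular speed-up is implied.

```php
<?php
// Hypothetical CPU-bound workload: counts primes up to $limit by trial division.
function countPrimes(int $limit): int
{
    $count = 0;
    for ($n = 2; $n <= $limit; $n++) {
        $isPrime = true;
        for ($d = 2; $d * $d <= $n; $d++) {
            if ($n % $d === 0) {
                $isPrime = false;
                break;
            }
        }
        if ($isPrime) {
            $count++;
        }
    }
    return $count;
}

$start = hrtime(true);                          // nanoseconds
$primes = countPrimes(200_000);
$elapsedMs = (hrtime(true) - $start) / 1e6;
printf("%d primes in %.1f ms\n", $primes, $elapsedMs);
```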

JIT is part of the opcache extension and is enabled together with it in php.ini (read the documentation about those four digits):

zend_extension=php_opcache.dll
opcache.jit=1205              ; configuration using four digits CRTO
opcache.enable_cli=1          ; in order to work in the CLI as well
opcache.jit_buffer_size=128M  ; dedicated memory for compiled code
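The documentation describes the four digits in the order CRTO. As a rough sketch, a value such as 1205 can be taken apart like this; decodeJitFlags() is a hypothetical helper written for this illustration, and the meaning of each individual digit value is best looked up in the documentation:

```php
<?php
// Hypothetical helper: splits a four-digit opcache.jit value into its CRTO digits.
function decodeJitFlags(int $value): array
{
    $digits = str_pad((string) $value, 4, '0', STR_PAD_LEFT);
    return [
        'C' => (int) $digits[0],   // CPU-specific optimization flags
        'R' => (int) $digits[1],   // register allocation
        'T' => (int) $digits[2],   // trigger (when code gets compiled)
        'O' => (int) $digits[3],   // optimization level
    ];
}

$flags = decodeJitFlags(1205);   // ['C' => 1, 'R' => 2, 'T' => 0, 'O' => 5]
```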

You can verify that JIT is running, for example, in the Tracy Bar information panel.
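Without Tracy, the same information can be read from opcache_get_status(). A minimal sketch, assuming a helper named jitEnabled() (invented for this illustration) and the jit section that the opcache status array contains as of PHP 8.0:

```php
<?php
// Hypothetical helper: reports whether opcache says the JIT is enabled.
function jitEnabled(): bool
{
    if (!function_exists('opcache_get_status')) {
        return false;   // opcache extension is not loaded
    }
    // May return false when opcache is disabled, hence the checks below.
    $status = @opcache_get_status(false);
    return is_array($status) && !empty($status['jit']['enabled']);
}

echo jitEnabled() ? "JIT is enabled\n" : "JIT is not enabled\n";
```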

JIT works very well if all variables have clearly defined types that don't change even when the same code is called repeatedly. I'm therefore wondering if we'll be declaring types in PHP for variables one day as well: string $s = 'Bye, this is the end of the series';