Nette Utils: Performance and Efficiency with Generators

6 months ago by David Grudl  

Nette Utils uses PHP generators to reduce memory consumption and improve performance. They let you work with large data sets more elegantly and efficiently without having to change the way you write your code. Let's look at three areas where Nette utilizes the power of generators in ways that might surprise even experienced developers.

Reading Files Line by Line with FileSystem::readLines()

Working with large files has always been a challenge, especially in terms of memory consumption. Nette provides an elegant solution with the FileSystem::readLines() method.

use Nette\Utils\FileSystem;

$lines = FileSystem::readLines('large_file.txt');
foreach ($lines as $number => $line) {
    echo "Line $number: $line\n";
    // Process the line...
}

At first glance, this code might seem to do the same as the native PHP function file(). However, appearances can be deceiving. While file() loads the entire file into memory at once, which can be problematic for large files, readLines() uses a generator to read the file incrementally.

Key benefits:

  • Low memory consumption: The file is read gradually, with only one line in memory at a time.
  • Immediate processing: You don't have to wait for the entire file to be loaded.
  • Handling arbitrarily large files: The file size is not limited by available memory.
  • Natural interface: From the programmer's perspective, the code is just as easy to write as a loop over the result of file().

This solution is ideal for processing logs, large data exports, or any extensive text files.
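
To make the idea of incremental reading concrete, here is a minimal sketch of a generator-based line reader in plain PHP. It is not Nette's implementation. The function name readLinesSketch() is hypothetical and used only for illustration. It yields one line per iteration, so memory use stays roughly constant regardless of file size.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Nette): yields a file's lines one at a time.
 * Keys are zero-based line numbers; trailing line endings are stripped.
 */
function readLinesSketch(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open file '$path'.");
    }

    try {
        $number = 0;
        // fgets() returns false at end of file, so only one line is held at a time
        while (($line = fgets($handle)) !== false) {
            yield $number++ => rtrim($line, "\r\n");
        }
    } finally {
        // runs when the file is exhausted or the generator is destroyed after an early break
        fclose($handle);
    }
}

// Usage: write a small temporary file and stream it back
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "alpha\nbeta\ngamma\n");

$collected = [];
foreach (readLinesSketch($path) as $number => $line) {
    $collected[$number] = $line;
}
unlink($path);
```

Nothing is read until the foreach loop starts, and breaking out of the loop early leaves the rest of the file untouched.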

Lazy Regular Expressions with Strings::matchAll()

The Strings::matchAll() method received an interesting upgrade in Nette Utils 4.0.5. The new $lazy parameter allows for incremental processing of the string when searching for matches with a regular expression.

use Nette\Utils\Strings;

$text = file_get_contents('very_long_text.txt');
$pattern = '/\b\w{5,}\b/'; // words with 5 or more characters

$matches = Strings::matchAll($text, $pattern, lazy: true);
foreach ($matches as $match) {
    echo "Found word: {$match[0]}\n";
    // We can interrupt the processing at any time
    if ($match[0] === 'end') {
        break;
    }
}

Without the lazy parameter, the entire text would be searched at once, and all matches would be stored in memory. With lazy: true, the text is processed incrementally, and matches are found on-the-fly.

Advantages:

  • Performance distribution: Processing occurs gradually, not in one large block.
  • Possibility of early termination: You can stop searching as soon as you find what you need.
  • Natural interface: The code is still written the same way.

This is particularly useful when analyzing large texts, parsing logs, or searching for specific patterns in extensive data.
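
The difference between eager and lazy matching can be sketched in plain PHP with preg_match() and an advancing offset. This is an illustration of the technique, not the code Nette uses. The function name lazyMatchAll() is hypothetical, and the sketch works with byte offsets and ignores named groups for brevity.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not Nette's implementation): yields regex matches one by one.
 * Each yielded value has the shape [full match, group 1, ...].
 */
function lazyMatchAll(string $subject, string $pattern): Generator
{
    $offset = 0;
    $length = strlen($subject);
    while ($offset <= $length
        && preg_match($pattern, $subject, $match, PREG_OFFSET_CAPTURE, $offset) === 1
    ) {
        [$text, $position] = $match[0];
        // drop the offsets so the caller sees only the matched strings
        yield array_map(fn(array $group) => $group[0], $match);
        // continue after this match; advance one byte on an empty match to avoid looping forever
        $offset = $position + max(strlen($text), 1);
    }
}

// Usage: stop as soon as the word we are looking for turns up
$text = 'short words and several considerably longer words appear here';
$seen = [];
foreach (lazyMatchAll($text, '/\b\w{5,}\b/') as $match) {
    $seen[] = $match[0];
    if ($match[0] === 'several') {
        break; // the rest of the text is never scanned
    }
}
```

Because each match is searched for only when the loop asks for it, the break skips all the regex work on the remaining text.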

Efficient Work Using the Iterables Class

The new Iterables class provides a set of methods for working with iterable structures, similar to what Arrays offers for arrays. However, it includes methods optimized for incremental processing. Let's see how you can use them in your code.

use Nette\Utils\Iterables;

$numbers = range(1, 1000); // placeholder; in practice any large iterable

// filtering elements
$evenNumbers = Iterables::filter($numbers, fn($n) => $n % 2 === 0);

// transforming elements
$squared = Iterables::map($numbers, fn($n) => $n * $n);

// transforming both keys and values
$squaredKeys = Iterables::mapWithKeys($numbers, fn($v, $k) => [$k * $k, $v]);

The key advantage is lazy evaluation, where transformations are performed only when an element is actually needed. You can terminate the iteration at any time and save computational time:

foreach ($evenNumbers as $number) {
    if ($number > 20) break; // We can terminate at any time
}
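
To see lazy evaluation at work, the following sketch chains two plain PHP generators and counts how many elements the filter callback actually inspects. The helpers lazyFilter() and lazyMap() are hypothetical stand-ins written for this illustration; they are not the Iterables methods themselves, only an assumption about the general technique.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a lazy filter, written as a plain generator. */
function lazyFilter(iterable $items, callable $predicate): Generator
{
    foreach ($items as $key => $value) {
        if ($predicate($value)) {
            yield $key => $value;
        }
    }
}

/** Hypothetical stand-in for a lazy map, written as a plain generator. */
function lazyMap(iterable $items, callable $transform): Generator
{
    foreach ($items as $key => $value) {
        yield $key => $transform($value);
    }
}

$calls = 0;
$numbers = range(1, 1000);

$even = lazyFilter($numbers, function (int $n) use (&$calls): bool {
    $calls++; // count how many elements were actually inspected
    return $n % 2 === 0;
});
$squared = lazyMap($even, fn(int $n): int => $n * $n);

// No callback has run yet; work starts only when we iterate
$result = [];
foreach ($squared as $value) {
    if ($value > 20) {
        break; // stop early; the remaining elements are never touched
    }
    $result[] = $value;
}
```

Here the filter callback runs only for the first six numbers out of a thousand, because the loop stops at the first square above 20.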

These methods are particularly useful when working with large collections of data, such as extensive database results or processing data streams. Furthermore, they are completely transparent, allowing you to start using them immediately without having to fundamentally change the structure of your code. Nette strives to bring advanced features in a way that is accessible and easy to use for developers of all levels.