Get your loops right first

A good rule of thumb to keep in mind is "90% of your script's execution time is taken up in 10% of the code". Granted, that is a vast over-generalisation and probably not very accurate, but the principle is correct: you can spend hours and hours optimising your program at random and find it runs no faster, because you have been optimising bits that are not run very often. From a purely programmatic point of view it is never a bad thing to optimise even rarely used code, but few people and even fewer companies have the time to optimise everything, so you should generally focus your efforts on what will give the greatest return.

One of the best places to look for optimisation opportunities is in loops, as they clearly embody code that needs to be executed multiple times. Two problems slow loops down more often than any others: calling a function as part of the loop condition, and loop-invariant code.

Consider this first example:

for ($i = 0; $i < count($myarr); ++$i) {
    //
}

Here we have some code that does an action once for every element in an array - quite a common action, as you well know. However, because PHP re-evaluates the loop condition before every iteration, count() will be called each time the loop goes around. That is, if we set $myarr to an array holding 100 items, PHP will call count() once for each of the 100 iterations, plus one final time for the check that ends the loop. Sometimes this is the desired behaviour. For example, if you are changing the number of elements in the array inside the loop, you will of course need to recalculate its size with each iteration. However, more often than not the size is fixed, so why bother making PHP fetch its size every time?
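To illustrate the case where re-counting really is needed, here is a minimal sketch of a loop that adds to the array it is walking over. The $queue and $processed names are made up for this example. If the size were cached before the loop, only the first element would ever be processed.

```php
<?php
// Hypothetical work queue: each item may add a smaller item to the end.
$queue = [3];
$processed = [];

// count() has to stay in the condition here, because the body appends
// to $queue and so the array grows while the loop is running.
for ($i = 0; $i < count($queue); ++$i) {
    $processed[] = $queue[$i];

    if ($queue[$i] > 1) {
        $queue[] = $queue[$i] - 1;
    }
}

echo implode(',', $processed), "\n"; // prints "3,2,1"
```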

Internally PHP actually caches the number of elements in the array, so technically calling count() does not make PHP count each individual element. However, it does still need to jump through extra hoops thanks to the function call, which is why taking the call out of the loop condition makes a measurable difference.
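If you would like to see the repeated calls for yourself, one rough way is to wrap count() in a function that keeps a tally. The countedCount() helper below is purely hypothetical and exists only for this demonstration; it is not part of PHP.

```php
<?php
// Hypothetical helper: behaves like count(), but also records each call.
function countedCount(array $arr, int &$calls): int
{
    ++$calls;
    return count($arr);
}

$myarr = range(1, 100); // an array holding 100 items
$calls = 0;

for ($i = 0; $i < countedCount($myarr, $calls); ++$i) {
    //
}

// 100 checks that let the loop continue, plus the one that ends it
echo "The condition called the function $calls times\n"; // prints 101
```

The extra call comes from the final check, the one that finds $i is no longer smaller than the array size and stops the loop.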

A much better solution is to calculate the array size just once, outside of the loop, like this:

$len = count($myarr);

for ($i = 0; $i < $len; ++$i) {
    //
}

Naturally the benefits of changing to this faster method depend largely on how large the array is. Whenever the size is fixed it is still worth calculating it outside of the loop: in the test below the loop ran about twice as fast, although the exact ratio will vary with your PHP version and hardware, and twice as fast as "very fast" is not that noticeable! Try this script out to give yourself an idea:

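<?php
    // Build a large array so the difference is big enough to measure
    $myarr = [];
    for ($i = 0; $i < 3000000; ++$i) {
        $myarr[] = rand();
    }

    echo "Done setting up array.\n";

    // microtime(true) returns fractional seconds; time() only counts whole seconds
    $START = microtime(true);
    for ($i = 0; $i < count($myarr); ++$i) {
        //
    }
    $END = microtime(true) - $START;
    echo "Using count() took $END seconds\n";

    $START = microtime(true);
    $len = count($myarr);
    for ($i = 0; $i < $len; ++$i) {
        //
    }
    $END = microtime(true) - $START;
    echo "Not using count() took $END seconds\n";
?>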

On my test computer, using count() in the loop took eight seconds and not using count() in the loop took four seconds - quite a substantial difference, but it was on an array of 3,000,000 items!
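If you would rather not have a separate line above the loop, the size can also be stored in the loop's initialiser, which PHP runs only once. This is just a sketch of that variation using a small made-up array; $seen is only there so the result can be printed.

```php
<?php
$myarr = ['a', 'b', 'c'];
$seen = [];

// $len is assigned once, in the initialiser. Only "$i < $len" is
// re-evaluated on each pass, so count() is called a single time.
for ($i = 0, $len = count($myarr); $i < $len; ++$i) {
    $seen[] = $myarr[$i];
}

echo implode(',', $seen), "\n"; // prints "a,b,c"
```

This keeps $len next to the loop that uses it, at the cost of a slightly busier first line.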

The second "must avoid" loop problem is loop-invariant code, which means calculating things inside a loop that could be calculated, all or in part, outside the loop. For example, consider the following code:

<?php
    $n = 100;

    for ($i = 0; $i < 100; ++$i) {
        for ($j = 0; $j < 100; ++$j) {
            $somearray[$i][$j] = ($n * 100) + ($i * $n) + ($j * $n);
        }
    }
?>

Inside our inner loop ($j), you will see values are put into a two-dimensional array using three calculations added together: $n * 100, $i * $n, and $j * $n. Of these three, $n * 100 will always be the same value because $n does not change and 100 is always 100. However, that calculation will be done 10,000 times, each time coming up with the same answer, which makes it loop-invariant. Similarly, $i * $n does change, but not inside the inner loop, so this is loop-invariant for the inner loop only. The final calculation, $j * $n, changes on every pass of the inner loop, and so is not loop-invariant.

The loops should be re-written like this:

<?php
    $n = 100;
    $n_mult_100 = $n * 100;

    for ($i = 0; $i < 100; ++$i) {
        $i_mult_n = $i * $n;

        for ($j = 0; $j < 100; ++$j) {
            $somearray[$i][$j] = $n_mult_100 + $i_mult_n + ($j * $n);
        }
    }
?>

This new code has moved each loop-invariant calculation out of the loop it does not depend on, thus making the script run faster. Loop-invariant code is usually easy to spot and, unlike in the example above, can yield hefty performance increases - particularly if a complex function is being called.
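As a rough illustration of the function-call case, here is a small sketch that counts case-insensitive matches in a list of words. The data and variable names are invented for the example. Lower-casing the search term gives the same answer on every pass, so it is done once before the loop, whereas lower-casing each word has to stay inside.

```php
<?php
$words  = ['Apple', 'banana', 'APPLE', 'cherry'];
$needle = 'aPPle';

// strtolower($needle) does not depend on the loop, so calculate it once
$needleLower = strtolower($needle);

$matches = 0;
foreach ($words as $word) {
    // strtolower($word) is different each time, so it is not loop-invariant
    if (strtolower($word) === $needleLower) {
        ++$matches;
    }
}

echo "Found $matches matches\n"; // prints "Found 2 matches"
```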

 


Copyright ©2015 Paul Hudson. Follow me: @twostraws.