Strict Equality for Better Code

A major focus of my day job right now is cleaning up the PHP in a decades-old monolith. This includes tests written for two different test runners (PHPUnit and an in-house library) by hundreds of engineers over the years.

I could write a book on the horrors I’ve seen (and currently have at least half a dozen blog posts in draft state), but I’m not interested in raking anyone over the coals for past engineering decisions—honestly, it’s to be expected with any project this size and age. Instead, I wanted to take a moment to talk about one of the most prevalent oversights made by engineers of all levels: strict equality.

Not all equals are made equal

In PHP, JavaScript, and many other loosely-typed languages, there are two ways to determine if something is equal:

Loose equality (e.g. ==, or “double equals”)
Strict equality (e.g. ===, or “triple equals”)

How do these compare? The main difference is that loose equality checks don’t account for types when making a comparison: PHP first converts the operands to a common type, so when a numeric string is compared against an integer the two are compared as numbers, and 123 == '123'. In fact, all of the following evaluate as true when using loose equality:

# Truthy values
true == 'some string';
true == 1;
true == 1.0;
true == ['some', 'array'];
true == new stdClass();

# Falsey values
false == '';
false == '0';
false == 0;
false == 0.0;
false == null;
false == [];

If you’re familiar with PHP’s empty() function, those “falsey” values should look familiar, as all of them are considered to be empty.

Now, if we replaced the double equal sign with a triple equals, all of those evaluations would be false:

var_dump(true === 'some string');
#=> bool(false)

var_dump(true === 1);
#=> bool(false)

var_dump(true === ['some', 'array']);
#=> bool(false)

// ...and so on.

This is because strict equality accounts for the type of a variable: an object is not the same as an array, which is not the same as an integer nor a boolean.

When we think about equality, this makes sense: should an object—even one without any properties assigned to it—be equivalent to the boolean true?
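
The same type juggling applies well beyond booleans. As a rough sketch (the variable names and sample values are made up for illustration), here is how numeric strings behave under each operator, along with in_array(), which compares loosely unless its third argument is true:

```php
<?php

// Both operands are numeric strings, so == compares them as numbers.
$looseNumericStrings  = ('1e3' == '1000');   // true
$strictNumericStrings = ('1e3' === '1000');  // false: the strings differ

// Leading zeros disappear under loose comparison.
$looseLeadingZero  = ('0123' == '123');   // true
$strictLeadingZero = ('0123' === '123');  // false

// in_array() uses loose comparison unless $strict (third argument) is true.
$allowedCodes = ['1000', '2000'];
$looseLookup  = in_array('1e3', $allowedCodes);        // true
$strictLookup = in_array('1e3', $allowedCodes, true);  // false

var_dump(
    $looseNumericStrings,
    $strictNumericStrings,
    $looseLeadingZero,
    $strictLeadingZero,
    $looseLookup,
    $strictLookup
);
```

If those strings were product codes or tokens rather than numbers, the loose results would almost certainly not be what you intended.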

Loose equality laughs in the face of the Transitive Property

If you think back to grade school math classes, you might recall the Transitive Property of Equality, which states:

If two values are equal, and either of those two values is equal to a third value, then all three values must be equal.

Written algebraically, if a = b and b = c, then a = c.

Sadly, loose equality must have been out sick the day this was taught because the Transitive Property does not apply to loose equality:

123 == true;
#=> bool(true)

true == 'hello, there!';
#=> bool(true)

123 == 'hello, there!';
#=> bool(false)

In a world where fundamental mathematical principles don’t apply, you can see how bugs might sneak in!

Loose equality and conditionals

Conditional statements (e.g. if/else) are a foundational part of [nearly] every programming language, and for good reason: if some expression evaluates as true, do this; otherwise, do that:

if ($expr) {
    // $expr is truthy
} else {
    // $expr is falsey
}

It’s worth noting that PHP converts the condition to a boolean, which gives the same result as a loose comparison against true, so anything loosely equal to true will evaluate as such:

if (1) {
    // This will always evaluate as true
}

if ('some string') {
    // This will always evaluate as true
}

if (['some', 'array']) {
    // This will always evaluate as true
}

However, we can also make our conditionals use strict equality by including strict comparisons in our expressions:

if (true === 1) {
    // This will never be reached
}

if (true === 'some string') {
    // This will never be reached
}

if (true === ['some', 'array']) {
    // This will never be reached
}

This works because true === 'some string' is an expression itself, which evaluates as false.

Why is this important? Let’s consider PHP’s strpos() function, which returns the index of a substring within a string, or false if the substring is not found:

$string = 'Hello, there!';

strpos($string, 'there');
#=> int(7)

strpos($string, 'Hello');
#=> int(0)

strpos($string, 'world');
#=> bool(false)

Observant readers might already see the problem here: the string uses a zero-based index, so the first character has an index of 0. If we’re not careful, we could easily fail to detect the string:

# Try this for yourself: https://3v4l.org/S0eW2
$string = 'Hello, there!';

if (!strpos($string, 'Hello')) {
    echo 'Did not detect "Hello" in string (default)' . PHP_EOL;
}

if (false == strpos($string, 'Hello')) {
    echo 'Did not detect "Hello" in string (loose equality)' . PHP_EOL;
}

if (false === strpos($string, 'Hello')) {
    echo 'Did not detect "Hello" in string (strict equality)' . PHP_EOL;
}

When we run this code, we’ll get the following output:

Did not detect "Hello" in string (default)
Did not detect "Hello" in string (loose equality)

Wait, what happened? Since when does “Hello, there!” not contain the substring “Hello”?!

Remember: the integer 0 (the index of “Hello” in the string) is loosely equal to the boolean false, so the first two conditions evaluate as true and their “Did not detect” messages are printed by mistake. However, when our conditional specifically asks “is the return value strictly equal to the boolean false?”, the condition is false (and thus we don’t print a “Did not detect” message).
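
Turned around into a positive check, the safe pattern looks something like the sketch below. The variable names are only for illustration, and str_contains() assumes PHP 8.0 or newer:

```php
<?php

$string = 'Hello, there!';

// strpos() returns int(0) for a match at the very start of the string,
// so compare against false with !== instead of relying on truthiness.
$foundWithStrpos = (strpos($string, 'Hello') !== false);

// str_contains() (PHP 8.0+) always returns a boolean, which avoids
// the ambiguity between int(0) and false altogether.
$foundWithStrContains = str_contains($string, 'Hello');

// A substring that really is missing still reports false.
$foundMissing = (strpos($string, 'world') !== false);

var_dump($foundWithStrpos, $foundWithStrContains, $foundMissing);
#=> bool(true)
#=> bool(true)
#=> bool(false)
```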

Loose equality makes for poor tests

One of the places where loose equality can be most dangerous is in tests, because you’d better be sure that the result you’re getting is exactly what you expected.

Imagine you’re writing tests for the following User model:

<?php

class User
{
    private ?string $email = null;

    /**
     * Set the user's email address.
     *
     * @param string $email The user's email address.
     *
     * @return void
     */
    public function setEmail(string $email): void
    {
        $this->email = $email;
    }

    /**
     * Return the user's email address (if one has been set),
     * otherwise null.
     *
     * @return ?string The user's email address or null.
     */
    public function getEmail(): ?string
    {
        return $this->email ?? null;
    }
}

Passing around email addresses as strings? Did the author of that code not read my post about value objects?!

Anyway, it’s not uncommon to find tests for a class like this written in this way:

use PHPUnit\Framework\TestCase;

final class UserTest extends TestCase
{
    public function testEmailMethods(): void
    {
        $user = new User();

        $this->assertEquals(
            null,
            $user->getEmail(),
            'Email address should be null by default.'
        );

        $user->setEmail('testuser@example.com');

        $this->assertEquals(
            'testuser@example.com',
            $user->getEmail(),
            'The user email address should have been set.'
        );
    }

    public function testEmailMethodsWithEmptyEmail(): void
    {
        $user = new User();
        $user->setEmail('');

        $this->assertEquals(
            null,
            $user->getEmail(),
            'The user email should be unchanged'
        );
    }
}

Here we have two tests:

testEmailMethods() ensures that the user’s email starts off null and, after we set one, getEmail() will give us back that value.
testEmailMethodsWithEmptyEmail() verifies that if we try to set an empty email address we still get null (presumably because of validation somewhere).

However, there’s a small but significant difference here: in the second test, the user’s email is not unchanged, as it went from null to an empty string. As it turns out, our User model doesn’t have any sort of validation, so while an empty string is not a valid email address it is a valid string.

While this may seem trivial, perhaps you need to run a report that counts all users that don’t have an email address (e.g. SELECT COUNT(*) FROM users WHERE email IS NULL); be aware that users with empty (but not null) email addresses will not be counted!

Similarly, you may find yourself running into database errors while saving records, especially if you have a unique index on the email column—multiple null values are permitted, but only one user can have the email address of "".

We can save ourselves the need to troubleshoot this later by using PHPUnit’s assertSame() assertion:

use PHPUnit\Framework\TestCase;

final class UserTest extends TestCase
{
    public function testEmailMethods(): void
    {
        $user = new User();

        $this->assertSame(
            null,
            $user->getEmail(),
            'Email address should be null by default.'
        );

        $user->setEmail('testuser@example.com');

        $this->assertSame(
            'testuser@example.com',
            $user->getEmail(),
            'The user email address should have been set.'
        );
    }

    public function testEmailMethodsWithEmptyEmail(): void
    {
        $user = new User();
        $user->setEmail('');

        $this->assertSame(
            null,
            $user->getEmail(),
            'The user email should be unchanged'
        );
    }
}

Note: Instead of assertSame(null, $actual), you can take advantage of PHPUnit’s assertNull() assertion. In fact, PHPUnit has a whole series of specialized assertions, but that’s a topic for another blog post!

When we run the updated test suite, we’ll see a failure:

1) UserTest::testEmailMethodsWithEmptyEmail
The user email should be unchanged
Failed asserting that '' is identical to null.

By swapping out assertEquals() for assertSame(), our test suite is able to tell us that the code is not behaving how we expect!

Of course, this is just one example. Think about how many places in your codebase a function might return null or false when an error occurs. Should these have the same semantic meaning as 0, "", or []?

As a general rule, you should always default to strict equality checks! With few exceptions, strict equality will produce more-resilient and less-buggy code.
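
strpos() is far from the only built-in with this shape. As another small sketch (the $queue array and its names are invented for illustration), array_search() returns the matching key or false, and the first key of a list is 0:

```php
<?php

$queue = ['alice', 'bob'];

// array_search() returns the key of the match, or false if there is none.
// Passing true as the third argument makes the search itself strict.
$position = array_search('alice', $queue, true);  // int(0)
$absent   = array_search('carol', $queue, true);  // bool(false)

// A loose check cannot tell "found at key 0" apart from "not found".
$looseSaysMissing  = ($position == false);   // true: the wrong conclusion
$strictSaysMissing = ($position === false);  // false: the right one

var_dump($position, $absent, $looseSaysMissing, $strictSaysMissing);
#=> int(0)
#=> bool(false)
#=> bool(true)
#=> bool(false)
```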

Caveat: strict equality of objects

Given what we’ve discussed so far, what are you having for lunch?

if (new stdClass() === new stdClass()) {
    echo "It's pizza day!";
} else {
    echo "Who's feeling tacos?";
}

If you were hoping for pizza, I’m sorry to disappoint you: two objects—even if they contain the exact same values—are only strictly equal if they represent the same instance.

Without getting too far into details, each time you instantiate an object (e.g. via the new keyword), a new instance of that class is created, which has its own unique ID and memory allocation:

$obj1 = new stdClass();
$obj2 = new stdClass();

var_dump(
    spl_object_id($obj1),
    spl_object_id($obj2),

    // Different instances are not strictly equal,
    // but *are* loosely equal.
    $obj1 === $obj2,
    $obj1 == $obj2,

    // An instance is strictly equal to itself
    $obj1 === $obj1,
    $obj2 === $obj2
);

#=> int(1)
#=> int(2)
#=> bool(false)
#=> bool(true)
#=> bool(true)
#=> bool(true)

This is unique to objects: two empty arrays are strictly equal to one another, as are integers, floats, booleans, strings, and nulls.
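
One wrinkle worth knowing for arrays: strict equality compares them by value, but it also requires the same types and the same key order. A quick sketch with made-up values:

```php
<?php

// Arrays are compared by value, so two separate copies are identical.
$emptyArrays = ([] === []);                // true
$sameLists   = ([1, 2, 3] === [1, 2, 3]);  // true

// === also requires matching value types...
$mixedTypesStrict = ([1, 2] === ['1', 2]);  // false
$mixedTypesLoose  = ([1, 2] == ['1', 2]);   // true

// ...and the same key order.
$reorderedStrict = (['a' => 1, 'b' => 2] === ['b' => 2, 'a' => 1]);  // false
$reorderedLoose  = (['a' => 1, 'b' => 2] == ['b' => 2, 'a' => 1]);   // true

var_dump(
    $emptyArrays,
    $sameLists,
    $mixedTypesStrict,
    $mixedTypesLoose,
    $reorderedStrict,
    $reorderedLoose
);
```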

However, since no two objects are the same you need to be careful in how you make assertions against them in tests. A few strategies that work well include:

# 1. Compare specific properties (like IDs)
$this->assertSame($expected->getID(), $actual->getID());

# 2. Compare only the underlying data
$this->assertSame(json_encode($expected), json_encode($actual));
$this->assertSame(serialize($expected), serialize($actual));
$this->assertSame($expected->toArray(), $actual->toArray());

# 3. Just use assertEquals()
$this->assertEquals($expected, $actual);
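
Outside of PHPUnit, the same ideas can be sketched in plain PHP. The Money class below is hypothetical and exists only to illustrate the second strategy; it assumes PHP 7.4 or newer for typed properties:

```php
<?php

// Hypothetical value class, used only for illustration.
final class Money
{
    private int $amount;
    private string $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount   = $amount;
        $this->currency = $currency;
    }

    /** Expose the underlying data so it can be compared strictly. */
    public function toArray(): array
    {
        return ['amount' => $this->amount, 'currency' => $this->currency];
    }
}

$expected = new Money(500, 'USD');
$actual   = new Money(500, 'USD');

// Two instances are never identical, even with the same data...
$sameInstance = ($expected === $actual);  // false

// ...but their underlying data can be compared strictly.
$sameData = ($expected->toArray() === $actual->toArray());  // true

// Loose equality compares the properties of same-class objects.
$looselyEqual = ($expected == $actual);  // true

var_dump($sameInstance, $sameData, $looselyEqual);
```

Comparing toArray() output keeps the strictness of === for every value while ignoring which instance holds the data.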

Whichever route you take, remember that strict comparisons produce better code!

Steve Grunwell

Open-source contributor, speaker, and electronics tinkerer
