Steve Grunwell

Open-source contributor, speaker, and electronics tinkerer


Strict Equality for Better Code

A major focus of my day job right now is cleaning up the PHP in a decades-old monolith. This includes tests written for two different test runners (PHPUnit and an in-house library) by hundreds of engineers over the years.

I could write a book on the horrors I’ve seen (and currently have at least half a dozen blog posts in draft state), but I’m not interested in raking anyone over the coals for past engineering decisions—honestly, it’s to be expected with any project this size and age. Instead, I wanted to take a moment to talk about one of the most prevalent oversights made by engineers of all levels: strict equality.

Not all equals are made equal

In PHP, JavaScript, and many other loosely typed languages, there are two ways to determine whether two values are equal:

  1. Loose equality (e.g. ==, or “double equals”)
  2. Strict equality (e.g. ===, or “triple equals”)

How do these compare? The main difference is that loose equality checks don’t account for types when making a comparison; when comparing a string against an integer, PHP first coerces the two operands to a common type, so 123 == '123'. In fact, all of the following evaluate as true when using loose equality:
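As a minimal sketch (these particular comparisons are illustrative, chosen because they give the same result on PHP 7 and PHP 8):

```php
<?php

// Every expression here uses loose equality (==) and evaluates to true.
$looseComparisons = [
    "123 == '123'"          => 123 == '123',
    'false == 0'            => false == 0,
    "false == ''"           => false == '',
    "false == '0'"          => false == '0',
    'false == []'           => false == [],
    'false == null'         => false == null,
    'true == new stdClass'  => true == new stdClass(),
];

foreach ($looseComparisons as $expression => $result) {
    echo $expression, ' => ', var_export($result, true), PHP_EOL;
}
```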

If you’re familiar with PHP’s empty() function, those “falsey” values should look familiar, as all of them are considered to be empty.

Now, if we replaced the double equal sign with a triple equals, all of those evaluations would be false:
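A sketch of the same idea with strict equality, again using illustrative comparisons rather than a definitive list:

```php
<?php

// The same kinds of comparisons with strict equality (===): all false,
// because the operand types differ in every case.
$strictComparisons = [
    "123 === '123'"          => 123 === '123',
    'false === 0'            => false === 0,
    "false === ''"           => false === '',
    "false === '0'"          => false === '0',
    'false === []'           => false === [],
    'false === null'         => false === null,
    'true === new stdClass'  => true === new stdClass(),
];

foreach ($strictComparisons as $expression => $result) {
    echo $expression, ' => ', var_export($result, true), PHP_EOL;
}
```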

This is because strict equality accounts for the type of a variable: an object is not the same as an array, which is not the same as an integer nor a boolean.

When we think about equality, this makes sense: should an object—even one without any properties assigned to it—be equivalent to the boolean true?

Loose equality laughs in the face of the Transitive Property

If you think back to grade school math classes, you might recall the Transitive Property of Equality, which states:

If two values are equal, and either of those two values is equal to a third value, then all three values must be equal.

Written algebraically, if a = b and b = c, then a = c.

Sadly, loose equality must have been out sick the day this was taught because the Transitive Property does not apply to loose equality:
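One illustrative trio of values (my own choice, picked because it behaves the same on PHP 7 and PHP 8) shows the break:

```php
<?php

$a = '0';
$b = false;
$c = '';

$aEqualsB = ($a == $b); // '0' is a falsey string, so this is true
$bEqualsC = ($b == $c); // '' is falsey as well, so this is true
$aEqualsC = ($a == $c); // two different strings, so this is false

var_dump($aEqualsB); // bool(true)
var_dump($bEqualsC); // bool(true)
var_dump($aEqualsC); // bool(false)
```

If loose equality were transitive, the third comparison would have to be true as well.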

In a world where fundamental mathematical principles don’t apply, you can see how bugs might sneak in!

Loose equality and conditionals

Conditional statements (e.g. if/else) are a foundational part of [nearly] every programming language, and for good reason: if some expression evaluates as true, do this; otherwise, do that:
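A minimal sketch of that shape, using a hypothetical describeTemperature() function:

```php
<?php

// describeTemperature() is a made-up helper for illustration only.
function describeTemperature(int $celsius): string
{
    if ($celsius > 25) {
        return 'warm'; // the expression was true: do this
    } else {
        return 'cool'; // otherwise: do that
    }
}

echo describeTemperature(30), PHP_EOL; // warm
echo describeTemperature(10), PHP_EOL; // cool
```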

It’s worth noting that PHP handles conditionals with loose equality, so anything loosely equal to true will evaluate as such:
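As a rough sketch, assuming a hypothetical isTruthy() helper that simply reports which branch a bare conditional would take:

```php
<?php

// isTruthy() is a made-up helper: it returns whatever branch a bare
// `if ($value)` would take for the given value.
function isTruthy($value): bool
{
    if ($value) {
        return true;
    }

    return false;
}

var_dump(isTruthy('some string')); // bool(true)
var_dump(isTruthy(1));             // bool(true)
var_dump(isTruthy([0]));           // bool(true): a non-empty array
var_dump(isTruthy('0'));           // bool(false)
var_dump(isTruthy([]));            // bool(false)
var_dump(isTruthy(null));          // bool(false)
```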

However, we can also make our conditionals use strict equality by including strict comparisons in our expressions:
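A small sketch of the contrast, with illustrative variable names:

```php
<?php

$value = 'some string';

// Bare conditional: any truthy value takes the first branch.
if ($value) {
    $looseBranch = 'taken';
} else {
    $looseBranch = 'skipped';
}

// Strict comparison inside the conditional: only the boolean true passes.
if (true === $value) {
    $strictBranch = 'taken';
} else {
    $strictBranch = 'skipped';
}

echo $looseBranch, PHP_EOL;  // taken
echo $strictBranch, PHP_EOL; // skipped
```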

This works because true === 'some string' is an expression itself, which evaluates as false.

Why is this important? Let’s consider PHP’s strpos() function, which returns the index of a substring within a string, or false if the substring is not found:
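For instance, with an illustrative haystack string:

```php
<?php

$haystack = 'Hello, there!';

$middle  = strpos($haystack, 'there');   // found part-way through the string
$start   = strpos($haystack, 'Hello');   // found at the very beginning
$missing = strpos($haystack, 'Goodbye'); // not found at all

var_dump($middle);  // int(7)
var_dump($start);   // int(0)
var_dump($missing); // bool(false)
```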

Observant readers might already see the problem here: the string uses a zero-based index, so the first character has an index of 0. If we’re not careful, we could easily fail to detect the string:

When we run this code, we’ll get the following output:
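As an illustrative sketch, assuming two hypothetical helpers named looseCheck() and strictCheck(), the difference looks like this, with the printed output noted in the trailing comments:

```php
<?php

// looseCheck() and strictCheck() are made-up names for illustration.
function looseCheck(string $haystack, string $needle): string
{
    // Bug: strpos() returns 0 for a match at the start, and !0 is true.
    if (! strpos($haystack, $needle)) {
        return sprintf('Did not detect "%s" (loose check)', $needle);
    }

    return sprintf('Detected "%s" (loose check)', $needle);
}

function strictCheck(string $haystack, string $needle): string
{
    // Only the boolean false means "not found".
    if (false === strpos($haystack, $needle)) {
        return sprintf('Did not detect "%s" (strict check)', $needle);
    }

    return sprintf('Detected "%s" (strict check)', $needle);
}

echo looseCheck('Hello, there!', 'Hello'), PHP_EOL;  // Did not detect "Hello" (loose check)
echo strictCheck('Hello, there!', 'Hello'), PHP_EOL; // Detected "Hello" (strict check)
```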

Wait, what happened? Since when does “Hello, there!” not contain the substring “Hello”?!

Remember: the integer 0 (the index of “Hello” in the string) is loosely equal to the boolean false, so a loose check treats the match as a miss. However, when our conditional specifically asks “is the return value strictly equal to the boolean false?”, the answer for 0 is no, and thus we don’t print a “Did not detect” message.

Loose equality makes for poor tests

One of the places where loose equality can be most dangerous is in tests, because you’d better be sure that the result you’re getting is exactly what you expected.

Imagine you’re writing tests for the following User model:
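A minimal sketch of such a model; getEmail() comes from the discussion here, while the setEmail() name and the typed property are my assumptions:

```php
<?php

final class User
{
    // The email starts off null until one is set.
    private ?string $email = null;

    public function getEmail(): ?string
    {
        return $this->email;
    }

    // Assumed setter name; note the lack of any validation.
    public function setEmail(string $email): void
    {
        $this->email = $email;
    }
}
```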

Passing around email addresses as strings? Did the author of that code not read my post about value objects?!

Anyway, it’s not uncommon to find tests for a class like this written in this way:
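Something along these lines, as a sketch that assumes PHPUnit is installed (for example via Composer) and repeats a bare-bones User class with an assumed setEmail() method so the listing stands on its own:

```php
<?php

use PHPUnit\Framework\TestCase;

// Bare-bones stand-in for the model; setEmail() is an assumed name.
final class User
{
    private ?string $email = null;

    public function getEmail(): ?string
    {
        return $this->email;
    }

    public function setEmail(string $email): void
    {
        $this->email = $email;
    }
}

final class UserTest extends TestCase
{
    public function testEmailMethods(): void
    {
        $user = new User();
        $this->assertEquals(null, $user->getEmail());

        $user->setEmail('test@example.com');
        $this->assertEquals('test@example.com', $user->getEmail());
    }

    public function testEmailMethodsWithEmptyEmail(): void
    {
        $user = new User();
        $user->setEmail('');

        // Passes, because assertEquals() compares loosely and '' == null.
        $this->assertEquals(null, $user->getEmail());
    }
}
```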

Here we have two tests:

  1. testEmailMethods() ensures that the user’s email starts off null and, after we set one, getEmail() will give us back that value.
  2. testEmailMethodsWithEmptyEmail() verifies that if we try to set an empty email address we still get null (presumably because of validation somewhere)

However, there’s a small but significant difference here: in the second test, the user’s email has actually changed, going from null to an empty string. As it turns out, our User model doesn’t have any sort of validation, so while an empty string is not a valid email address, it is a valid string.

While this may seem trivial, perhaps you need to run a report that counts all users that don’t have an email address (e.g. SELECT COUNT(*) FROM users WHERE email IS NULL); be aware that users with empty (but not null) email addresses will not be counted!

Similarly, you may find yourself running into database errors while saving records, especially if you have a unique index on the email column—multiple null values are permitted, but only one user can have the email address of "".

We can save ourselves the need to troubleshoot this later by using PHPUnit’s assertSame() constraint:
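A sketch of the stricter version of the empty-email test, again assuming PHPUnit is installed and using a bare-bones User class with an assumed setEmail() method:

```php
<?php

use PHPUnit\Framework\TestCase;

// Bare-bones stand-in for the model; setEmail() is an assumed name.
final class User
{
    private ?string $email = null;

    public function getEmail(): ?string
    {
        return $this->email;
    }

    public function setEmail(string $email): void
    {
        $this->email = $email;
    }
}

final class UserTest extends TestCase
{
    public function testEmailMethodsWithEmptyEmail(): void
    {
        $user = new User();
        $user->setEmail('');

        // Fails: assertSame() compares with ===, and '' is not null.
        $this->assertSame(null, $user->getEmail());
    }
}
```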

When we run the updated test suite, we’ll see a failure:

By swapping out assertEquals() for assertSame(), our test suite is able to tell us that the code is not behaving how we expect!

Of course, this is just one example. Think about how many places in your codebase a function might return null or false when an error occurs. Should these have the same semantic meaning as 0, "", or []?

As a general rule, you should always default to strict equality checks! With few exceptions, strict equality will produce more resilient, less buggy code.

Caveat: strict equality of objects

Given what we’ve discussed so far, what are you having for lunch?
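As an illustrative sketch, assuming a hypothetical Meal class:

```php
<?php

// Meal is a made-up class for illustration.
final class Meal
{
    public string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

$ordered   = new Meal('pizza');
$delivered = new Meal('pizza');

// Same class, same values, but two separate instances.
$lunch = ($ordered === $delivered) ? 'pizza' : 'leftovers';

echo $lunch, PHP_EOL; // leftovers
```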

If you were hoping for pizza, I’m sorry to disappoint you: two objects—even if they contain the exact same values—are only strictly equal if they represent the same instance.

Without getting too far into details, each time you instantiate an object (e.g. via the new keyword), a new instance of that class is created, which has its own unique ID and memory allocation:
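One way to see this, as a small sketch using PHP’s built-in spl_object_id() function (available since PHP 7.2):

```php
<?php

$first  = new stdClass();
$second = new stdClass();
$alias  = $first; // a second variable pointing at the same instance

// Two live objects never share an ID.
$idsMatch = spl_object_id($first) === spl_object_id($second);

var_dump($idsMatch);          // bool(false)
var_dump($first === $second); // bool(false): different instances
var_dump($first === $alias);  // bool(true): the same instance
```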

This is unique to objects: two empty arrays are strictly equal to one another, as are two integers, floats, booleans, or strings with the same value, and any two nulls.

However, since no two objects are the same you need to be careful in how you make assertions against them in tests. A few strategies that work well include:
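As a hedged sketch of three comparisons PHP itself supports (the Money class is hypothetical, and these are not necessarily the only options worth considering):

```php
<?php

// Money is a made-up value class for illustration.
final class Money
{
    public int $amount;
    public string $currency;

    public function __construct(int $amount, string $currency)
    {
        $this->amount   = $amount;
        $this->currency = $currency;
    }
}

$expected = new Money(500, 'USD');
$actual   = new Money(500, 'USD');

// 1. Strictly compare the scalar values the objects hold.
$sameValues = $expected->amount === $actual->amount
    && $expected->currency === $actual->currency;

// 2. Loose object equality: same class and equal property values.
$looselyEqual = ($expected == $actual);

// 3. Identity: true only when both variables hold the same instance.
$sameInstance = ($expected === $actual);

var_dump($sameValues);   // bool(true)
var_dump($looselyEqual); // bool(true)
var_dump($sameInstance); // bool(false)
```

In PHPUnit terms, the first corresponds to assertSame() on individual properties, the second to assertEquals() on the objects, and the third to assertSame() on the objects themselves.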

Whichever route you take, remember that strict comparisons produce better code!
