Steve Grunwell

Open-source contributor, speaker, and electronics tinkerer

Using Git Checksums to Invalidate Browser Caches

I’ve been having a lot of fun with the new, open-source version of this site and have been looking for opportunities to experiment with things I’d have little reason to do on client sites. For instance, I wanted to see if I could get the current git checksum as a string using PHP (without resorting to shell commands) so that I could display it in the comments of my WordPress template. Once I was able to achieve that, I took it one step further and used it to invalidate browsers’ cached versions of my scripts and styles anytime I made an update to the theme.

Getting the hash

Getting the current revision out of your .git directory isn’t hard; you just need to know where to look. As I understand it, .git/refs/heads holds one file per local branch, and each file tracks the latest commit on that branch. If you open .git/refs/heads/master, you should see the SHA-1 checksum (hash) for the current commit on the master branch. An easy way to read this into a string is PHP’s file_get_contents(), pointed at that file.
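A minimal sketch of that read follows. The helper name read_branch_hash() is a placeholder, and the example assumes the .git directory sits next to the script and that master is the branch you deploy:

```php
<?php
/**
 * Read the commit hash that a branch currently points to.
 * Hypothetical helper for illustration; returns '' when the ref file is missing.
 */
function read_branch_hash( $git_dir, $branch = 'master' ) {
    $ref_file = $git_dir . '/refs/heads/' . $branch;

    if ( ! is_file( $ref_file ) ) {
        return '';
    }

    // The ref file contains the 40-character hash followed by a newline.
    return trim( file_get_contents( $ref_file ) );
}

// Assumption: .git lives in the same directory as this script.
$hash = read_branch_hash( __DIR__ . '/.git' );
```

One caveat: git can move refs out of these loose files and into .git/packed-refs (for example after garbage collection), in which case the file may not exist. That is why the sketch falls back to an empty string rather than assuming the read succeeds.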

Cache busting

Typically, when a browser caches an asset like a JavaScript file or a stylesheet, it keeps reusing the cached copy for as long as the asset is requested at the same URI. Since these types of files rarely use query string arguments (?foo=bar), developers commonly append version numbers (ideal), the current date (not-so-ideal), or other data (ideality varies) that can be changed to invalidate a cache, forcing the browser to download the latest version. In my opinion the version number is the ideal query string addition: if a new version of the script is pushed, the browser should download that version and forget about its cached copy. Realistically, a lot of sites probably don’t have numbered versions for site files – that’s where our git commit can come in handy.

Using the query string concept and our previous example, we could append the current git commit to the end of our assets – in this case, a stylesheet called mystyles.css – so that browsers will download a fresh copy of these assets anytime the current HEAD changes.
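Here is one way that could look in a plain PHP template. The versioned_asset() helper and the ver parameter name are assumptions for illustration; any query string that changes with the commit would do:

```php
<?php
// Assumption: .git sits next to this template and "master" is the deployed branch.
$ref_file = __DIR__ . '/.git/refs/heads/master';
$hash     = is_file( $ref_file ) ? trim( file_get_contents( $ref_file ) ) : '';

/**
 * Append the commit hash to an asset URI as a query string argument.
 * Hypothetical helper; leaves the URI untouched when no hash is available.
 */
function versioned_asset( $uri, $hash ) {
    if ( '' === $hash ) {
        return $uri;
    }

    // Use "&" if the URI already carries a query string.
    $separator = ( false === strpos( $uri, '?' ) ) ? '?' : '&';

    return $uri . $separator . 'ver=' . rawurlencode( $hash );
}

// Print the stylesheet tag with the versioned URI, escaped for use in an attribute.
printf(
    '<link rel="stylesheet" href="%s">' . "\n",
    htmlspecialchars( versioned_asset( 'mystyles.css', $hash ) )
);
```

When the hash can't be read, the stylesheet is still linked without a version, so a missing .git directory degrades to normal caching rather than breaking the page.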

The only major drawback here is that every asset with the current commit appended to it will be invalidated whenever the current HEAD changes, even if that file was unaffected. I’d argue that it’s still better than a datestamp (which would cause browsers to re-download whenever the date changes). Like anything on the web, do what will be best for you and your visitors.

Tying this into WordPress

I’ve expanded our little file_get_contents() call into something more robust, which lets us specify a branch and optionally truncate the returned commit to a particular length.

This will see if the GRUNWELL_CURRENT_GIT_COMMIT constant is defined and, if it isn’t, will read .git/refs/heads/{BRANCH NAME} and save the value to the aforementioned constant. Using the constant allows us to a) avoid reading the file every time we want the hash, b) not have to use global variables, and c) prevent the value from being overwritten – if the hash changes mid-page load, something much more serious is up. As before, if you don’t keep your .git directory in the document root, you may need to adjust this to work in your environment.
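A function along those lines could be sketched as follows. Only the GRUNWELL_CURRENT_GIT_COMMIT constant comes from the description above; the function name, its signature, and the use of __DIR__ to locate .git are assumptions for illustration:

```php
<?php
/**
 * Get the current commit hash for a branch, caching it in a constant.
 * Sketch only: the name and parameters are placeholders.
 *
 * @param string $branch Branch name under .git/refs/heads.
 * @param int    $length Optional. Truncate the hash to this many characters (0 = full hash).
 * @return string The hash, or '' if the ref file could not be read.
 */
function grunwell_get_current_git_commit( $branch = 'master', $length = 0 ) {
    if ( ! defined( 'GRUNWELL_CURRENT_GIT_COMMIT' ) ) {
        // Assumption: .git lives in the same directory as this file.
        $ref_file = __DIR__ . '/.git/refs/heads/' . $branch;
        $hash     = is_file( $ref_file ) ? trim( file_get_contents( $ref_file ) ) : '';

        // Store the full hash once; later calls skip the file read.
        define( 'GRUNWELL_CURRENT_GIT_COMMIT', $hash );
    }

    $hash = GRUNWELL_CURRENT_GIT_COMMIT;

    return ( $length > 0 ) ? substr( $hash, 0, $length ) : $hash;
}
```

Because there is a single constant, whichever branch is requested first wins for the rest of the page load. That seems acceptable when a site only ever deploys one branch, but it is worth knowing before passing different branch names around.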

To make this useful within WordPress, we can pass this hash as the $version argument in the script/style registration and enqueue functions (wp_register_script(), wp_enqueue_script(), etc.). This will append ?ver={HASH} to the end of external scripts and styles, which should be enough to force browsers to refresh (unless they’ve already grabbed the latest).

For example, say our “mystyles.css” file lives in our WordPress theme “my-theme”. We would enqueue it with the hash as its version.
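A sketch of that enqueue, as it might appear in the theme’s functions.php. The function name and the “mystyles” handle are placeholders, and the example assumes the theme directory is the root of the git repository with master deployed:

```php
<?php
/**
 * Enqueue mystyles.css with the current commit hash as its version.
 */
function mytheme_enqueue_styles() {
    // Assumption: .git lives in the theme directory.
    $ref_file = get_template_directory() . '/.git/refs/heads/master';
    $hash     = is_file( $ref_file ) ? trim( file_get_contents( $ref_file ) ) : '';

    wp_enqueue_style(
        'mystyles',
        get_template_directory_uri() . '/mystyles.css',
        array(),
        // Use a shortened hash; false lets WordPress fall back to its default version.
        ( '' !== $hash ) ? substr( $hash, 0, 8 ) : false
    );
}

// Guarded so the file can also be loaded outside WordPress (for example in a test).
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_enqueue_scripts', 'mytheme_enqueue_styles' );
}
```

With this in place, the stylesheet URL ends in ?ver= followed by the first eight characters of the commit hash, so it changes whenever a new commit lands on the branch.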

Current commit shortcode

Another fun (and I mean that in the geekiest sense of the word) thing you could do is build a “current commit” shortcode so you can dynamically output the current site revision.
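A rough sketch of such a shortcode is below. The [current_commit] tag, its branch and length attributes, and the function name are all assumptions for illustration, as is the location of .git next to this file:

```php
<?php
/**
 * Shortcode callback: output the current commit hash for a branch.
 * Example usage in a post: [current_commit length="7"]
 */
function mytheme_current_commit_shortcode( $atts ) {
    $atts = shortcode_atts(
        array(
            'branch' => 'master',
            'length' => 7,
        ),
        $atts,
        'current_commit'
    );

    // Only accept simple branch names so the attribute cannot point outside refs/heads.
    if ( ! preg_match( '/^[A-Za-z0-9._-]+$/', (string) $atts['branch'] ) ) {
        return '';
    }

    // Assumption: .git lives in the same directory as this file.
    $ref_file = __DIR__ . '/.git/refs/heads/' . $atts['branch'];
    $hash     = is_file( $ref_file ) ? trim( file_get_contents( $ref_file ) ) : '';

    // Truncate when a positive length is given, otherwise return the full hash.
    $length = (int) $atts['length'];
    if ( $length > 0 ) {
        $hash = substr( $hash, 0, $length );
    }

    return esc_html( $hash );
}

// Guarded so the file can also be loaded outside WordPress (for example in a test).
if ( function_exists( 'add_shortcode' ) ) {
    add_shortcode( 'current_commit', 'mytheme_current_commit_shortcode' );
}
```

The branch check is deliberately strict, so branch names containing slashes (feature/foo and the like) are rejected in this version. Since shortcode attributes come from post content, erring on the side of caution seemed like the safer default.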

Wrapping up

There are a few practical (and a million geeky) things to do with the current commit’s hash; cache busting just happens to be one of them. What else can you think to do (or have you done before) with the hash?

2 Comments

  1. Great post. We will be testing out these techniques in the coming days. Another option is to use file mod time, rather than the current time.

  2. Thanks for this idea. An improved version of it may be using last commit hash of the file. This way you don’t flush cache if you commit something else. See http://stackoverflow.com/questions/4784575/how-do-i-find-the-most-recent-git-commit-that-modified-a-file
