
Currently, we're using a method to bust browser-cached resources like CSS and JS, similar to what SE does: https://meta.stackexchange.com/questions/112182/how-does-se-determine-the-css-and-js-version-parameter

Anyway, after doing some testing with HTTP headers, I'm wondering when this is actually necessary. Is this just a relic left over from the 90s, or are there modern browsers that can't read the Last-Modified or ETag HTTP headers?

Wade Williams

1 Answer


Caching Issues

Sometimes you are serving JS or CSS that is volatile, and you don't want to, or can't (e.g. because you're using a CDN), rely on HTTP cache directive headers to make the browser request the new files. Some older browsers don't respond to HTTP cache directives, so if you are targeting them you have limited options. Older browsers aside, some proxy servers strip, invalidate, or ignore cache headers because they are buggy or because they are acting as aggressive caches, so HTTP cache-control headers will not help you there. Changing the URL is how you ensure your end users don't get odd functionality until they hit F5.
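
As a rough sketch of what this looks like in practice (the helper function and the ASSET_VERSION constant are my own illustration, not SE's actual mechanism):

```typescript
// Minimal query-string cache busting, assuming some build-time version
// identifier (a git hash, build number, etc.) is available. Names are
// illustrative only.
const ASSET_VERSION = "a1b2c3";

/** Append a version parameter so a changed asset gets a brand-new URL. */
function bustCache(assetUrl: string): string {
  const separator = assetUrl.includes("?") ? "&" : "?";
  return `${assetUrl}${separator}v=${encodeURIComponent(ASSET_VERSION)}`;
}

// The browser treats each version as a distinct cacheable URL.
console.log(bustCache("/static/site.css"));       // /static/site.css?v=a1b2c3
console.log(bustCache("/static/app.js?lang=en")); // /static/app.js?lang=en&v=a1b2c3
```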

Volatile JS/CSS resources can come from files/resources that are editable through an administration/configuration panel. Some reasons for this are theming, layout editing, or language definition files for internationalization.
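
For example (purely a sketch; the Theme shape and the in-memory store are assumptions on my part), such a resource can be versioned by its last save time, so an admin edit automatically produces a new URL:

```typescript
// Version an admin-editable theme stylesheet by its last save time.
interface Theme {
  css: string;
  updatedAt: Date; // refreshed whenever an admin saves the theme
}

// Stand-in store; in practice this would be a database or config file.
const themes = new Map<string, Theme>([
  ["default", { css: "body { color: #333; }", updatedAt: new Date() }],
]);

function themeStylesheetUrl(name: string): string {
  const theme = themes.get(name);
  if (!theme) throw new Error(`unknown theme: ${name}`);
  // The timestamp changes on every edit, so browsers re-fetch after a save.
  return `/themes/${encodeURIComponent(name)}.css?v=${theme.updatedAt.getTime()}`;
}

console.log(themeStylesheetUrl("default")); // e.g. /themes/default.css?v=1354907700000
```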

HTTP 1.0

There are legacy systems out there that still use it. Consider that Oracle's built-in HTTP server (the EPG gateway) in its RDBMS product still uses it. Some proxies translate 1.1 requests to 1.0. Ancient browsers still only support 1.0, but that should be a relative non-issue these days.

Whatever the case, HTTP 1.0 had a set of cache control mechanisms that were primitive compared to HTTP 1.1's offering, and browsers leaned on a lot of heuristics that weren't specified in the RFC to get caching to work reasonably well. In either case, caching would often cause odd behavior, with stale content being delivered or the same content being requested again even though nothing had changed.
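
For context (this is my own sketch, not something from the original answer), a server that wants to give both generations of clients a caching hint can send the HTTP/1.0-era Expires header alongside the HTTP/1.1 Cache-Control header:

```typescript
// Send both cache headers: HTTP/1.1 clients prefer Cache-Control, while
// HTTP/1.0-only clients and proxies understand the absolute Expires date.
import * as http from "node:http";

const server = http.createServer((_req, res) => {
  const oneHourMs = 60 * 60 * 1000;
  res.setHeader("Cache-Control", "public, max-age=3600");                   // HTTP/1.1
  res.setHeader("Expires", new Date(Date.now() + oneHourMs).toUTCString()); // HTTP/1.0
  res.setHeader("Content-Type", "text/css");
  res.end("body { margin: 0; }");
});

server.listen(8080);
```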

A note on Pragma: no-cache

It works only on REQUESTS, not RESPONSES, which is something many people don't know. It was meant to keep intermediate systems from caching sensitive information. HTTP 1.1 still supports it for backwards compatibility, but it shouldn't be relied on because it is deprecated.

...except where Microsoft says IE doesn't do that: http://support.microsoft.com/kb/234067
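
To make the request/response distinction concrete (a sketch only; the header names are standard, but the helper itself is an assumption of mine), a client can legitimately send Pragma on its own requests, typically alongside the HTTP/1.1 equivalent:

```typescript
// Pragma is a *request* header asking intermediaries not to serve a cached
// copy; Cache-Control: no-cache is the HTTP/1.1 way to say the same thing.
async function fetchFresh(url: string): Promise<string> {
  const response = await fetch(url, {
    headers: {
      Pragma: "no-cache",          // HTTP/1.0 compatibility
      "Cache-Control": "no-cache", // modern equivalent
    },
  });
  return response.text();
}
```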

Input For Generated Content

Yet another reason is JS or CSS that is generated based on input parameters. Just because the URL includes somefile.js does not mean it has to be a real file on a file system; it could just be JS that is output from a process. Should that process need to output different content based on parameters, GET parameters are a good way to make that happen.
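
A rough sketch of the idea (the /somefile.js endpoint and the locale parameter are invented purely for illustration):

```typescript
// Serve "somefile.js" as generated output rather than a file on disk,
// varying the response based on a GET parameter.
import * as http from "node:http";

const greetings: Record<string, string> = { en: "Hello", fr: "Bonjour" };

const server = http.createServer((req, res) => {
  const url = new URL(req.url ?? "/", "http://localhost");
  if (url.pathname === "/somefile.js") {
    const locale = url.searchParams.get("locale") ?? "en";
    const greeting = greetings[locale] ?? greetings["en"];
    res.setHeader("Content-Type", "application/javascript");
    res.end(`window.APP_GREETING = ${JSON.stringify(greeting)};`);
    return;
  }
  res.statusCode = 404;
  res.end();
});

server.listen(8080); // GET /somefile.js?locale=fr returns the French variant
```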

Consider page versioning. In large applications, where pages may be kept around for historical or business reasons, this allows a resource with the same name to exist in several versions, and a specific version can be served when needed. You could save each version in a different file, or you could create a process that outputs the right content with the correct version changes.

Old Browser Issues

In IE6, AJAX requests were subject to the browser cache. If you were requesting a service you did not control, with a URL that never changed, adding a trivial random string to the URL would circumvent that issue.
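
A minimal sketch of that workaround (the parameter name and the timestamp value are arbitrary; the only point is that the URL differs on every call):

```typescript
// Defeat the IE6-era XHR cache by making every request URL unique.
function getUncached(url: string, onLoad: (body: string) => void): void {
  const separator = url.includes("?") ? "&" : "?";
  const bustedUrl = `${url}${separator}_=${Date.now()}`; // throwaway parameter
  const xhr = new XMLHttpRequest();
  xhr.open("GET", bustedUrl);
  xhr.onreadystatechange = () => {
    if (xhr.readyState === 4 && xhr.status === 200) {
      onLoad(xhr.responseText);
    }
  };
  xhr.send();
}
```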

Browser Cache Options

If we look at what the HTTP 1.1 RFC says about user agent cache settings, we also see this:

Many user agents make it possible for users to override the basic caching mechanisms. For example, the user agent might allow the user to specify that cached entities (even explicitly stale ones) are never validated. Or the user agent might habitually add "Cache-Control: max-stale=3600" to every request. The user agent SHOULD NOT default to either non-transparent behavior, or behavior that results in abnormally ineffective caching, but MAY be explicitly configured to do so by an explicit action of the user.

Altering the URL to version resources could be considered a countermeasure to such an issue. Whether you believe it is worthwhile I will leave up to the reader.

Conclusion

There are reasons to add GET parameters to a file request, but realistically the only reasons to do that now (writing as of 2012) are to supply input parameters for dynamically generated scripts and to work around cases where you can't control the cache headers.

Personally, I only use them for providing input parameters to scripts that dynamically output initialization scripts, but as with everything in development, there is always some edge case that adds a reason.

Andrew Martinez
  • Found a list of browsers that don't respect the cache (http://stackoverflow.com/questions/49547/making-sure-a-web-page-is-not-cached-across-all-browsers): Internet Explorer 6-8; Firefox 1.5-3.0; Safari 3; Opera 9. – Wade Williams Dec 07 '12 at 19:35