GitHub Pages sets very aggressive cache headers (Cache-Control: max-age=86400, i.e. 1 day; Expires 1 month ahead) on all served content.

If you update your pages and push to GitHub, people revisiting the pages who already have cached copies will not get the new pages unless they clear their browser cache.

How can a script running in a page determine that it is stale and force an update?

The steps might be:

  1. determine you are running on GitHub Pages: easy, parse window.location for github.com/
  2. determine the current version of the page: hard, git doesn't let you embed the sha1 in a committed page; there is no RCS-style $Id$ keyword. So how do you know what version you are?
  3. get the current version on GitHub: hard, GitHub got rid of the non-authenticated v2 API. There's also a delay between pushing to GitHub and GitHub getting around to publishing. So how do you know what version you could get?
  4. having determined you're stale, how do you invalidate a page and force a reload? hard, window.location.reload(true) doesn't work in Safari/Chrome, for example...

So it's a matter of solving these steps; of course, there may be another way?
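
For step 4, one commonly used workaround is to navigate to the same page with a changed query string, so the browser treats it as a URL it has not cached yet. The following is only a sketch under that assumption: `cacheBustedUrl`, `forceReload` and the `_v` parameter name are hypothetical, and whether an intermediate cache honours the query string is not something the page can guarantee.

```javascript
// Hypothetical helper: return `href` with a `_v` query parameter set to `token`.
// A different query string gives a different URL, so the browser's cached copy
// of the old URL is not reused for it.
function cacheBustedUrl(href, token) {
  const url = new URL(href);
  url.searchParams.set('_v', String(token));
  return url.toString();
}

// Hypothetical helper (browser only): replace the current history entry with
// the cache-busted URL. Call it only after deciding the page is stale,
// otherwise the page would reload in a loop.
function forceReload(token) {
  window.location.replace(cacheBustedUrl(window.location.href, token));
}
```

Using the new version identifier as the token (rather than a random number) means a page that is already up to date keeps a stable URL.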

Will
  • IIRC GitHub only uses caching for the actual pages, not when you access the code of the gh-pages repository through the normal repository view. You might be able to design a script that loads a file from the repository (there are "raw" versions available: only the code, no extra HTML and stuff). That file should then include the timestamp of the last update. Compare that timestamp to the timestamp shipped with the actual page. If they differ, force a reload. I might be wrong on the caching though. – clentfort Sep 23 '12 at 22:26
  • @clentfort exactly as I was imagining; I've broken down what the steps might be in the question now, making clearer what the hurdles are – Will Sep 24 '12 at 05:35
  • Doesn't `max-age=86400` mean 1 *day* ahead (and not one *month* ahead)? – JB Nizet Sep 24 '12 at 06:00
  • @JBNizet yes, the cache-control is set to 1 day ahead and the Expires header to 1 month. – Will Sep 24 '12 at 06:07
  • I'm getting **one day** `Expires` headers for everything. – J. K. Oct 28 '12 at 00:01
  • @IanKuca I'm happily corrected :) – Will Oct 28 '12 at 08:31
  • You can serve pages on your own domain, grepping `window.location` is not a great option. – richo Oct 28 '12 at 13:20
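
The timestamp comparison suggested in the comments could be sketched roughly as follows. Everything named here is an assumption for illustration: `isStale`, `checkForUpdate` and `fetchTextUncached` are hypothetical helpers, the version file is one you would have to publish yourself, and `fetch` with `cache: 'no-store'` is a modern browser API that was not available when the question was asked.

```javascript
// The page is stale when a remote stamp is known and differs from the stamp
// that was embedded in the page when it was built.
function isStale(embeddedStamp, remoteStamp) {
  if (remoteStamp == null) return false;
  const remote = String(remoteStamp).trim();
  return remote !== '' && remote !== String(embeddedStamp).trim();
}

// `fetchText` is injected: any function that takes a URL and returns a promise
// for the text of the version file. On failure we assume "not stale".
async function checkForUpdate(embeddedStamp, versionUrl, fetchText) {
  try {
    const remote = await fetchText(versionUrl);
    return isStale(embeddedStamp, remote);
  } catch (err) {
    return false;
  }
}

// Browser wiring (assumes the Fetch API): `cache: 'no-store'` asks the browser
// to skip its HTTP cache for this one request.
function fetchTextUncached(url) {
  return fetch(url, { cache: 'no-store' }).then(function (res) {
    if (!res.ok) throw new Error('HTTP ' + res.status);
    return res.text();
  });
}
```

A page would call `checkForUpdate(stampBakedIntoPage, urlOfVersionFile, fetchTextUncached)` and only trigger a reload when the promise resolves to `true`.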

1 Answer

To have better control over the caching of your website, you can use the HTML5 cache manifest. See:

You can use window.applicationCache.swapCache() to switch to the updated cached version of your website, so users do not have to reload the page manually.

This is a code example from HTML5 Rocks explaining how to update users to the newest version of your site:

// Check if a new cache is available on page load.
window.addEventListener('load', function(e) {

  window.applicationCache.addEventListener('updateready', function(e) {
    if (window.applicationCache.status == window.applicationCache.UPDATEREADY) {
      // Browser downloaded a new app cache.
      // Swap it in and reload the page to get the new hotness.
      window.applicationCache.swapCache();
      if (confirm('A new version of this site is available. Load it?')) {
        window.location.reload();
      }
    } else {
      // Manifest didn't change. Nothing new to serve.
    }
  }, false);

}, false);

To avoid some confusion I'll add that GitHub sets the correct HTTP headers for cache.manifest files:

Content-Type: text/cache-manifest
Cache-Control: max-age=0
Expires: [CURRENT TIME]

so your browser knows that it's a cache manifest and that it should always be checked for new versions.
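
Since the browser only starts an update when the manifest file itself has changed, a common trick is to put a version comment in the manifest and change it on every deploy. Below is a minimal sketch of generating such a file at build time; `buildManifest` and the file names are hypothetical, and the layout follows the usual `CACHE MANIFEST` / `CACHE:` / `NETWORK:` sections.

```javascript
// Hypothetical build-time helper: produce the text of a cache manifest.
// Changing `version` changes the file's bytes, which is what makes the
// browser download the listed files again.
function buildManifest(version, files) {
  const lines = ['CACHE MANIFEST', '# version ' + version, '', 'CACHE:']
    .concat(files)
    .concat(['', 'NETWORK:', '*']);
  return lines.join('\n') + '\n';
}

// Example: the output would be saved as the manifest file the page points to.
const exampleManifest = buildManifest('2012-10-28', ['index.html', 'app.js']);
```

The page would then reference the generated file through the `manifest` attribute on its `<html>` element.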

rsp
  • this looks exceedingly promising; has this been tested on ghpages? does it work for javascript and artwork hosted there too? – Will Oct 28 '12 at 08:32
  • @IanKuca Have you actually tested it, given that you say it is useless? Read the Offline Web applications section at WHATWG, especially sections [6.7.4 Downloading or updating an application cache](http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html#downloading-or-updating-an-application-cache) and [6.7.5 The application cache selection algorithm](http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html#the-application-cache-selection-algorithm). As I understand it, the code from HTML5 Rocks should work. If it doesn't, then please provide some more details. Thanks. – rsp Oct 28 '12 at 13:18
  • I'm saying it's useless for what the OP would use it for. I love offline web apps and actually develop them myself. – J. K. Oct 28 '12 at 15:49
  • @IanKuca I'd love to hear **why** exactly it is useless to do what the OP wants to do, which is to force an update of a stale website. Are you saying that the examples from [HTML5 Rocks](http://www.html5rocks.com/en/) are not working **despite** being consistent with [the spec](http://www.whatwg.org/specs/web-apps/current-work/multipage/offline.html) **or** that the spec in sections 6.7.4 and 6.7.5 should be understood differently than I understand it? You didn't answer. – rsp Oct 28 '12 at 16:25
  • Github sets the `Expires` header for the cache manifest which makes the whole app cache usage pointless as the manifest gets updated (from the browser's point of view) at the same minimal rate as would the resources listed in it without the use of the appcache. Let me know if you still need a clarification. (The code is totally fine if you can control the manifest headers but in this case, you cannot.) – J. K. Oct 28 '12 at 16:31
  • @IanKuca GitHub sets the Expires header of the cache manifest to **the current time** and more importantly sets Cache-Control to **max-age=0** (the max-age directive always overrides the Expires header in HTTP/1.1 - see RFC 2616) so it has already expired when it gets delivered. There's no need to control the manifest headers because they are fine already, including the Content-Type: text/cache-manifest – rsp Oct 28 '12 at 17:14
  • Oh, my bad. I figured the headers would be the same for all non-HTML files. The script is fine then. Sorry. – J. K. Oct 28 '12 at 17:23
  • Why should you add the listeners on load and not immediately? – netAction Feb 11 '14 at 15:03
  • To anyone running across this in the modern era, as I did: This API is deprecated. If you try to use it in Chrome, it will say: > [Deprecation] Application Cache API use is deprecated and will be removed in M82, around April 2020. See https://www.chromestatus.com/features/6192449487634432 for more details. That webpage will in turn advise you that: > New Web applications should be built around Service Workers. Existing applications that use AppCache should migrate to Service Workers. – Tyler Rick May 13 '20 at 04:37
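
For readers following the Service Worker advice, a network-first strategy is one way to get a similar "always check for a newer copy" behaviour. This is a rough sketch, not a drop-in migration: `networkFirst` and the cache name `pages-v1` are hypothetical, and the worker file still has to be registered from the page with `navigator.serviceWorker.register(...)`.

```javascript
// Try the network first and remember the response; fall back to the stored
// copy only when the network request fails (for example, when offline).
// `fetchFn` and `cache` are injected so the logic can run outside a worker.
async function networkFirst(request, fetchFn, cache) {
  try {
    const response = await fetchFn(request);
    await cache.put(request, response.clone());
    return response;
  } catch (err) {
    const cached = await cache.match(request);
    if (cached) return cached;
    throw err;
  }
}

// Wiring inside a service worker script only. `cache: 'no-cache'` makes the
// browser revalidate with the server instead of trusting the max-age header.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', function (event) {
    if (event.request.method !== 'GET') return; // only GET responses are cached
    event.respondWith(
      caches.open('pages-v1').then(function (cache) {
        const revalidate = function (req) {
          return fetch(req, { cache: 'no-cache' });
        };
        return networkFirst(event.request, revalidate, cache);
      })
    );
  });
}
```

The trade-off is that every page view waits on a network round trip; a cache-first strategy with a background refresh would be faster but can show one stale view.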