Lessons learned on offline capabilities with service workers using Workbox

Target audience

This post is aimed at web developers who are building offline-capable applications, more specifically Single Page Applications (SPAs). I will share some of the lessons I learned, not just for the sake of sharing, but also in the hope of getting feedback from readers who have found other ways to solve the problems described.

What will we discuss in this post?

  1. Serve the index.html for all deeplinks (while offline)
  2. Don't just blindly use skipWaiting() or clients.claim()
  3. Be careful with what you cache in the browser
  4. Avoid using global state in the service worker

All the example code used in this post, along with this complete website, is available on GitHub as an example project for you to test and try out.

Serve the index.html for all deeplinks (while offline)

When developing SPAs we often, if not always, rely on client-side routing. When we go offline we need to make sure that all routes keep working. An example:

  1. A visitor first entered your application via the /about page. The server knows that for every hit on your website, it has to return the index.html file.
  2. After a while, this user returns to your website, but this time enters directly via the /contact page, through a refresh or a link on some other website.
  3. For some reason your application is now offline, or the user has lost connectivity during this navigation. Your service worker kicks in and tries to look up the /contact page in the cache, but gets no hit, because the only static file cached is the index.html file.
  4. Your service worker somehow needs to know which file to serve for this type of request.

Using Workbox this can easily be done by checking whether the request has mode "navigate" and, if so, serving the single index.html file. This index.html file should already be cached by Workbox as a static asset, so we request it from the precache.

sw.js

// default page handler for offline usage,
// where the browser does not know how to handle deep links
// it's a SPA, so each path that is a navigation should default to index.html
workbox.routing.registerRoute(
  ({ event }) => event.request.mode === 'navigate',
  async () => {
    const defaultBase = '/index.html';
    return caches
      .match(workbox.precaching.getCacheKeyForURL(defaultBase))
      .then(response => {
        // fall back to the network if the precache has no entry
        return response || fetch(defaultBase);
      })
      .catch(err => {
        return fetch(defaultBase);
      });
  }
);

This logic speeds up both offline and online usage, because even when your user is online the service worker serves the cached index.html, for example when a different page of your application is opened in a new tab.

Shortcut available

Workbox gives you a shortcut with extra options for this behaviour, like whitelisting and blacklisting, by exposing registerNavigationRoute on the routing module.

sw.js

workbox.routing.registerNavigationRoute(
  // Assuming '/single-page-app.html' has been precached,
  // look up its corresponding cache key.
  workbox.precaching.getCacheKeyForURL('/single-page-app.html'), {
    whitelist: [
      new RegExp('/blog/')
    ],
    blacklist: [
      new RegExp('/blog/restricted/'),
    ]
  }
);

Please also keep in mind that this reflects the behavior of the server in case of a SPA setup. For all requests not matching a static resource like an image, or a dynamic resource like an API, the server should return the index.html file!
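To make the interplay of the whitelist and blacklist options concrete: as far as I know, the navigation route only serves the app shell when the path matches at least one whitelist entry and no blacklist entry, with the blacklist taking precedence. The sketch below re-implements that rule as a plain function. The name shouldServeAppShell is hypothetical and not a Workbox API, and newer Workbox releases rename these options to allowlist and denylist.

```javascript
// Hypothetical helper, not part of Workbox: mirrors the matching rule
// described above so it can be tried outside a service worker.
function shouldServeAppShell(pathnameAndSearch, { whitelist = [/./], blacklist = [] } = {}) {
  // A single blacklist hit wins over any whitelist hit.
  if (blacklist.some((regExp) => regExp.test(pathnameAndSearch))) {
    return false;
  }
  // Otherwise at least one whitelist entry has to match.
  return whitelist.some((regExp) => regExp.test(pathnameAndSearch));
}

const options = {
  whitelist: [new RegExp('/blog/')],
  blacklist: [new RegExp('/blog/restricted/')],
};

const results = {
  post: shouldServeAppShell('/blog/my-post', options),
  restricted: shouldServeAppShell('/blog/restricted/secret', options),
  about: shouldServeAppShell('/about', options),
};
console.log(results);
```

With the options from the example, only /blog/my-post would be answered with the precached app shell. The other two navigations are left to the network or to other routes.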

Don't just blindly use skipWaiting() or clients.claim()

Using skipWaiting() or clients.claim() should not be done blindly and without thought. Unfortunately there are many articles on the internet that just instruct you to do it without much more information. The fact that you have to be careful about this is also clearly mentioned in "Service Worker Lifecycle", an article by Jake Archibald.

"Note: I see a lot of people including clients.claim() as boilerplate, but I rarely do so myself. It only really matters on the very first load, and due to progressive enhancement the page is usually working happily without service worker anyway."

What could possibly go wrong?

There are several scenarios where just calling skipWaiting() or clients.claim() blindly in your service worker could be the cause of unexpected and harmful side effects.

One possible scenario is that the code of your application no longer matches the logic in your service worker, so the update effectively introduces a breaking change. Let's consider a short step-by-step example:

  1. The browser has cached your application code, including the code that registers your service worker, and the service worker itself.
  2. When loading the application, this cached content is served. It then loads your new service worker with updated functionality, which updates the precached content, skips the waiting phase and claims the available clients.
  3. Your user now has the old application loaded in the browser, but it is using the new service worker functionality, which is potentially problematic!

So it's important to always make sure that the loaded version of the logic of your application is working nicely together with the loaded version of the logic of your service worker.
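One way to keep both versions in step is to let the page decide when a waiting service worker may take over, for example after the user confirms an "update available" prompt. Below is a minimal sketch of that idea. The message shape { type: 'SKIP_WAITING' } and the function names are conventions I picked for this example, not something the browser or Workbox prescribes. After the new worker has taken control, the page would still need a reload so that application and service worker match again.

```javascript
// Builds a 'message' handler for the given service worker global scope.
// The scope is passed in so the handler can be exercised outside a worker.
function createSkipWaitingHandler(scope) {
  return (event) => {
    if (event.data && event.data.type === 'SKIP_WAITING') {
      // Only skip the waiting phase when the page explicitly asks for it.
      return scope.skipWaiting();
    }
    return undefined;
  };
}

// Inside sw.js this registers the handler; anywhere else it does nothing.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('message', createSkipWaitingHandler(self));
}

// Page side: call this from an "update now" button (hypothetical name).
// Returns false when no new version is waiting.
function activateWaitingWorker(registration) {
  if (!registration.waiting) {
    return false;
  }
  registration.waiting.postMessage({ type: 'SKIP_WAITING' });
  return true;
}
```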

Service worker active during first-time usage of your application

Not calling skipWaiting() or clients.claim() could be problematic for your use case, especially if you want your service worker to be active from the very first time you load your application. Let's take a basic example to explain this situation:

  1. Your application imports a 3rd party library that sends requests to a server. The URLs these requests have to be sent to are configurable, but the headers used for these requests aren't. You may want to add an Authorization header to each of these requests made.
  2. One of the possible strategies to use could be to setup a service worker that acts as a proxy and adds this header to all relevant requests.
  3. This functionality has to work from the very first time, but you might think this is not possible without skipWaiting() and/or clients.claim().

The only way I have found so far is to check, right after registration, whether a service worker is installing while no other version is waiting to be activated. In that case we can assume there was no previous service worker available, and we can automatically reload the page as soon as the new service worker has changed its state to activated.

service-worker-registration.ts

// Check that service workers are available
if ('serviceWorker' in navigator) {

  // Use the window load event to keep the page load performant
  window.addEventListener('load', () => {

    navigator.serviceWorker
      .register('/sw.js')
      .then(registration => {

        if (navigator.serviceWorker.controller) {
          // let the application know our service worker is ready
          window['serviceWorkerReady'] = true;
          window.dispatchEvent(new CustomEvent('service-worker-ready'));
        }

        // A wild service worker has appeared in registration.installing and maybe in waiting!
        const newWorker = registration.installing;
        const waitingWorker = registration.waiting;

        if (newWorker) {
          if (newWorker.state === 'activated' && !waitingWorker) {
            // reload to avoid skipWaiting and clients.claim()
            window.location.reload();
          }
          newWorker.addEventListener('statechange', () => {
            // newWorker.state has changed
            if (newWorker.state === 'activated' && !waitingWorker) {
              // reload to avoid skipWaiting and clients.claim()
              window.location.reload();
            }
          });
        }

      })
      .catch(err => {
        console.log('service worker could not be registered', err);
      });

  });

}

The drawback is that the first-time use might show a flicker. This side effect can be minimized by showing a loading or splash screen and delaying the bootstrapping of your application until you know that a service worker is active. Delaying the bootstrap is needed anyway in this scenario, because the requests of the 3rd party library must not be sent before the service worker can add the header.
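Delaying the bootstrap could look roughly like the sketch below. It relies on the serviceWorkerReady flag and the service-worker-ready event from the registration code above. The function name whenServiceWorkerReady and the bootstrapApplication placeholder are my own inventions for this example.

```javascript
// Resolves once the registration code has signalled that a service worker
// controls the page. `target` defaults to `window` in the browser and can
// be any EventTarget, which makes the function easy to try in isolation.
function whenServiceWorkerReady(target = window) {
  return new Promise((resolve) => {
    // The flag covers the case where the event was fired before we listened.
    if (target.serviceWorkerReady) {
      resolve();
      return;
    }
    target.addEventListener('service-worker-ready', () => resolve(), { once: true });
  });
}

// Usage in the browser (bootstrapApplication is a placeholder for your
// own start-up code, shown while a splash screen is visible):
// whenServiceWorkerReady().then(() => bootstrapApplication());
```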

Please let me know if you have other suggestions! I'm curious about other methods, because I'm not very happy about the reload.

Be careful with what you cache in the browser

When building your application and thinking about its offline strategies you should definitely consider what you are going to cache offline. Why? Because not all browsers and devices have the same capabilities in terms of storage. The following table shows the available quota per browser and per origin.

Browser   Limit
Chrome    <6% of free space
Firefox   <10% of free space
Safari    <50MB
IE10      <250MB
Edge      Dependent on volume size

All credits for this overview go to the authors of the article Offline Storage for Progressive Web Apps.

This will most probably always be enough space for applications running in a browser on a desktop or laptop. But if you're building a content-heavy application for a tablet or phone with limited storage, this might quickly become a problem.

Imagine a tablet with 32GB of storage of which only 3GB is still free, because of, for example, pictures, music and captured videos. In Chrome, your application can then use at most 6% of 3GB on this tablet, which is roughly 184 MB.
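The arithmetic for the tablet can be written down as a small helper, followed by a way to ask the browser for the real numbers. navigator.storage.estimate() is a standard API, but it is not available in every browser, so the sketch guards against its absence. The function names are made up for this example.

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;

// Rough per-origin budget when a browser grants a fraction of the free space.
function estimateBudgetMB(freeBytes, fraction) {
  return Math.floor((freeBytes * fraction) / MB);
}

// The tablet from the example: 3GB free, 6% of it available to our origin.
const budget = estimateBudgetMB(3 * GB, 0.06);
console.log(budget);

// Where the StorageManager API is supported, the actual usage and quota
// can be requested at runtime. Resolves to null when it is unsupported.
async function readStorageEstimate() {
  if (typeof navigator === 'undefined' || !navigator.storage || !navigator.storage.estimate) {
    return null;
  }
  const { usage, quota } = await navigator.storage.estimate();
  return { usedMB: Math.floor(usage / MB), quotaMB: Math.floor(quota / MB) };
}
```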

While 184 MB might still seem like plenty of space, you have to consider the possibility that it is not enough, especially if you are caching large files, assets from other domains (CORS) or opaque responses. Dealing with opaque responses can be tricky, as highlighted in the article When 7 KB Equals 7 MB by Gerardo Rodriguez.

In general, we want to avoid hitting a DOMException: Quota exceeded as this might potentially break our complete service-worker functionality or offline capabilities.

Workbox allows us to clean up our caches automatically in several ways. We can expire the entries cached by a specific route handler after a certain amount of time, and we can gracefully purge a cache on quota errors.

sw.js

// Cache the underlying font files with a cache-first strategy for 1 year.
workbox.routing.registerRoute(
  /^https:\/\/fonts\.gstatic\.com/,
  new workbox.strategies.CacheFirst({
    cacheName: 'google-fonts-webfonts',
    plugins: [
      new workbox.cacheableResponse.Plugin({
        statuses: [0, 200]
      }),
      new workbox.expiration.Plugin({
        maxAgeSeconds: 60 * 60 * 24 * 365,
        maxEntries: 30,
        purgeOnQuotaError: true // Automatically cleanup if quota is exceeded.
      })
    ]
  })
);

Avoid using global state in the service worker

I've learned this one the hard way, by debugging and debugging and debugging until I went crazy and started searching the internet. Let me first explain my use case, which is in fact the origin of the Authorization header example above.

  1. The application that we are developing includes 3rd party libraries like OpenLayers to give us very nice dynamic map functionality.
  2. The URLs this library uses to load the map tiles are configurable, but the headers aren't. We needed to be able to add an Authorization header to the requests made to these URLs, and using a service worker for this seemed reasonable.
  3. As soon as we log in, we pass the token to the service worker using the BroadcastChannel API, and we store it in a global variable let oAuthToken.
  4. Every time our service worker sees a request for the map tiles, we proxy the request and add the token as a header, so our server/backend can successfully authorize the request.

Now what was going wrong?

It turns out that service workers can be stopped and restarted multiple times in their lifetime because of several optimizations, like battery saving algorithms. Stopping and restarting destroys and reinitializes the global scope of our service worker! So our global variable holding the token was gone, resulting in requests that were not authorized properly.

So how to correctly share the OAuthToken with the service worker?

The way that worked for us to reliably share the OAuthToken from our application logic with our service worker logic is by using the IndexedDB API, which keeps the token in persistent storage that survives a restart of the service worker. The library I use to simplify IndexedDB access in my service worker is idb-keyval. If you need more advanced logic you can check out idb.
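The application side of this hand-over could be sketched as follows. To keep the example runnable anywhere, the storage backend is injected and an in-memory stand-in is used. In a real setup the backend would delegate to idb-keyval as shown in the comment, and only that variant survives a service worker restart. createTokenStore and createMemoryBackend are hypothetical names for this example.

```javascript
// Wraps any async key-value backend that offers get/set/del.
function createTokenStore(backend) {
  return {
    save: (token) => backend.set('token', token),
    load: () => backend.get('token'),
    clear: () => backend.del('token'),
  };
}

// In-memory stand-in, for illustration only. In the browser the backend
// would delegate to idb-keyval (the version exposing Store), for example:
//   const store = new Store('swl-db', 'swl-db-store');
//   const backend = {
//     get: (key) => get(key, store),
//     set: (key, value) => set(key, value, store),
//     del: (key) => del(key, store),
//   };
function createMemoryBackend() {
  const data = new Map();
  return {
    get: async (key) => data.get(key),
    set: async (key, value) => { data.set(key, value); },
    del: async (key) => { data.delete(key); },
  };
}

async function demo() {
  const backend = createMemoryBackend();
  // Application, right after login:
  await createTokenStore(backend).save('Bearer abc123');
  // Service worker, in a fresh instance that shares only the backend:
  return createTokenStore(backend).load();
}
```

The point of the design is that neither side keeps the token in a variable of its own. Every read goes through the shared storage.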

In our example below, we assume that our application has somehow saved a token in IndexedDB. Using Workbox we register a route with a RegExp check for any request we want to authenticate using our token. Let's go through the example, step by step:

  1. We register a route, in this case to trap all the requests going to map.png.
  2. The match callback of the route registration, the first parameter, decides which requests are trapped.
  3. In the handler, the second parameter, we check whether a token is available in IndexedDB. If so, we create a new request and add the token as an Authorization header.
  4. If no token is available, we return a replacement resource, the not_authorized.png image, which has been cached already in our precaching setup.

sw.js

// OAuth header interceptor
// Store and get are assumed to come from idb-keyval,
// loaded into the service worker alongside Workbox.
workbox.routing.registerRoute(
  ({ url }) => {
    // trap every request for map.png
    return /map\.png/.test(url.href);
  },
  async ({ event, url }) => {

    // read the token (if any) from IndexedDB on every request,
    // never from a global variable
    const customStore = new Store('swl-db', 'swl-db-store');
    const oAuthToken = await get('token', customStore);

    // if token available, set it as the Authorization header
    if (oAuthToken) {
      const modifiedHeaders = new Headers(event.request.headers);
      modifiedHeaders.set('Authorization', oAuthToken);
      const overwrite = {
        headers: modifiedHeaders
      };
      // a new request is built from the URL, carrying the modified headers
      const modifiedRequest = new Request(url.toString(), overwrite);
      return fetch(modifiedRequest);
    }

    // no token: serve the precached replacement image instead
    const defaultNotAuthedBase = '/assets/not_authorized.png';
    return caches
      .match(workbox.precaching.getCacheKeyForURL(defaultNotAuthedBase))
      .then(response => {
        return response || fetch(defaultNotAuthedBase);
      })
      .catch(err => {
        return fetch(defaultNotAuthedBase);
      });

  }
);

A possible extension of this functionality would be to check whether the status code of the response coming from the server is as expected, for example 200 OK. If it is not a valid or expected response code, return the cached not_authorized.png image.
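That extension could be sketched like this. The fetch function and the fallback lookup are passed in as parameters, so the logic can be tried outside a service worker. In sw.js they would be fetch and the precache lookup for not_authorized.png. fetchOrFallback is a hypothetical name, and treating only 200 as expected is an assumption you may want to widen.

```javascript
// Returns the server response when its status is expected, otherwise
// (or when the request fails altogether) whatever fallbackFn provides.
async function fetchOrFallback(request, fetchFn, fallbackFn, expectedStatuses = [200]) {
  try {
    const response = await fetchFn(request);
    if (expectedStatuses.includes(response.status)) {
      return response;
    }
  } catch (err) {
    // Network failure: fall through to the fallback below.
  }
  return fallbackFn();
}

// Usage inside the route handler (sketch):
// return fetchOrFallback(
//   modifiedRequest,
//   fetch,
//   () => caches.match(workbox.precaching.getCacheKeyForURL('/assets/not_authorized.png'))
// );
```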

Conclusion

When developing SPAs with offline capabilities, we need to make sure that our service worker is acting the same way as our server. For routing requests or navigations, the service worker should always return the index.html to keep deeplinking working when offline.

Don't just blindly use skipWaiting() or clients.claim(), but think about the possible side effects it could introduce. Instead, let the user decide when to upgrade the functionality or let a new version be applied automatically on a subsequent load of your application.

Be careful with what you cache offline and consider which devices your application will be used on. The available space for your application depends on the free space of the device and the browser in which the application is loaded.

Don't depend on global variables or state in your service worker as service workers are potentially destroyed and bootstrapped several times during their lifetime. Use IndexedDB to share state from your application with your service worker.

Further reading

  1. Service Worker Caching Strategies Based on Request Types
  2. Workbox documentation
  3. The Service Worker Lifecycle
  4. Service workers tips
  5. Offline storage for Progressive Web Apps
  6. When 7 KB Equals 7 MB
  7. When does code in a service worker outside of an event handler run?
  8. High-performance service worker loading

Special thanks to the reviewers of this post for providing valuable and much-appreciated feedback! I'm open to any other feedback, so please let me know!

I hope this article helps you find a solution for your problem. If something still seems a little unclear, you can hire me to help you solve your specific problem or use case. Sometimes even just a quick code review or second opinion can make a great difference.