Chrome, the Background Sync API and exponential backoff

The Background Sync API promises to dramatically improve the web browsing experience for users who go offline or are on crappy connections. Where ServiceWorker has given us the ability to cache GET requests already, this newer API extends the offering to help with offline user input also.

When looking at using this feature, you might be interested in making use of retries with exponential backoff, handled by the browser. In this post, we’re going to explore how this behaves.

If you’re just looking for the conclusions, not how I got there, you can jump straight to the end.

Wait, isn’t this documented anyway?

As far as I can see the answer is no. When searching for information on what Chrome’s retry strategy¹ entails, I always land back at Google’s introductory post on Background Sync, which tells us about the existence of this functionality, but nothing about how it actually behaves:

self.addEventListener("sync", function(event) {
  if (event.tag == "myFirstSync") {
    event.waitUntil(doSomeStuff());
  }
});
 
And that’s it! In the above, doSomeStuff() should return a promise indicating the success/failure of whatever it’s trying to do. If it fulfills, the sync is complete. If it fails, another sync will be scheduled to retry. Retry syncs also wait for connectivity, and employ an exponential back-off.

Why does it matter?

Let’s say a user on a reliable but seriously slow connection is trying to use a messaging web app. You can make use of Background Sync, and its retries, to handle the submission of a message failing due to a timeout (or similar):

User attempts to send a message;
The message is intercepted by the app’s JavaScript, which registers a sync with the service worker;
As the user has a connection, the sync event fires immediately, calling waitUntil();
The attempt to POST the message fails due to the user’s slow connection;
This request is retried until successful, with an exponential backoff.
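
The page-side half of these steps can be sketched roughly as below. This is only an illustration under assumptions: sendMessage, saveToOutbox and the 'sendMessages' tag are hypothetical names that do not appear in Google’s post, and the service worker container is passed in as a parameter (in a page it would be navigator.serviceWorker) so the function can be exercised outside a browser.

```javascript
// Hypothetical page-side helper: store a message, then ask the browser to
// fire a sync event for it. `saveToOutbox` is an assumed app-specific function
// (for example, one backed by IndexedDB) that returns a promise.
async function sendMessage(message, saveToOutbox, serviceWorkerContainer) {
  // Persist first, so the service worker can find the message when sync fires.
  await saveToOutbox(message);

  const registration = await serviceWorkerContainer.ready;
  if (!('sync' in registration)) {
    // Background Sync is unavailable; the caller has to send the message itself.
    return 'sync-unsupported';
  }

  // 'sendMessages' is an assumed tag; the sync handler would match on it.
  await registration.sync.register('sendMessages');
  return 'queued-for-sync';
}
```

In a page this might be called as sendMessage(text, saveToOutbox, navigator.serviceWorker), with the service worker reading the stored message back when its sync event fires.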

Now, for this particular app, we might care about how that backoff behaves. Unless we know this, it’s hard for us to say whether this behaviour will fit our requirements:

Perhaps messages should ideally come through in a timely manner; when a request fails, it should be retried quickly;
Even if the delay ends up being large, due to many failed attempts, we might always want the message to arrive eventually.

This being the case, in order to figure out how retries actually behave, let’s experiment with a basic adaptation of the code shown in Google’s example, and draw conclusions from that.

Test application

We’re going to put together an app which expands only slightly on the example code shown in Google’s post, in order to keep things simple. To do so we need only a small HTML file, and the accompanying JavaScript for the service worker:

index.html

<html>
  <head>
    <script>
      navigator.serviceWorker.register("/sw.js");

      navigator.serviceWorker.ready.then(function(swRegistration) {
        return swRegistration.sync.register('myFirstSync');
      });
    </script>
  </head>
  <body>
    <p>Service worker waitUntil() test</p>
  </body>
</html>

sw.js

self.addEventListener('activate', function(event) {
  console.log("activate");
});

self.addEventListener("sync", function(event) {
  console.log(event);

  event.waitUntil(doSomeStuff());
});

function doSomeStuff() {
  console.log(new Date(Date.now()));

  return Promise.reject(new Error("fail"));
}

Here, we’ve got the setup outlined in the BackgroundSync introduction, with some added logging. We’ve defined doSomeStuff() to always return a failed Promise, which should bubble up to waitUntil(), and so schedule a retry.
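
In a real app the rejection would come from an actual request rather than a hard-coded Promise.reject(). One detail worth keeping in mind is that fetch() only rejects on network-level failures, so an HTTP error response has to be turned into a rejection explicitly if it should count as a failed sync. A minimal sketch of that idea follows; postOrThrow, its fetchImpl parameter and the '/messages' URL are hypothetical and not part of the test app.

```javascript
// Hypothetical helper: POST a JSON payload and reject on any non-2xx response,
// so that a promise passed to waitUntil() reflects HTTP failures too.
// `fetchImpl` is injected (normally the global fetch) to keep this testable.
async function postOrThrow(url, payload, fetchImpl) {
  const response = await fetchImpl(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });

  if (!response.ok) {
    // fetch() resolved, but the server reported an error status.
    throw new Error('Request failed with status ' + response.status);
  }
  return response;
}

// Inside sw.js, doSomeStuff() could then look something like:
// function doSomeStuff() {
//   return postOrThrow('/messages', { text: 'hello' }, fetch);
// }
```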

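We can serve this with Python’s built-in HTTP server, which listens on port 8000 by default. The command below is for Python 2; on Python 3 the equivalent is python3 -m http.server: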

python -m SimpleHTTPServer

Results

Now when loading this up in the browser, we can watch the log to see the retries in action. After waiting a while, the console shows the following:

// Navigated to http://localhost:8000
activate
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:04:58 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:09:58 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: true, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:24:58 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8

Here we see our 1st sync upon opening the page, which fails. 5 minutes later, we have a 2nd attempt, which again fails. Finally, 15 minutes after this, we see the 3rd attempt.

Looking more closely at this 3rd attempt, we can see it has lastChance set to true. This tells us that if we attempt a single sync 3 times, and fail each one, we will never attempt it again.
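
Since the event exposes lastChance, the handler can at least tell when it is on its final attempt. The sketch below is one possible shape for that, not something taken from Google’s post: handleSync, doWork and onGiveUp are hypothetical names, and what to do when giving up (notify the user, keep the data for a manual retry) is left to the app.

```javascript
// Sketch of a sync handler body that reacts to the final attempt.
// `doWork` returns a promise; `onGiveUp` is an assumed app-specific callback.
function handleSync(event, doWork, onGiveUp) {
  const attempt = doWork().catch(function (error) {
    if (event.lastChance) {
      // The browser has flagged this as the last attempt for this sync.
      onGiveUp(error);
    }
    // Re-throw so the promise stays rejected and the failure is still reported.
    throw error;
  });

  event.waitUntil(attempt);
  return attempt;
}

// In sw.js this might be wired up as (notifyUser is hypothetical):
// self.addEventListener('sync', function (event) {
//   handleSync(event, doSomeStuff, notifyUser);
// });
```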

This is what I came to find out. However, I’m keen to see how this plays with sync’s offline handling also.

Backoff and offline handling

I’ll run the test once more, first loading the page while disconnected from the network, and then reconnecting:

// wifi off
// Navigated to http://localhost:8000
activate
// wifi on
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:37:49 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:42:50 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: true, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 11:57:50 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8

Initially, we see no event logged. Shortly after turning the wifi back on, however, the 1st sync fires, and the 2 retries follow at the same 5 and 15-minute intervals as before.

However, I’m still not completely satisfied. What happens if I’m on a dodgy connection, the 1st sync fails, then I lose my connection entirely while the retry is scheduled?

// Navigated to http://localhost:8000
activate
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 12:18:00 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
// wifi off
// Wait until sometime after 12:23:00, then...
// wifi on
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: false, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 12:23:11 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8
SyncEvent {isTrusted: true, tag: "myFirstSync", lastChance: true, type: "sync", target: ServiceWorkerGlobalScope, …}
Mon Feb 12 2018 12:38:33 GMT-0600 (CST)
Uncaught (in promise) Error: fail
    at doSomeStuff (sw.js:14)
    at sw.js:8

As we can see, it makes no difference whether the connection is down during an explicitly triggered sync or during an automatic retry: the event fires once we have a connection again, and any subsequent retries follow at roughly the same 5 and 15-minute intervals after that.

In conclusion

So here’s the rundown of how background sync’s exponential backoff currently behaves in Chrome:

When sync fires initially, upon failure a 2nd sync is scheduled for 5 minutes later;
If a 2nd sync fails, a 3rd sync is scheduled for 15 minutes later;
If a 3rd sync fails, it will not be reattempted;
If Chrome is offline when a sync is expected, it will hold off and fire again once it has a connection.
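
To make the timings in this rundown concrete, here is a small sketch that computes when each attempt would be expected, given the time of the 1st one. The delays are simply the values observed in the logs above, not documented guarantees, and the function and constant names are made up for illustration. It also assumes the browser is online when each retry comes due.

```javascript
// Delays observed in this test, in minutes after the previous failed attempt.
// These are observations, not values documented by Chrome.
const OBSERVED_RETRY_DELAYS_MINUTES = [5, 15];

// Returns the expected time of every attempt, starting with the first.
function expectedAttemptTimes(firstAttempt) {
  const times = [new Date(firstAttempt.getTime())];
  for (const delay of OBSERVED_RETRY_DELAYS_MINUTES) {
    const previous = times[times.length - 1];
    times.push(new Date(previous.getTime() + delay * 60 * 1000));
  }
  return times;
}

// First attempt from the first log above: 11:04:58 CST (UTC-6).
const times = expectedAttemptTimes(new Date('2018-02-12T11:04:58-06:00'));
console.log(times.map((t) => t.toISOString()));
// → [ '2018-02-12T17:04:58.000Z', '2018-02-12T17:09:58.000Z', '2018-02-12T17:24:58.000Z' ]
```

Converted back to CST, those are 11:04:58, 11:09:58 and 11:24:58, matching the first log: the final attempt lands 20 minutes after the original sync.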

What does this mean in practice?

My feeling is that this might make Background Sync’s built-in retries unsuitable for certain applications. With the messaging app example given above, a user on a slow connection might see a very large delay between submitting a message and it being received, or the message might never be received at all.

Perhaps that’s OK. Likely it depends on your application, and likely there are workarounds you can build in while still making use of this API. In order to make that call though, you’ll need to know how it will behave out of the box—if you succeed where I’ve failed in finding that information somewhere, I’d be glad to know!

¹ At time of writing, this API is still in development for Firefox and Edge. It’s possible their implementations here will behave differently.