Predictive Pre-fetch

Browser hints like prefetch let you fetch critical resources in advance and save valuable (next) render time. These speculative optimizations build in the developer’s assumptions about the user’s route. Speculative prefetching can be wasteful, because it sometimes fetches resources that will never be used.

Leaning on advances in machine learning and analytics data allows us to significantly increase the efficacy of our fetches. Let’s explore techniques that move predictive prefetching from idea to reality.


Divya Sasidharan, Developer Experience Engineer, Netlify

Divya currently lives in the United States where people love their french fries… and you usually get two condiments: ketchup or mayo. Since most Americans prefer ketchup, you get it by default – you don’t have to ask for it.

It’s a frictionless, nice experience… and that’s what we want on the web too. If we apply it to users browsing the web, they should just get the content they want without having to constantly ask. This is prefetching – having the browser fetch information before the user specifically requests it.

To set the scene, Divya runs through how websites load – from the request, through the connection dance, to the point where data can be transferred. That’s one level of latency. The next level is fetching the resources – HTML comes down first, then other resources. This all adds to the rendering time that the user can see. This is basically the status quo.

Divya will be focusing on…

  • DNS Prefetching
  • Link Prefetching
  • Prerendering

DNS Prefetching is the first and simplest level. It resolves a host name ahead of time, so future requests to that host can skip the DNS lookup.

Link Prefetching goes one step further, fetching priority resources that the developer has specified are important to load ahead of time.

Prerendering is the most extreme form of prefetching – it fetches an entire page and renders it ahead of time, in a different virtual layer. This makes navigation instantaneous… assuming the user goes to that page next.
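As a rough sketch of what these three hints look like in markup, the TypeScript below builds the corresponding link tags as strings. The `resourceHint` helper and the example URLs are hypothetical, and browser support for each `rel` value varies, so treat this as an illustration rather than a recommendation.

```typescript
// The three hint types covered in the talk, as `rel` values on a <link> tag.
type HintKind = "dns-prefetch" | "prefetch" | "prerender";

// Build a <link> tag for the document <head>. `href` is assumed to be a
// trusted, already-escaped URL (hypothetical input for this sketch).
function resourceHint(kind: HintKind, href: string): string {
  return '<link rel="' + kind + '" href="' + href + '">';
}

const hints: string[] = [
  resourceHint("dns-prefetch", "//cdn.example.com"), // resolve a host name early
  resourceHint("prefetch", "/menu/"),                // fetch a likely next resource
  resourceHint("prerender", "/menu/"),               // fetch and render a whole page
];

console.log(hints.join("\n"));
```

The further down this list a hint sits, the more work the browser does on a guess, which is why the cost of a wrong guess rises along with the benefit of a right one.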

Hint            Cost if wrong    Benefit if correct
DNS Prefetch    Very low         Low
Link Prefetch   Mid-high         High
Prerender       Very high        Very high

It is useful for us to consider both the costs and the benefits of these techniques.

Some use cases are more predictable than others, e.g. the top results on a search page, or the post-login screen of an application.

These techniques all assume that we as devs know what the user is going to do… but we are speculating. Users aren’t really that predictable.

So a predictive approach is better.

An illustrative example is predicting weather – you predict future days based on past days. If you know that 80% of the time a cloudy day is followed by another cloudy day, you can make a reasonable prediction if today is cloudy.
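The weather idea can be sketched in a few lines of TypeScript: count how often each kind of day follows another, then turn the counts into probabilities. The `transitionProbabilities` function and the `history` data are invented for illustration; the history is chosen so that a cloudy day is followed by another cloudy day 80% of the time.

```typescript
// Count how often each kind of day is followed by each other kind,
// then normalise the counts into probabilities.
function transitionProbabilities(days: string[]): Record<string, Record<string, number>> {
  const counts: Record<string, Record<string, number>> = {};
  for (let i = 0; i + 1 < days.length; i++) {
    const today = days[i] as string;
    const tomorrow = days[i + 1] as string;
    let row = counts[today];
    if (row === undefined) {
      row = {};
      counts[today] = row;
    }
    row[tomorrow] = (row[tomorrow] || 0) + 1;
  }

  const probabilities: Record<string, Record<string, number>> = {};
  Object.keys(counts).forEach((today) => {
    const row: Record<string, number> = counts[today] || {};
    const total = Object.keys(row).reduce((sum, next) => sum + (row[next] || 0), 0);
    const normalised: Record<string, number> = {};
    Object.keys(row).forEach((next) => {
      normalised[next] = (row[next] || 0) / total;
    });
    probabilities[today] = normalised;
  });
  return probabilities;
}

// Made-up history: five cloudy days followed by a sunny one.
const history: string[] = ["cloudy", "cloudy", "cloudy", "cloudy", "cloudy", "sunny"];
const forecast = transitionProbabilities(history);

// After a cloudy day: cloudy 0.8, sunny 0.2
console.log(forecast["cloudy"]);
```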

Translating this to a website, you can look at user statistics to find the most common sequences. On a restaurant website you might find 50% of users go to the menu.

Google Analytics gives you a navigation summary that is useful for this, but Divya finds raw data more useful than pre-processed data, where people have assumed they know what you wanted.

You can consider not just the pages people visited, but also where people exited and didn’t load another page at all.
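Applied to page navigation, the same counting approach might look like the sketch below. The `Transition` shape, the `"(exit)"` marker and the session counts are all assumptions made for this example, not the format of any particular analytics export. Exits stay in the denominator, so a page that many people leave from produces lower certainty for every next page.

```typescript
// One aggregated row of navigation data: how many sessions moved from
// `from` to `to`. The "(exit)" marker is a convention of this sketch for
// sessions that ended on `from` without loading another page.
interface Transition {
  from: string;
  to: string;
  sessions: number;
}

const EXIT = "(exit)";

// For the current page, return the share of sessions that went on to each
// next page. Exits count toward the total but are never prefetch candidates.
function nextPageCertainty(rows: Transition[], current: string): Record<string, number> {
  const fromHere = rows.filter((r) => r.from === current);
  const total = fromHere.reduce((sum, r) => sum + r.sessions, 0);
  const certainty: Record<string, number> = {};
  if (total === 0) {
    return certainty;
  }
  fromHere.forEach((r) => {
    if (r.to !== EXIT) {
      certainty[r.to] = (certainty[r.to] || 0) + r.sessions / total;
    }
  });
  return certainty;
}

// Invented numbers for a restaurant site's home page.
const rows: Transition[] = [
  { from: "/", to: "/menu/", sessions: 50 },
  { from: "/", to: "/contact/", sessions: 20 },
  { from: "/", to: EXIT, sessions: 30 },
];

const fromHome = nextPageCertainty(rows, "/");
// "/menu/" gets 0.5 and "/contact/" gets 0.2; the 30 exits are not listed.
console.log(fromHome);
```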

So how to integrate this with your page? Build automation – Divya has it set up to query GA during the build and update the values in 11ty. It’s better than permanently hard-coding them.

Blog post: The subtle art of predictive prefetching

The result is a JSON asset that gives a set of paths and certainty values. So the prefetching can be set according to certainty thresholds.
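A minimal sketch of how such an asset could be consumed follows. The `Predictions` shape, the paths and the certainty values are hypothetical; the actual format produced by the build step may differ.

```typescript
// Assumed shape of the build-time asset: for each page, the likely next
// paths and the certainty of each prediction (0 to 1).
type Predictions = Record<string, Record<string, number>>;

const predictions: Predictions = {
  "/": { "/menu/": 0.5, "/contact/": 0.2 },
  "/menu/": { "/reservations/": 0.4 },
};

// Return the paths worth prefetching from `current`, most certain first,
// keeping only those that meet the certainty threshold.
function pathsToPrefetch(data: Predictions, current: string, threshold: number): string[] {
  const candidates: Record<string, number> = data[current] || {};
  return Object.keys(candidates)
    .filter((path) => (candidates[path] || 0) >= threshold)
    .sort((a, b) => (candidates[b] || 0) - (candidates[a] || 0));
}

console.log(pathsToPrefetch(predictions, "/", 0.3)); // only "/menu/" qualifies
console.log(pathsToPrefetch(predictions, "/", 0.1)); // "/menu/" first, then "/contact/"
```

Keeping the threshold as a parameter means the same asset can drive cautious or aggressive prefetching without a rebuild.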

(Demo showing that menu is served from prefetched cache.)

This isn’t a new concept. It was pulled from Guess.js, a project by the Google Chrome team that makes a lot of this much easier to implement.

This is still a naive implementation with some hard-coded paths. It would be better to have more data, to make better predictions.

  • Thinking of the weather example, a month’s data will give different results than just a week’s data.
  • Cookie-based tracking can enable predictions customised to a specific user’s habits.
  • Looking at more levels of navigation will reveal more detailed patterns.

It’s important to note with cookie-based tracking you might run afoul of things like the GDPR. You will need to ensure you handle all the required opt-ins and so on.

But if it is an option, you can shift from compile-time predictions to real-time predictions, which will be a better experience.

Another lens to help decisions is the user’s connection, including disabling prefetch if people are on data saver.

Bandwidth    Threshold    Recommended
Slow 2G      0.9          DNS/Link Prefetch
2G           0.9          DNS/Link Prefetch
3G           0.5          DNS/Link Prefetch
4G           0.2          Link Prefetch
Data Saver   0            None
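One way to express this table in code is sketched below. The `ConnectionInfo` interface mirrors the `effectiveType` and `saveData` fields of the browser's Network Information API (`navigator.connection`), which is not available in every browser; the `prefetchThreshold` helper is hypothetical, and treating Data Saver as "skip prefetching entirely" is this sketch's reading of the table.

```typescript
// Subset of the Network Information API fields used here. In a browser
// these would be read from `navigator.connection`, where supported.
interface ConnectionInfo {
  effectiveType: "slow-2g" | "2g" | "3g" | "4g";
  saveData: boolean;
}

// Minimum certainty per connection type, taken from the table above.
const THRESHOLDS: Record<ConnectionInfo["effectiveType"], number> = {
  "slow-2g": 0.9,
  "2g": 0.9,
  "3g": 0.5,
  "4g": 0.2,
};

// Return the certainty threshold to apply, or null when prefetching should
// be skipped because the user has asked to save data.
function prefetchThreshold(connection: ConnectionInfo): number | null {
  if (connection.saveData) {
    return null;
  }
  return THRESHOLDS[connection.effectiveType];
}

console.log(prefetchThreshold({ effectiveType: "3g", saveData: false })); // 0.5
console.log(prefetchThreshold({ effectiveType: "4g", saveData: true }));  // null
```

A slower connection demands a more certain prediction before any bandwidth is spent on it, while a fast connection can afford to act on weaker guesses.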

Referring back to the cost table from earlier also provides good guidelines.

@shortdiv