Deep dive into third-party performance

Simon Hearne at performance.now() 2019

Transcript

(audience claps) - [Tammy] Thank you.

- All right.

I used to think I had two mortal fears.

One of them is speaking at a conference directly after Tim Kadlec, fear faced.

The other is going on stage without any shoes on, no that's not a fear.

The other fear is thalassophobia. Anyone know what thalassophobia is? It's the fear of deep open water.

So I was originally going to make this talk a themed talk going deeper into water, looking at different fishes, guppies, squids. But as I'm now facing my third, new found fear, of speaking at a conference with a terrible head cold, I think I'm gonna leave the nautical theme here. If I do collapse on stage, don't be too surprised, but please do call a doctor. (audience laughs)

So let's talk about third parties.

We've been teed up quite well in this conference, there's been a lot of discussion around third parties at a very kind of small level, touching the surface, another nautical pun, yes.

But what I want to talk about is third parties as content that's served from a domain or origin that's outside of your control.

That's how I would define a third party.

You'll have them on your sites.

You'll have analytics, advertising, optimisation, tracking, all sorts of weird and wonderful stuff.

Tag managers that load tag managers, because why not. But this is what I'm focusing on.

Again, a bit like Tim, I looked at the Almanac and then copied the query and then ran it for October's data. One of the queries I used in this talk cost me 50 euros. I'll leave it as an experiment for you to work out which one it was.

But the median website has 37% third party requests. That's not terribly surprising.

When you see websites that have multiple hundreds of third party requests that gets a bit more exciting. So Pat mentioned earlier, cnn.com, so I quickly updated my slides to include edition.cnn.com. What you can see down here is 191 requests. Now as Harry knows, it's almost impossible to screenshot a waterfall that long, so I didn't bother. If you can see the scroll bar, oh you can, so the scroll bar up there, gives you an indication of how far down this goes. That's a single run test.

And I want you to note the first CPU idle time. So down here we've got 19 seconds first CPU idle. Who thinks that's a good time? (audience laughs) No publishers in the house, good.

So one thing that you will have noticed relatively recently, in the last year or so, is RequestMap.

I wrote it six years ago.

It was a Christmas break, I had a day where I didn't have any work to do, so I wrote this horrible hack that looks good, but it depends entirely on the data that we get from WebPageTest, and it sticks them together in a diagram.

So this is cnn.com's RequestMap.

A pro tip is that little cog icon up there, lets you change the theme.

I'm very much a dark theme leet haxor, so I have to have it in dark.

But what we'll notice is these little clusters. And those clusters are tags calling other tags, and that makes it really tricky to manage as a software engineer, front end developer, delivery, SRE. And we've been talking about this for quite a while. Andy and I were just discussing that we gave a talk on third party performance at Velocity Barcelona in 2014. Five years ago.

And back then it felt kind of novel.

Don't worry about TTFB, that's done, Steve's written that book.

Don't worry about JavaScript it can't get any bigger, surely.

(audience laughs) Third parties are where we should focus.

Or at least should be part of our focus.

And it doesn't feel like much has changed, if you look at the stats over time, we're including more third party content.

That query failed to run because it took so long. But it can feel a bit like us versus them.

And it's something that Tammy mentioned right at the beginning of the conference, we act like police sometimes, as performance engineers, consultants, front end engineers, and we tell people off rather than providing better solutions.

And us and them in this case are people that are writing code on the website versus people who are adding tags on the website.

Now there's a reason that tag managers were invented. It's to get marketing tags, that business process, get the engineers outside of that critical process. So a tag manager allows us to stick stuff on the website outside of a release.

Don't need to ship new code, just ship a new tag. But what that kind of means is that as everyone who works in a web product shares ownership of security and performance, we're giving people in non technical backgrounds access to technical stuff. And that's not necessarily a bad thing, but people have to know that they have this responsibility. We don't have good tooling in place to help them. So you are here, which means you have at least a passing interest in web perf.

But a question for the audience, how many of you touch code as part of your job? Front end engineer, developer, whoa, 99%? I hesitate to ask, anyone here not touch code as part of their job? One, two, three, four, five.

You're the them, welcome.

(audience laughs) But we seem to have split the industry a bit, which isn't necessarily a good thing.

You're here, you care about performance.

What I want to do today is to give you tools, experience, to help improve the way we manage third parties. Because we kind of act like this.

So you may be curious one day, you open dev tools on your company website, your personal blog, probably not your personal blog, and see a console error. You see a 401, a 404 from a resource on a domain that you've never seen before.

And it's another one? And then you kind of have two options.

You either go into the tag management, the source code, whatever, and try to work out what's going on. Or you raise a ticket in a queue that has a backlog that could get from here to London, and call it a day, move on, 'coz it's not your responsibility. Anyone here done that before? Something's broken on the site, we'll just raise a ticket. (indistinct comment from audience) (laughs) Maybe to do with third parties would be good for this talk.

What's nice for me, and great for an industry, is GDPR, CCPA.

I will not say the acronym for the Personal Information Security Standard that China has. But these regulations are not only protecting us as individuals on the web.

They're also helping us as developers deliver better experiences.

Unless you're an American website who doesn't want to show news in the EU, so you just say sorry, we're not showing news in the, we will one day. But GDPR's been in force for what, a year? More than a year, it was April wasn't it? And there are still websites that will not render content in the EU.

Which is better than the ones that do and track you illegally, but that's another story. So if you look at cnn.com from the Netherlands, which is in the EU, unlike me next year.

(audience laughs) Something you can't see is the bottom of the waterfall, but it's about here.

The number of requests is 58, which is still, you know, big. It's a 1.8 megabyte page.

We have lots of web fonts with some strange prioritization going on.

But the first CPU idle is 4.6 seconds, compared to 18. So those third party tracking scripts, and we know they're tracking, because they've been blocked in the EU, or they're not being loaded in the EU until you click the little button that says you cannot view this site until you opt in, which is illegal. They're the ones that are consuming 14 seconds of CPU time. And this is on a two virtual CPU, four gigabyte Ubuntu server in Amsterdam, which has pretty good connectivity, reasonably, Amsterdam's quite well connected, and has a massive CPU, you know, throttled down and run on a browser, but it's a server CPU. If you run this on a web build device it gets slower. But when you do the RequestMap it's such a more joyful experience.

You can actually see the names of the hosts, and you can see they do have the cdn.cookielaw third party script that manages the opt in to other third party scripts. And remember that that's a third party cookie script. We'll come to that later.

So what we've come to now is that tag managers are enablers. And I mean enablers in the good way, that they enable non technical people to ship important stuff to browsers. We need to know how many visitors we have, we need to know whether they're spending money, we need to know whether they're having a good experience. But I also mean enablers in a bad way.

We've given non technical people access to our delivery. And what I want to do today is enable you with a super power.

Who doesn't want a super power? As humans one of our most valuable things is experience. That's why we tend to pay older people more money if they stay in the same role.

Senior means two things.

So what I want to do is share experiences that I've had, that our customers have had, that Andy's had, to give you the forewarning, the knowledge to see that if you see something that looks a bit strange, you'll have the curiosity to dig in a bit deeper. And maybe it'll be a similar issue to something other people have seen before. So with that, let's talk about a 30 second wait. Who here would wait 30 seconds for a page to become interactive? If you had to load your, well if you're doing performance testing that's one.

If you had to get your boarding pass because you're about to miss your flight, then you would wait 30 seconds. This was a UK retailer who had seen a lot of their purchases online being sent to drop boxes. Not the software company, but shipping places. So people in other countries who couldn't buy online from this retailer, were buying online, shipping it to a third party who would then ship it to them. And this is obviously a massive hole because why don't you sell direct? So they launched in China.

China is a massive market if you can be successful there that's obviously a good thing.

As long as you don't fall foul of PISS.

(audience laughs) So in this process they localized.

It's a lot of money.

They did dev testing in the UK and then they did QA in China and they hired an external agency to do it. The problem was the app was totally unresponsive for around 30 seconds. And they discovered this after it was launched. Any ideas why that might have been? (indistinct chatter from audience) Sorry? You can't all talk at once, take turns.

(audience laughs) - [Audience] Firewall of China.

- Great Firewall of China.

I did have a picture of the Great Wall of China, but I thought that was too on the nose.

Yeah, the problem was it's relatively obscure. So a very simplified version of their waterfall, they had the HTML, CSS, some images, a web font, woohoo! And then they had this social script, which is on a third party domain.

And in dev that script doesn't do anything. In prod it loads some other stuff like Facebook's JavaScript script.

And we know that Facebook and Google Analytics, and a few other third parties are totally blocked in China. But in QA this wasn't a problem, because they were on a VPN. Why would you test in QA on a VPN back to the UK? (audience laughs) The challenge of course is when you have a really hooky solution to instrument your app which relies on a document ready event to attach event listeners to your buttons.

Which is fine in testing until your DNS starts to get really slow.

And after a while the browser will give up, but that time varies between 20 seconds and too long to wait. So the app had actually rendered, but you could not do anything.

Which has gotta be the most frustrating experience. I don't mind looking at a blank screen for a few seconds, but seeing it and not being allowed to interact with it? That's awful.

But we know that async attributes on scripts are fantastic and solve all of our problems. Anyone here not believe that? Yay! But we have defer, and that really does fix everything, and then you put it at the bottom of the script, and then you put it in comments, and then you put it in a little bit of script, and then you inject it with another script. The problem is in the world wide web we still have more sync scripts than async for third party. And not least because async doesn't actually fix these problems.

This is the arithmetic mean, and someone said earlier that we should never use the mean. That was Harry wasn't it? So let's just change that to p50.

The median number of sync scripts on the HTTP Archive, so 4.5 million pages, is eight.

Eight sync scripts, and at the 95th percentile, too many, right.

So to fix this, the defense that they put in place, was a feature flag for China.

So they just don't add that social script.

That's the right way to do it, they should have done it in the first place. But they shouldn't be blocking on onload.
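The feature flag described here can be sketched as a simple region filter applied before any tags are written into the page. This is only an illustration: the `ThirdPartyTag` shape, the `blockedRegions` field, and the example hosts are assumptions, not the retailer's actual implementation.

```typescript
// Hypothetical tag descriptor; the field names are assumptions for illustration.
interface ThirdPartyTag {
  name: string;
  src: string;
  // Country codes where this tag must not be added to the page.
  blockedRegions: string[];
}

// Return only the tags that should be added for a visitor in the given region.
function tagsForRegion(tags: ThirdPartyTag[], region: string): ThirdPartyTag[] {
  const normalized = region.toUpperCase();
  return tags.filter((tag) => tag.blockedRegions.indexOf(normalized) === -1);
}

// Example data with made-up hosts: the social script is flagged off for China.
const exampleTags: ThirdPartyTag[] = [
  { name: "social", src: "https://social.example/widget.js", blockedRegions: ["CN"] },
  { name: "reviews", src: "https://reviews.example/embed.js", blockedRegions: [] },
];
```

Calling `tagsForRegion(exampleTags, "cn")` leaves only the reviews tag, so the blocked social script is never requested for that market.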

I think Pat coined the term onload spuff.

So you have all these event listeners that listen for the document ready, the onload domInteractive events. If you have third parties that load before those events they can push them out. And you use defer, every loading strategy is different for every website, but defer is a good way to get your scripts out of the critical path. Especially if they're third party.

But this is a nav timing API.

I like the fact that a nav timing API is navigation up to there, and then it gives us processing information. Has anyone seen the navigation timing API before? Woohoo, performance conference.

So if we look kind of around here, these are our interesting things.

So domLoading, we're starting to get content to build a page, loadEventEnd, all of our async scripts have finished loading, the page is complete. Where do you think a deferred script gets injected into the page or starts loading? Right at the end? To me defer means don't worry about it.

Do it when you can.

It actually means as soon as domInteractive event fires. I've done all my blocking activity, we're gonna start running our deferred JavaScript now. So any event after that can be pushed back, which is weird. So tag managers help somewhat.
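One way to see how far deferred scripts push later events is to split the navigation timing marks into phases. Deferred scripts run after `domInteractive` and before `DOMContentLoaded` fires, so the gap between those two marks is a rough upper bound on their cost. The sketch below assumes a plain object carrying the relevant fields; the interface and function names are made up for illustration.

```typescript
// Subset of the navigation timing fields used here, all in milliseconds.
interface NavTimingSubset {
  domInteractive: number;
  domContentLoadedEventStart: number;
  domContentLoadedEventEnd: number;
  loadEventStart: number;
}

interface ReadinessBreakdown {
  // Parsing finished until DOMContentLoaded starts: where deferred scripts execute.
  deferredScriptMs: number;
  // Time spent inside DOMContentLoaded (document ready) handlers.
  domReadyHandlersMs: number;
  // Remaining time until the load event starts.
  untilLoadMs: number;
}

function readinessBreakdown(t: NavTimingSubset): ReadinessBreakdown {
  return {
    deferredScriptMs: Math.max(0, t.domContentLoadedEventStart - t.domInteractive),
    domReadyHandlersMs: Math.max(0, t.domContentLoadedEventEnd - t.domContentLoadedEventStart),
    untilLoadMs: Math.max(0, t.loadEventStart - t.domContentLoadedEventEnd),
  };
}
```

In a browser, the entry returned by `performance.getEntriesByType("navigation")[0]` carries these fields and could be passed in directly.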

The default trigger for Google Tag Manager is Page View. Anyone know when Page View fires? As soon as GTM loads.

Because that's the important thing to catch metrics and stuff.

domReady is one of those events that will get pushed out. Window Loaded is after the onload event.

So if you have a tag that doesn't need to be executed in the critical path use Window Loaded.

And all tag managers should have this kind of capability. If not, build your own.

So after discussion with Andy last night I think it's important to be able to categorize your tags into where they should load in the user experience. Immediate stuff, who here has some kind of AB testing framework? Optimizely, Maxymiser, excellent.

Who here has deferred that script? Oo, some people who aren't quite sure.

So the point is it blocks render, right? It blocks render, says I'm going to give this user a green button, not a red button, download some experimentation stuff, and then renders the page. Now the challenge with that of course is if it's slow the page will not render.

So that's important that we get it in as soon as possible. Whereas ads, if you speak to someone in ad ops, they'll say ads have to load first, they're more important than content, no one cares about the article. Just get an ad up as soon as possible.

Thank you very much.

If you speak to anyone else in the world they'll say, I don't care about ads, I'm not gonna look at them anyway, so put them late.

And somewhere in the middle is the session tracking, the session cam, the visualization stuff, the performance monitoring. I come from a performance monitoring company, and I like to be somewhere early in that page view so we get data. But anything that requires user experience to have completed, shove it late.
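The categorization described above could be written down as a small lookup that a tag manager or a home-grown loader consults. This is a sketch under assumptions: the category strings and phase names are invented here, and placing advertising in the late phase reflects only one side of the argument.

```typescript
type LoadPhase = "immediate" | "early" | "late";

// Hypothetical tag record; category strings are illustrative only.
interface CategorizedTag {
  name: string;
  category: string;
}

function phaseForCategory(category: string): LoadPhase {
  // AB testing changes what is rendered, so it has to run before render.
  if (category === "ab-testing") return "immediate";
  // Monitoring and session tracking want to be early enough to capture the page view.
  if (category === "performance-monitoring" || category === "session-tracking") return "early";
  // Everything else, including ads in this sketch, waits until the page has loaded.
  return "late";
}

function groupByPhase(tags: CategorizedTag[]): { immediate: string[]; early: string[]; late: string[] } {
  const plan: { immediate: string[]; early: string[]; late: string[] } = {
    immediate: [],
    early: [],
    late: [],
  };
  tags.forEach((tag) => {
    plan[phaseForCategory(tag.category)].push(tag.name);
  });
  return plan;
}
```

In Google Tag Manager terms, the late group would map onto the Window Loaded trigger mentioned earlier; how the other two groups are wired up depends on the site.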

Okay, that was about performance, they hadn't launched yet, when they launched they didn't know how much money they were going to make, so they didn't know how much they'd lost. This incident they actually lost money.

And the problem was a conversion rate drop of 30% overnight. This is quite significant when there'd been no release, no changes, no product updates.

So there was a bit of a war room.

And the discovery was that a third party script, which was loaded async, yay.

It didn't have a version number in it, and this third party had a different release schedule from every single one of its customers, as you'd imagine. It was loaded on tens of millions of sites. Now rather innocuously they pushed an update which added another piece of jQuery into the bundle. Unfortunately the customer had hacked together some horrible patch which shared the function name. And that event listener, that function was the event listener for their quick checkout button. So when this script updated they broke quick checkout. And most customers didn't say, oh quick checkout's not working, I'll try the normal checkout button, they went away.

So sometimes this revenue's recovered, sometimes you've lost the moment.

But that's quite a common occurrence, scripts updating without letting you know.

So I set this up yesterday to track a customer who I know has a lot of third party scripts. This is an mPulse Dashboard, it's gonna be quite hard to see.

But this little yellow line here is a spike in script errors.

And on the scale 650,000 errors in one minute. And immediately after that, it's kind of like the lights go really bright before the electricity cuts out, they lose almost all their traffic.

It just drops off a cliff, and the load time that we're measuring from that small amount of traffic goes all over the place.

And this was a third party update.

It actually broke multiple third party scripts. So having visibility, like Emily said yesterday, you can't manage what you don't measure.

This kind of observability is extremely important. 'Coz it would have saved a lot of time of them in a board room trying to work out why they'd lost 30%. Anyone here used Subresource Integrity? Ah cool, maybe 10% of the room.

So Subresource Integrity, if you have to load a third party asset from a third party domain, and it's important that you know when it changes, the integrity attribute allows us to take a hash of a file as we know it, as we've tested with it, and put that in the script tag itself.

So when the browser loads that file it'll say, okay, my hash against this file matches, I'm going to let this through. If it doesn't match it won't execute and you get a console error.
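As a rough illustration, a script tag carrying an integrity attribute could be generated like this. The digest is passed in rather than computed: it has to come from hashing the exact file bytes you tested against. The function names and the example URL are assumptions; the `crossorigin="anonymous"` attribute is included because browsers need a CORS-enabled fetch to check the integrity of a cross-origin script.

```typescript
type SriAlgorithm = "sha256" | "sha384" | "sha512";

// Minimal HTML attribute escaping for the values interpolated below.
function escapeAttribute(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Build a script tag that the browser will refuse to execute if the fetched
// file no longer matches the given base64 digest.
function scriptTagWithIntegrity(src: string, algorithm: SriAlgorithm, base64Digest: string): string {
  const integrity = algorithm + "-" + base64Digest;
  return (
    '<script src="' + escapeAttribute(src) +
    '" integrity="' + escapeAttribute(integrity) +
    '" crossorigin="anonymous"></script>'
  );
}
```

A digest can be produced on the command line, for example with `openssl dgst -sha384 -binary file.js | openssl base64 -A`, and must be regenerated whenever you deliberately accept a new version of the file.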

You can also report that through content security policy. 5.15% of pages have at least one integrity attribute. It's a weird one, there's a discussion on Slack this morning, or yesterday, about whether Subresource Integrity, there's any point to it. Because if you're loading a critical third party dependency, and it changes, you'll stop loading it, which means you'll lose money, you'll lose insight, something will go wrong. If it's jQuery from a CDN, and you haven't got a backup, your site will stop working.

And this is, Barry's in the room I think, this is an idea to go back to the fallback. So Facebook gives you a fallback image pixel. Any tracking stuff always has a fallback for when scripts don't work.

They collect slightly less data, but the impact is so much less.

Loading an image off the network is almost free if it's hidden, and it's late.

As opposed to arbitrarily executing third party JavaScript. So if it's possible, and it's worth trying this out, you could switch your third party JavaScript into image tags. Just like 1999, woohoo! I put links at the bottom right hand corner when I reference someone.
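A fallback pixel is essentially a URL with the tracking data in the query string, requested through a hidden image. The sketch below builds such a URL and the tag for it. The endpoint and parameter names are hypothetical; a real vendor documents its own pixel format.

```typescript
// Build a pixel URL from an endpoint and a flat set of string parameters.
// Keys are sorted so the output is stable.
function pixelUrl(endpoint: string, params: { [key: string]: string }): string {
  const pairs = Object.keys(params)
    .sort()
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(params[key])));
  return pairs.length > 0 ? endpoint + "?" + pairs.join("&") : endpoint;
}

// Wrap the URL in a hidden 1x1 image tag. The image is still fetched,
// but no third party JavaScript is executed.
function pixelTag(url: string): string {
  const safeUrl = url.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return '<img src="' + safeUrl + '" width="1" height="1" alt="" style="display:none">';
}
```

Placing the generated tag late in the document keeps the request out of the critical path.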

The slides are online and I tweeted them, so you can go and see this now.

I also put my speaker notes in there for the first time ever, which is kind of revealing. (audience laughs) Now this is an embarrassing one.

A performance tool. (audience member laughs)

Are you laughing at the embarrassing bit, or the fact I said tool? So this performance tool was implemented and customers started complaining that the site feels laggy. How much more irritating can you get in terms of feedback, the site feels laggy. Awesome, thanks.

So we did a bit of diagnosis and if you turned on the CPU throttling, like Tim suggested, every time you clicked anything on the page that moved away you didn't get the left hand experience, you got the right. Click, and then the page would navigate.

There was this feeling of input latency, that you click on something, nothing happens, for somewhere around half a second up to two seconds. And so your immediate response is, oh well, there's time to first byte issues.

You've clicked on a link, the page doesn't load for two seconds, so it must be a two second TTFB. Where in fact it was almost instant.

Anyone know what this might be? (indistinct audience comment) The handle touch, oh the fast click issue from six years ago, the 300 millisecond double tap. It wasn't that, it was worse than that.

It was third parties attaching to the beforeunload event. So this is a click handler on the left hand side, which triggers both a beacon send for clicking on something, well done Clicktale, they really are telling the tale of a click.

And then the beforeunload event fires.

You know those websites where you browse around, but you don't actually do anything, you immediately click back and it says, don't go? You haven't read my article and I haven't got enough ad revenue from you, please stay. That's the beforeunload event.

It allows us as web developers to say to the user there's something important on the page, if you leave it'll be lost. So Google docs for example will have this kind of feature. We also use it to send data.

Because if you try to send an asynchronous XHR and the page navigates away the browser will cancel that. So for a performance analytics vendor like me it's really important we get that data.

So we send it synchronously or using the sendBeacon API, which is now pretty well supported.
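The pattern of preferring a beacon and only falling back when it is unavailable might look like the sketch below. The navigator object is passed in so the logic can be exercised outside a browser. `sendOnExit`, `BeaconCapable`, and the fallback callback are hypothetical names, not any vendor's code.

```typescript
// Minimal view of the navigator object: only the method used here.
interface BeaconCapable {
  sendBeacon?: (url: string, data: string) => boolean;
}

// Whatever the page uses when beacons are unavailable, such as a synchronous request.
type FallbackSender = (url: string, body: string) => void;

// Try to queue the payload with sendBeacon, which does not block navigation.
// sendBeacon returns false when the browser could not queue the data.
function sendOnExit(
  nav: BeaconCapable,
  url: string,
  payload: object,
  fallback: FallbackSender
): "beacon" | "fallback" {
  const body = JSON.stringify(payload);
  if (typeof nav.sendBeacon === "function" && nav.sendBeacon(url, body)) {
    return "beacon";
  }
  fallback(url, body);
  return "fallback";
}
```

In a page this would be called as `sendOnExit(navigator, "/collect", data, syncSend)`, where `/collect` and `syncSend` stand in for your own endpoint and fallback. Keeping the fallback path rare matters, because a synchronous send is exactly the kind of work that delays navigation.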

In this case that was a 1.6 second delay.

So from the beginning of the click, in fact that's the mouse up, so I'd already clicked, doing all these events. Lovely flame chart.

To the actual navigation starting, which is that blue dot, which as I said was really quick took 1.6 seconds, in this case 4x throttled.

Now this is really pernicious.

It's really hard to measure.

Anyone track the unload event apart from Honeycomb, 'coz Emily spoke about it yesterday.

Two people, you don't work for Honeycomb do you? Excellent.

So the unload event is something we get in the navigation timing API, it was right on the left hand side. Unfortunately we get it from the previous navigation, so the navigation timing API data for that 1.6 second delay would be in this page. But you know, we can deal with it, we can track it over time.

I looked at the 95th percentile data for another customer, this is a publisher.

And I like the 95th percentile 'coz people can kind of comprehend a five percent. 5% seems like a nice number, so if you say 5% conversion people understand it, 5% bounce rate, people understand it and don't believe it.

5% of users have an experience worse than this. What I find interesting is that we will more often than not see iOS being much faster than Android.

More powerful devices, closed ecosystem, but if you compare iOS to Android in terms of the unload time, iOS is at 800 milliseconds.

This is a live production website, I took this data yesterday.

I did let them know before I used it.

But what that tells me is there's something going on in that unload handler which is really bad on Safari. And if you drill down to browser it is Safari that jumps out.

And if you don't track it you won't know it until a customer complains.

We can't rely on synthetic tests for this, it's very hard to track this synthetically. So we should track unload duration.

Simple, problem solved.
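Tracking unload duration comes down to subtracting two navigation timing marks and aggregating per platform. The sketch below uses the nearest-rank method for the 95th percentile. The sample shape and function names are assumptions, and in the browser these marks describe the previous document's unload, as mentioned above.

```typescript
// One collected sample: platform label plus the two unload marks in milliseconds.
interface UnloadSample {
  platform: string;
  unloadEventStart: number;
  unloadEventEnd: number;
}

function unloadDuration(sample: UnloadSample): number {
  return Math.max(0, sample.unloadEventEnd - sample.unloadEventStart);
}

// Nearest-rank percentile: the smallest value with at least p percent of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index] as number;
}

// Group unload durations by platform and report the 95th percentile of each group.
function p95UnloadByPlatform(samples: UnloadSample[]): { [platform: string]: number } {
  const byPlatform: { [platform: string]: number[] } = {};
  samples.forEach((sample) => {
    const bucket = byPlatform[sample.platform] || [];
    bucket.push(unloadDuration(sample));
    byPlatform[sample.platform] = bucket;
  });
  const result: { [platform: string]: number } = {};
  Object.keys(byPlatform).forEach((platform) => {
    result[platform] = percentile(byPlatform[platform] || [], 95);
  });
  return result;
}
```

Comparing the per-platform numbers side by side is what makes an outlier like the Safari case stand out.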

But actually we kind of need to track everything, 'coz it may not be unload next time.

And one of my favorite quotes of all time is from Etsy. If it moves, we track it.

Sometimes we draw a graph of something that isn't moving yet, just in case it decides to make a run for it. (audience laughs) Okay, number four, this is a review provider, this is a talk that Andy has given on a client that we shared. And it's an interesting one.

The challenge was that Android was twice as slow as iOS, and iOS was already pretty slow at six seconds median page load time.

Now this is a confusing chart, we're not showing the distribution of traffic, we're showing the average session length across, or the height is the average session length, across the x axis, which is the average page load. What I love about this is that even though iOS users get much faster experience at the median, they consume less content on the site than Android users. So the peak of the blue line is higher than the black. So Android users like me are cool, and we're more patient and forgiving, and we like to hang around. So a bit of analysis was conducted, and we found a 1.16 second script execution. There's a lot going on in this timeline.

This third party wasn't responsible for all of it, they had many issues.

And what I love about this, especially after Tim's talk, I think, oh you can't see it here.

Oh yeah, there we go.

jQuery, Lodash, it's all bundled into that common static assets bundle.

So this is quite a big bit of JavaScript to ship as a third party.

It's also quite a big bit of JavaScript to ship to an Android device to parse, compile and execute. This script gave ratings and reviews.

Customers love ratings and reviews.

Customers love ratings and reviews when they're from an independent third party.

Which is how this company makes its money.

So what do you do to fix it? Turn it off.

Take off a script.

You lose the ratings and reviews for those users, but the page load time drops dramatically.

I'd recommend watching Andy's talk on this from Delta V last year.

But what I love about it is Android users generated 26% more revenue after the script was removed. And I trust Andy's statistics, 'coz he calculated this. It was base lined against the revenue generated by other devices in the same time period.

So taking away a feature made more money.

Who would have thought it.

And so something that React has announced recently is the ability to adaptively import things, and deliver things based on device.

You can do this using your CDN at the edge, you can do it using a tag manager.

But if you've got something like ratings and reviews, maybe you can build in some adaptive tag loading. Save-Data has been mentioned a few times already, it's an opt in on the browser, and if someone wants to save data don't automatically show them all this third party stuff, which may not add value, but let them opt in by clicking on something later. Understanding the impact I think is the most critical thing. Oftentimes when we come to look at a website everything's already there, so it's hard to understand what's happened as they've been added.
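Adaptive tag loading of this kind could be reduced to a small decision function. The sketch assumes the hints have already been read, for instance from `navigator.connection.saveData` and `navigator.deviceMemory` where the browser exposes them, or from the Save-Data request header at the edge. The names and the memory threshold are illustrative assumptions.

```typescript
// Hints gathered from the browser or from request headers.
interface ClientHints {
  saveData: boolean;
  // Approximate device memory in gigabytes; not available in every browser.
  deviceMemoryGb?: number;
}

type TagDecision = "load" | "load-on-interaction";

// Decide whether an optional tag such as ratings and reviews loads up front,
// or only after the user asks for it by clicking something.
function decideOptionalTag(hints: ClientHints, minMemoryGb: number): TagDecision {
  if (hints.saveData) return "load-on-interaction";
  if (typeof hints.deviceMemoryGb === "number" && hints.deviceMemoryGb < minMemoryGb) {
    return "load-on-interaction";
  }
  return "load";
}
```

When the memory hint is missing, this version loads the tag; treating unknown devices conservatively instead is an equally defensible choice.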

But Harry's talk from last year goes into a lot of detail about how you can block things in WebPageTest, remove them from the site, and estimate the impact that they're having. And it's been said at the conference already, third parties actually like hearing this stuff. In general, apart from a couple, they want to get better. Font foundries notwithstanding.

But for example, Optimizely we've worked with quite extensively to try to help them improve the way they deliver their content.

And we'll talk about that shortly.

Incident five, who loves a bit of malvertising? So advertising networks are kinda like tag managers. You put one thing on your site and it makes calls out to multiple different third parties.

And it's even more complex than tag manager, because those third parties change based on the user profile, the cookies, opt in status, location, what kind of ad you want to show to that user. So if you wanted to get some malicious code in front of a lot of people an ad network is a good place to infect. And this happened, oh that's what it looks like, obviously. That's the image from malvertising from Wikipedia. I quite like it.

Content security policy I mentioned briefly, we'll go into a bit more detail.

But it says to the website only load code from these domains.

Only execute code from these domains.

Only load fonts from these domains.

So if someone tries to do something malicious, block it and let me know.

This is another mPulse Dashboard, mPulse is pretty configurable, you can throw anything into it. And this is content security policy reports. This is what the header looks like.

It's a response header, and there are tools to help you build it, but basically it says, don't allow inline scripts by default, you can opt in to inline scripts.

And which domains you want to have different things happen from.

I've put the table here, oo, I haven't put the table there. Currently on 6% of pages, which I think is small. The risk with CSP is if you have it in block mode and something goes wrong, you break your website. Just like service workers.

It has to be tested well.

And it means you can't have really dynamic third parties. Which isn't necessarily a bad thing.

Here's the table for reference.

You can control almost everything that you can do in HTML and JavaScript with a content security policy. There's also the content security policy report only header, it's a bit of a mouthful.

But that means it won't block, but you can send data back to an end point as a nice little JSON blob to say I would have blocked this 'coz it doesn't match your CSP. So you can start with it.
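To make the shape of the header concrete, here is a sketch that assembles a policy from a map of directives and chooses between the enforcing and report-only header names. The directive values and the reporting endpoint are made up. `report-uri` is used for brevity, although newer browsers also support a `report-to` mechanism.

```typescript
// Directive name mapped to its list of allowed sources.
interface CspDirectives {
  [directive: string]: string[];
}

// Build the header name and value. With reportOnly set, violations are
// reported to the endpoint in the policy but nothing is blocked.
function buildCsp(directives: CspDirectives, reportOnly: boolean): { name: string; value: string } {
  const parts = Object.keys(directives).map((directive) => {
    const sources = directives[directive] || [];
    return sources.length > 0 ? directive + " " + sources.join(" ") : directive;
  });
  return {
    name: reportOnly ? "Content-Security-Policy-Report-Only" : "Content-Security-Policy",
    value: parts.join("; "),
  };
}

// Example policy with hypothetical hosts.
const examplePolicy: CspDirectives = {
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://tags.example"],
  "font-src": ["'self'", "https://fonts.example"],
  "report-uri": ["https://csp-reports.example/collect"],
};
```

Starting with `buildCsp(examplePolicy, true)` and watching the reports for a while gives you a list of what would break before you switch to the enforcing header.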

Oh, what a great story.

I think this story broke about a week before I gave a third party talk and I thought it was a blessing. This is the Monero icon, which is the more dodgy version of Bitcoin, if there was such a thing.

The problem was that people's CPU got pegged because a Cryptojacking script had been injected into their page, into the page they were viewing, and it's using their CPU to mine Bitcoin for a third party. My favorite thing about this is infections lasted for weeks, and they made like $6. (audience laughs) It turns out trying to run stuff on JavaScript is not great. I think some of them are using WASM now, but when they launched it was all pure JavaScript so it was slow as hell. And of course it's kind of tricky, 'coz again customers might give you feedback that the site feels laggy, because the CPU is pegged.

But they navigate away and it all gets fixed. The worst thing was the discovery was not responsibly disclosed I don't think.

I think the ICO found out about this through Twitter. For those that don't know the ICO, the Information Commissioner's Office, is the department in the UK responsible for enforcing GDPR.

(audience laughs) So the irony is very strong with this one.

Talking about cookie scripts here is cookiescript.info infected with CryptoLoot. It blows my mind.

So this is a cookie opt in script that you put on your site so people can click okay to cookies, and then it makes you compliant with GDPR, with a CryptoMiner built in.

This was a compromise, it wasn't there on purpose. CryptoMiner use, now this is absolute stats, and the size of the HTTP Archive changes over time, 4.5 million at the beginning of the year, so the number jumped up.

But the actual number itself, I didn't put percentages there 'coz they're so small, seems a bit dull.

0.2% of pages on the HTTP Archive have some kind of CryptoMiner detected by Wappalyzer. And I was kinda disappointed when I saw that, I was hoping to see like 10% of pages are mining Bitcoin, and throwing CPUs off the wall.

And then I discovered that they're quite clever now. Since the first stories broke, Procter & Gamble's FirstAidBeauty.com has had a payment skimmer since May the fifth.

Oh, payment skimmer, not what I thought.

Still the same applies to CryptoMiners, the malware doesn't activate for people who might be techy. It won't activate on Linux.

As a Brit I hate the fact they've used i.e. security researchers. Not only security researchers use Linux, it should've been e.g.

But people are getting better at hiding stuff, and you can do it at the edge.

Cloudflare has this great product called Workers, where you can stick code at the edge that's dynamic, based on the user agent, the experience, the page that they're going to.

So you could really cleverly inject CryptoMining scripts, just when you're pretty sure it's not WebPageTest. So people could hide from HTTP Archive, and security researchers.

The defense for this, it's kinda tough.

Subresource Integrity helps a little bit, 'coz if someone hacks the bundle and injects stuff into it, the hash will change and it will not execute. But then you won't have your cookie opt in script in that case.
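
As a rough illustration of why a tampered bundle stops matching, here is a minimal Python sketch of how an integrity value is derived: a base64-encoded digest of the file bytes, prefixed with the algorithm name. The function names and the sample script contents are made up for illustration; the browser does the real comparison against the integrity attribute.

```python
import base64
import hashlib

# Hash algorithms that Subresource Integrity accepts.
SRI_ALGORITHMS = ("sha256", "sha384", "sha512")


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in SRI_ALGORITHMS:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Check whether content still matches a previously published integrity value."""
    algorithm, _, _ = integrity.partition("-")
    return sri_hash(content, algorithm) == integrity


# Hypothetical third-party bundle, before and after someone injects a miner.
original = b"showCookieBanner();"
tampered = original + b"startMining();"

published = sri_hash(original)
print(matches(original, published))  # the untouched bundle still matches
print(matches(tampered, published))  # the injected bundle does not
```

The trade-off is the one described above: when the hash stops matching, the whole script is refused, so the legitimate functionality disappears along with the injected code.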

Content Security Policy to stop the CryptoMiner sending the data back doesn't stop the CryptoMiner executing. So I think here observability is the most important thing we can do.

As Tim said, mPulse, SpeedCurve, many other tools can track performance metrics that are based around the CPU. Whether we catch stuff in WASM is yet to be determined. 'Coz it's not on the main thread, so it's not long tasks. There's more work to be done.

This is my favorite one, this is what we started, I think, our third party talk script with five and a half years ago. Optimisation.

And before I get into it I want to quote Kristian Skold from a meet up he did.

Our goal is not to make a fast website.

He was in a performance meet up and said, who here thinks it's really important to make a fast website? Everyone's hand goes up, fantastic performance meet up. But actually the goal isn't to create a fast website. A fast website which is a blank screen will not be a successful business venture.

But it would be super fast.

We're trying to maximize business success.

And I'm not gonna beat up on AB testing.

But AB testing is slow.

This is a known cost.

AB testing can give you huge insight into which variants of your website convert more, generate more revenue, make happier customers, which is what we're really about. But it does that by blocking render, generally. This is the case with Optimizely. Orange Valley, who's from Orange Valley in the room? Yay, hi, this is one of my favorite blog posts. How AB Testing Tools Worsen Your Site Speed. The problem with companies like Optimizely is they put a blocking script on your page, and by default it's a third party domain, and it's a new connection, new DNS, TCP, TLS. And their SLA for success is a 500 millisecond response time.

Now half a second of blocking response time before the JavaScript bundle's even delivered, executed, parsed is quite significant. They've always had 100% up time.

At 500 milliseconds.

So they actively promote people proxying stuff through their domain.

You can do this with everything.

You've got cdn.optimizely.com on your page, why not set up a rule at the edge on your CDN to say /optimizely/whatever is actually a proxy through to this domain. You get to share your connection, so the experience will be much quicker.
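
A minimal Python sketch of the kind of rule this describes: a first-party path prefix that maps to the third-party origin. The real rule would live in your CDN's configuration; the rule table, the function name and the example paths here are assumptions for illustration only.

```python
from typing import Optional

# Hypothetical rule table: first-party path prefix -> third-party origin.
PROXY_RULES = {
    "/optimizely/": "https://cdn.optimizely.com/",
}


def upstream_url(path_and_query: str) -> Optional[str]:
    """Map a first-party request path to the third-party URL it proxies to.

    Returns None when no rule matches, so the request is served normally.
    """
    for prefix, origin in PROXY_RULES.items():
        if path_and_query.startswith(prefix):
            # Keep everything after the prefix, including any query string.
            return origin + path_and_query[len(prefix):]
    return None


# The file name below is made up for the example.
print(upstream_url("/optimizely/js/example.js"))
print(upstream_url("/checkout"))
```

Because the browser only ever talks to your own domain, the request reuses the connection that is already open instead of paying for a new DNS lookup, TCP handshake and TLS negotiation.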

You can also modify caching headers.

That'd be a bit naughty, but it could help. The only problem is then you break the sandbox, you're bringing third party scripts onto your domain, which means they might have access to cookies they shouldn't have access to.

So it's a fun game to try and play.

I think this is fantastic.

Not least because I work for Akamai, and Akamai's the CDN used for all the fast ones, but SiteSpect is the fastest AB testing tool. You'll notice it's the only one which is server side. So the experimentation is done at the server, rather than on the client.

Which makes a lot of sense when you think about it, but makes it less dynamic.

And it's got less flexibility.

Optimizely are clever.

They thought we could do that.

We know we're a little bit slow, so why don't we do server side testing? They haven't quite got there.

What they have done is created the Optimizely Performance Edge, which runs on Cloudflare, using Workers. And when you request the Optimizely bundle it says, well you're on a modern device, you support ES6, you're on a page which only has one experiment, so the bundle I give you will be one experiment with no polyfills. Which makes it much smaller.

Their experimentation shows that this whole process takes about 30 milliseconds at the median.

So when they get to deliver it to the customer it doesn't have all this extra weight, which saves network time, as well as processing time. They reckon, and the research hasn't been published yet, they reckon that this will mean that the overall impact of having Optimizely will be 50 milliseconds to your page load time. Sorry, to your first render.

Which is marginal, in real terms at the median. For the long tail, the experience is yet to be determined. So what can we learn from all of this? What can we use in our lives? One of my favorite quotes from Tim, of course. Everything should have a value, because everything has a cost.

If you know the cost of the script that you're adding to the site, and you know the improvement in business metrics, you can balance it out. And Harry's got a blog post that covers how to do that. I think visibility is key.

It's observability, right.

If you can't see it, you can't measure it, you can't audit it.

You can't know when it goes wrong.

And we shouldn't be discovering these things through customer feedback.

WebPageTest and RequestMap we saw.

Static testing doesn't have dynamic cookie profiles, but what it allows you to do is script a little bit. So you can build a script that consents to your GDPR opt in, so you can measure all those third parties that come in after consent. Otherwise with HTTP Archive, for example, we're testing from the States, but if you've got a European website that has GDPR opt in by default, you're seeing far fewer third parties than most customers would, because the test isn't clicking the opt in. So take HTTP Archive data with a grain of salt. Content Security Policy reports are great, report-uri.com gives you a URI to view them. You have to pay for it because it's a lot of data, but that's fantastic.

And we can measure the performance of resources in RUM, real user monitoring, in mPulse.

The problem is the visibility is quite small. And if you have cross origin iframes, as some third parties like, you can get stuck.

CSP doesn't work as you'd expect, we have attributes on iframes to sandbox them, which is good, but you don't get resource timing data.

So its visibility is low.

Determining the risk, again Harry goes into this in more detail.

I like the idea that you can correlate with RUM. Tammy showed how you can do it in SpeedCurve. Does the performance of this third party have any impact on my user experience through measurement? If no, then it doesn't matter if it goes a bit slow. If yes, then we need to work harder at it.
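
One simple way to ask that question of RUM data is a correlation between the third party's timing and a user experience metric, beacon by beacon. This is a minimal sketch assuming you can export paired values per page view; the function name and the sample numbers are made up for illustration, and this is not how SpeedCurve or mPulse compute it.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long series of measurements.

    Returns 0.0 when either series is constant, since no relationship
    can be measured in that case.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return covariance / math.sqrt(var_x * var_y)


# Made-up sample values: third-party response time and first render, in ms.
third_party_ms = [120, 300, 450, 900, 150]
first_render_ms = [1100, 1300, 1500, 2100, 1150]
print(pearson(third_party_ms, first_render_ms))
```

A value near zero suggests the third party's speed is not what your users are feeling; a value near one is a hint to dig further, though correlation on its own does not prove the script is the cause.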

And the link is in the notes for the slides, Webbkoll by Dataskydd, has anyone heard of that? Anyone know what Webbkoll means? I think it's German? (indistinct response from audience) Swedish, excellent, what does it mean? Oh, you don't speak Swedish, that's fine.

Web track? This is more around privacy, but you can put in your URL and it'll tell you all the third parties that are loading, whether they set cookies, or whether they try to access cookies, whether you have SRI set up on them. So it's a really good quick audit.

But the URL is horrible, so there's the name. Removing the unnecessary is obviously the quickest way to win.

There's a Dutch airline that had a one year performance improvement policy or program.

The biggest single improvement they made to user experience and web performance was to remove 60% of the third party tags on the site. None of those 60% of tags, and it was a decent number, in the tens, had an owner within the business. No one was actively using the data.

Some of them had no commercial relationships and were returning an empty response.

So if you know who owns it, you know what the value is, you can keep it on the site.

Adaptive loading, like we spoke about.

And if a third party offers an image tag equivalent, or a server side equivalent, why not use it? It takes it out of the critical rendering path. And immunizing, there's a few ways to do this. If it has to be on a site, proxy it through your domain. If you're not allowed to proxy it through your domain make sure if it breaks it doesn't take you down with it. And you can test it.

You can build a Chaos Monkey for your third parties by randomly black holing them in WebPageTest. And minimizing customisation.
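
A rough sketch of that third party Chaos Monkey idea: pick some third-party hosts at random and generate a WebPageTest script that points them at the black hole server before navigating. This assumes WebPageTest's tab-separated script format with the setDnsName and navigate commands and the blackhole.webpagetest.org host; the function name and the host list are hypothetical.

```python
import random
from typing import Iterable, Optional

# WebPageTest's black hole host, which accepts connections and never responds
# (assumed here; check the current WebPageTest documentation).
BLACKHOLE_HOST = "blackhole.webpagetest.org"


def chaos_script(page_url: str,
                 third_party_hosts: Iterable[str],
                 fail_count: int = 1,
                 seed: Optional[int] = None) -> str:
    """Build a WebPageTest script that black-holes randomly chosen third parties."""
    hosts = sorted(set(third_party_hosts))
    rng = random.Random(seed)  # pass a seed to get a repeatable selection
    victims = rng.sample(hosts, k=min(fail_count, len(hosts)))
    # One DNS override per victim, then load the page as normal.
    lines = [f"setDnsName\t{host}\t{BLACKHOLE_HOST}" for host in victims]
    lines.append(f"navigate\t{page_url}")
    return "\n".join(lines)


# Hypothetical list of third-party hosts found on the page.
print(chaos_script("https://www.example.com/",
                   ["tags.example.net", "ads.example.org", "ab.example.io"],
                   fail_count=2))
```

Run the generated script on a schedule and watch what happens to render times: if one unresponsive host leaves the page blank, that is the tag to proxy, defer or remove.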

One thing when we work with Optimizely or Adobe DTM, I can say this JavaScript bundle is 400 kilobytes, sort it out.

They say, well actually ours is only 80 kilobytes, the rest is the customer's experiments, 60 of which aren't running, and 100 of which are from the preprod environment so shouldn't be in the production bundle.

But they give the power to the user, and don't educate them well enough to know how to take it out of the main bundle. So if you can minimize that you improve your chances of a good experience.

Lock it down.

Barry's blog post about this, about adding controls to Google Tag Manager was a bit of an eye opener to me. So GTM has four role types.

And I think with tag managers you should treat it as an operational tool, like your CDN.

Who would you give write access to your CDN to? Ops, DevOps maybe.

Tag managers do the same thing.

Or can take down your site, change the way your site works. Which is kind of why people who aren't in tech like to use them.

But if you can lock it down to people who have to approve changes, then you can build it into your sprint planning. You can build it into your business process. So raise a request for a third party, and we'll add it in our next release.

Then it will be tested properly.

Well, it has the chance of being tested properly. And I think that's it.

This is my last nautical slide.

The slides are up on Notist and on my blog. And if you have any questions you can ask me afterwards, or get me on Twitter.

So thank you very much.

(audience applauds) - That was awesome, thank you.

- Thank you. - Come on over.

- I feel like I'm about to pass out though. - Oh no (laughs).

Okay, well we won't keep you too long, then you can go and - Pass out. - underneath a pile of jackets somewhere.

Okay, so we had quite a few questions come in. Won't be able to get to all of them, but here are a few. So how do you track, sorry, I'm trying to understand how this is phrased. How do you track server side AB and other tools like RUM? - How do you track, well the performance impact I guess is the question.

Well with server side stuff there still has to be a change, and it'll be a change with the response to the HTML, one imagines.

And you can track that through an AB test.

So test with the optimisation or the RUM tool on and off. Every change to the website should be tested, and this is just another change.

- Have you come across situations where timer events from various third parties are cumulatively causing issues? I need coffee.

Can recursive timeouts maybe cause this, but individually each third party may be fine? - So whether they all build up together. And yeah, I think it's the straw that breaks the camel's back in the end, isn't it? We have seen third parties that have a setInterval with one millisecond, 'coz they're trying to wait for something to happen. And that can peg a CPU.

- It can be anything.

- Yeah, it's the wild west.

- Yeah, so somebody wanted to know which HTTP Archive query cost 50 euros? Or was that 50 pounds? - That was the Subresource Integrity query. 'Coz you have to query the body of the HTML document to find the integrity attribute.

- Okay. - Hey, it's all money to Google, they don't have enough of it yet. (laughs)

- Can Subresource Integrity be used to just flag a non matching hash and log that to console instead of blocking? And can that be set up on a per resource basis? - It can be set up on a per resource basis. - I feel we need to give you a whiteboard right now (laughs). - But if it doesn't match the hash it will be, you have a CSP directive to enforce it, I'm not sure what all the browsers do.

I don't know if anyone else has an answer to that? (indistinct audience comment) No, so it will always be blocked, but with CSP you can enforce SRI, but that means that a script won't be executed if it doesn't have the integrity attribute. So no, you can't just check for when it changes. - We should just get everybody here who works for a browser to just stand in a circle so that everybody can - Ask them the tricky questions.

- ask you those questions during the break. So somebody else asked, they said, I heard vendor bundles are anti patterns, is that true? - Is that question for Tim? - (laughs) Maybe. - I'm gonna say yes, it is true.

- I like that, it's just really binary.

Yes. - Yep.

Delete them.

- All right so I think we're gonna call it for (laughs) we're gonna call it right now, so everyone, thank you again Simon, that was wonderful.

- Thank you for having me.