We have another kind of, well, we can't call it a deathmatch session, but we have a very focused session around performance. And obviously, performance has been a theme throughout a lot of what we've done this year, and that's because it's incredibly important to us.
But I also think we're seeing a maturing of patterns and approaches and tools and technologies around performance.
So I don't think it's gonna go away anytime soon. So in this session, we've got two quite different presentations around performance, and we'll get to each in turn.
We're gonna begin with Yoav Weiss.
So Yoav has made really significant and important contributions to the web that we know today. And like many who have, he's not necessarily super well known for that.
But I like to call a couple out, because they are really significant.
So he is more or less responsible for the picture element and all that stuff being in both WebKit and Blink. So he worked in the Responsive Images Community Group, I think, which is a fantastic initiative that, if you're not familiar with it, helped move responsive images forward in the browser.
And if you're not familiar with it, I recommend you go and have a look around what they did. But basically, they're responsible for the whole idea being implemented, and then Yoav basically crowdsourced it. He said, "Look, I need a few bucks so I can spend some time just implementing this stuff in WebKit and Blink." And lo, people paid him and he did.
And now one of the principal reasons why we have that stuff in major browsers is because Yoav actually implemented it.
He is now paid by Akamai to do similar things. He works on standards, and I mentioned yesterday, he, Marcos and Chris Wilson.
So Chris Wilson, very briefly, I'm gonna do a shout out. So Chris is now on the Google Chrome team, but he was on the IE team for many, many years, the Windows IE team all the way back to IE3, and he is more or less the reason why CSS got adopted in browsers in the first place. So IE adopted CSS even before it was, by any means, fully standardized.
And prior to that, he actually worked on the original Mosaic browser, right? Way back in the day, NCSA Mosaic.
He worked on the UNIX version of that, or perhaps he worked on the Windows version. Anyway so he's worked on probably all major code bases of all browsers and together, those three are the cochairs of the Web Platform Incubator community group that we referenced as well and making tremendous contributions to the way in which we develop standards, and I think really accelerating how the web is evolving.
So huge thanks to Yoav and Marcos and Chris as well for all those contributions.
Yoav is going to talk about the bane of our life, which is, from a performance perspective, the stuff you can't control because it's third-party stuff coming into your website for all kinds of reasons, all right, that you have little, if any, control over. And he's gonna talk about how we can address that and challenges and opportunities around that and how we can hopefully make our lives better as developers dealing with third-party stuff coming into our sites.
So to talk all about that, would you please welcome and thank him for his wonderful contributions to the web, Yoav Weiss. (applause) - Thanks, John.
So yes, hi.
I'm Yoav Weiss, I work for Akamai.
I work on making both browsers and CDNs faster, and I'm here to talk to you about third-party content. And the reason I wanna talk about third-party content is that our ecosystem is broken and everyone is suffering as a result.
Because you folks, you go to conferences, you hear people talking about performance, performance budgets, about optimizing the critical rendering path, compressing images, making them responsive, and you take that to heart.
You go back to your organizations and you implement all of those things and you make sure your sites are as performant as they can be. And then a requirement from business comes along to add this one HTML tag, and this tag brings along its tag friends with it, which bring their tag friends.
And together, all these tags relieve themselves all over your hard performance work.
(audience laughter) And you are obviously frustrated as a result and the users are suffering from bad performance. But whose fault is it? The business person in the story is not really evil, they're just doing their job, they're trying to make sure there's enough money to pay everyone's salaries so that they can pay rent and live indoors, which is kind of nice.
And the third parties are also not evil here, they're just trying to be the best third-party that they can be.
They're just trying to optimize their metrics so that their ads get the most clicks.
They can track user behavior so you can have better analytics, better data to take into account when you're working on parts of your site.
And they're just trying to pay their own rent. So the main problem here is that their metrics are not influenced by the user experience and by the end users' performance because our ecosystem is broken.
So this is what I wanna talk about.
I wanna first outline the problem, then talk about ways you can mitigate that problem today, work around it.
Basically, if you can't tackle the underlying issues, you can tackle the symptoms today and make sure that they're less harmful than they could be.
And eventually, I'll propose mechanisms that can fix the underlying issues and make sure that everyone's incentives are aligned and make sure that control is back where it belongs, in your hands.
So a year ago, The Verge put up an article talking about why the mobile web sucks mainly because of performance issues and mainly because mobile browsers are not trying hard enough and therefore, the browsing experience on mobile is not as good as it is on desktop.
An engineer working at Mozilla took that to heart and wrote a response post.
And as part of that, analyzed The Verge's own site. So The Verge were downloading about one meg in order to show the content of their site, which is, on the one hand, pretty hefty.
On the other hand, they do have a lot of shiny images so yeah, you can get away with that.
So it's not really reasonable to demand all of that from the browser while still expecting it to be performant because mobile has its limitations.
And I know The Verge's performance engineers, and they definitely don't want their site's performance dragged through the mud by third parties.
But the problem here is that they just have no say in the matter.
Once you bring in third parties, they can do pretty much whatever they want.
Another recent study showed that 50% of the data that we spend on mobile is spent on downloading ad content, which basically means that the ad industry is sponsoring network operators.
And ads keep getting worse, websites are devoting more and more screen real estate to ads.
They are intrusive.
They can be confusing, pretend to be the actual content, show our users download buttons that encourage them to install malware.
They can play audio or video in the background, and they can actually be used as a security attack vector. And what I'm about to say may sound revolutionary, but bear with me, I think that users don't like that, and the ecosystem responds.
So ad blockers have been on the rise in the last few years, starting out as desktop browser extensions. But as of iOS 9, they are also available on mobile as part of iOS 9's content blockers.
In web standards, we often refer to browsers as user agents. They are there to act on behalf of the user, and browsers provide extension mechanisms to enable users to do just that, take control over the content that they consume. And some browsers are even taking that to the next step and integrating ad blockers directly into the rendering engine, making sure that they're more performant and save memory because they're running as native code rather than extension-based code.
So Opera are experimenting with that.
A new contender in the browser market called Brave, one of its main promises is having a native ad blocker.
So this is what the users are doing.
At the same time, content embedders and content aggregators are trying to tackle more or less the same problem. They see their users suffer.
They see their users afraid to click on links because we have trained users that links are slow and expensive.
So what do they do? They build their own formats.
But you could build a faster website without it. The biggest advantage of AMP is not necessarily the raw speed but the fact that it's fast enough and the fact that it provides us with guaranteed performance. We and content embedders know that an AMP site is going to be fast enough.
And the same is true for these other formats, for Apple News, for Instant Articles.
And content embedders are therefore using that in order to give these formats better UI, better SEO ranking and more visibility. They show them to more users.
And the fact that they differentiate the UI also helps with training our users that these types of links are not slow and expensive. So they are using that in order to increase traffic to those sites, which means that these formats run a risk of forking the web. Publishers nowadays have to publish their content in these four different formats.
So HTML, AMP, Instant Articles and Apple News if they wanna reach all these walled gardens. And the SEO and visibility benefits make sure that they kind of have to.
So advertisers, the advertising industry got the message and they realized that ad blockers and these formats pose a significant risk to their current ways, so they published two initiatives targeted at addressing the problem.
The first one is targeted at advertisers, asking them to make sure their ads are lightweight, served over HTTPS, have some form of opt-in in them and are noninvasive, and they have these UX guidelines for what noninvasive means. The second initiative is targeted at publishers and tries to get them to educate users regarding ad blocker usage.
So basically, they're telling publishers to detect ad blockers, explain the value exchange of advertising, ask users to disable ad blockers and then put them in a corner if they refuse to comply.
So we end up with things like this, where they're basically trying to nag their users into submission.
Now since we have a multibillion-dollar industry on the line here, there are new startups and new technologies that are built in order to address this need and block ad blockers and show the ads to users even though they'd prefer not to see them.
And obviously, the response to that is new and improved content blockers that make sure that the ads are blocked regardless. They study these new techniques and work around them. So we have an arms race in the war on ads.
Like most arms races, the main people that benefit are the ones providing the ammo.
So what can we do today regarding third parties? How can we mitigate the performance penalties that they bring along? The first method that you can use to mitigate third parties is to load them asynchronously.
So if you're including a script tag in a synchronous way in your HTML, you're basically stalling the HTML parser until that script has finished downloading and executing, which is a huge performance penalty.
This is something we've known in the performance community for a long while, and almost all, if not all, third parties provide asynchronous snippets that you can use instead of synchronously including those tags. So this is a good first step in order to mitigate the negative impact of third parties. But even if you load those scripts asynchronously, you still have arbitrary code running in the main context of your domain with full access to everything.
Browsers don't necessarily deprioritize async scripts, so these script downloads will still contend for bandwidth and potentially get priority over your own content.
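A minimal sketch of the asynchronous loading described above, not any particular vendor's snippet. The `ScriptHost` and `ScriptElement` interfaces are illustrative stand-ins for the browser's `document` and script element, declared only so the example is self-contained, and the URL in the usage comment is hypothetical.

```typescript
// Minimal structural types standing in for the browser DOM (assumption:
// in a real page the global `document` plays the role of ScriptHost).
interface ScriptElement {
  src: string;
  async: boolean;
}

interface ScriptHost {
  createElement(tag: "script"): ScriptElement;
  head: { appendChild(node: ScriptElement): unknown };
}

// Inject a third-party script without stalling the HTML parser.
function loadScriptAsync(doc: ScriptHost, src: string): ScriptElement {
  const script = doc.createElement("script");
  script.src = src;
  script.async = true; // download in parallel, execute when ready
  doc.head.appendChild(script);
  return script;
}

// Usage in a page (hypothetical URL):
// loadScriptAsync(document, "https://third-party.example/widget.js");
```

As the talk notes, this only removes the parser stall. The script still runs with full access to the page and still competes for bandwidth.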
The second method you can use to mitigate third parties, or in a way to accelerate third parties that are necessary for your page to load, so important third parties, is the new link relations that Tim talked about yesterday, preconnect and preload.
You can use those link relations to tell the browser that a certain host will be needed for downloading resources so that the browser can connect to that host ahead of time and outside of the critical path. And if you know the specific resource that will be needed, you can tell the browser to actually download that resource ahead of time. So both of these techniques are pretty awesome.
And just a full disclaimer, I worked on both of those, so I like them.
But they don't necessarily solve the whole problem.
So they don't do much to mitigate the fact that third parties are loading way more than they should be.
And on top of that, they're not something that you can apply to all third-party resources because in many cases, third parties are downloading resources that are dynamic.
So the resources change between page loads. So that often means that you don't necessarily know which hosts the resources will be downloaded from, and you don't necessarily know what those resources would be.
So preload and preconnect are great to speed up static third parties, but for dynamic ones, they're less efficient.
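To make the two link relations concrete, here is a small sketch that builds the markup and the equivalent `Link` response header for a static third party. The helper names and the example hosts are assumptions for illustration, and the URLs are assumed to be trusted strings that need no escaping.

```typescript
// Subset of the resource types accepted by the `as` attribute of rel=preload.
type PreloadAs = "script" | "style" | "image";

// Hint that a connection to `origin` will be needed soon.
function preconnectLink(origin: string): string {
  return `<link rel="preconnect" href="${origin}">`;
}

// Ask the browser to fetch a known resource ahead of time.
function preloadLink(url: string, kind: PreloadAs): string {
  return `<link rel="preload" href="${url}" as="${kind}">`;
}

// The same hints can be sent as an HTTP `Link` response header value.
function linkHeader(url: string, rel: "preconnect" | "preload", kind?: PreloadAs): string {
  const asPart = rel === "preload" && kind ? `; as=${kind}` : "";
  return `<${url}>; rel=${rel}${asPart}`;
}
```

This only helps when the host or resource is known in advance, which is why it fits static third parties better than dynamic ones.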
You can also use service workers, which Marcos talked about yesterday, in order to mitigate some of the negative effects of third parties. So we talked about the fact that if you're loading resources synchronously, they can end up blocking your entire site.
That's what we often refer to as a single point of failure, a front-end single point of failure, and you can mitigate that with service workers by racing the resource download with a timeout.
And if the resource hasn't responded within a certain timeout, the service worker can return an empty response or a 404 instead, releasing the browser to carry on with parsing and page construction and making sure that the users actually see the page rendered.
So this is one method you can use with service workers to mitigate the risk of synchronous loading. You can also use service workers to log third-party requests that are not necessarily successful.
So there are other methods to see the requests that third parties are making.
Resource timing, for example, but it only gives you the resources that were downloaded successfully.
And also, it doesn't give you that in real time. Service workers enable you to log all resource requests in real time, even the ones that weren't successful, which is great and solves one aspect of the third-party problem, but it does not give you a full solution for the issues that they introduce.
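The idea of racing a resource download with a timeout can be sketched as a generic helper. This is one possible shape under stated assumptions, not code from the talk: `raceWithTimeout` is a hypothetical name, and the service worker wiring in the trailing comment (including `isThirdParty` and the 3000 ms value) is illustrative only.

```typescript
// Resolve with `work` if it settles in time, otherwise with `fallback()`.
function raceWithTimeout<T>(
  work: Promise<T>,
  timeoutMs: number,
  fallback: () => T,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // If the timer fires first, give up on the slow resource.
    const timer = setTimeout(() => resolve(fallback()), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Service worker wiring (sketch; isThirdParty is a hypothetical helper):
// self.addEventListener("fetch", (event) => {
//   if (isThirdParty(event.request.url)) {
//     event.respondWith(
//       raceWithTimeout(fetch(event.request), 3000, () => new Response("", { status: 404 })),
//     );
//   }
// });
```

The same fetch handler is also a natural place to log every third-party request, including the ones that fail.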
Content security policy, also called CSP, is a security-oriented standard which enables you to send down policies to the browser and have the browser enforce them.
So these policies can, for example, indicate which hosts the browser can connect to in order to download various resource types.
And while these policies are security-oriented, we can also use them to make sure that third parties are not connecting to hosts that we don't want them to connect to. So if we have third parties that we know only need to connect to a certain set of hosts, we can limit them using content security policy. If we know that we don't want our page to download any kind of video or audio, we can enforce that using content security policy.
And if third parties try to do that regardless, they will get blocked by the browser.
The main problem there is that there is currently no way to enforce content security policy on frames. There is a specification that tackles that, but it's not yet implemented.
So for some scenarios, it may not be useful as a current mitigation technique.
It also doesn't enable us to control everything we wanna control with third parties because it's a security-oriented feature. But we can use it for some of the things.
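As a rough illustration of limiting hosts and media with CSP, the sketch below assembles a policy string from a directive map. The `buildCsp` helper and the analytics host are assumptions made for the example; the directive names themselves (`default-src`, `script-src`, `media-src`) are standard CSP directives.

```typescript
// Map of CSP directive name -> allowed sources.
type CspDirectives = Record<string, string[]>;

// Serialize directives into the value of a Content-Security-Policy header.
function buildCsp(directives: CspDirectives): string {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Only allow scripts from our own origin and one known third party,
// and block all audio and video.
const thirdPartyPolicy = buildCsp({
  "default-src": ["'self'"],
  "script-src": ["'self'", "https://analytics.example"],
  "media-src": ["'none'"],
});

// Sent as a response header: Content-Security-Policy: <thirdPartyPolicy>
```

With a policy like this, a third party that tries to load a script from another host or play media is blocked by the browser rather than by convention.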
So the last technique, and probably the most useful one to control third parties, is iframes.
Iframes, which I'm sure you're all familiar with, basically enable you to create a child document inside your main document, which is a separate context. So everything that's running in the context of the iframe doesn't have access to the main context's DOM. It doesn't have access to run arbitrary code on your domain.
So we've talked about scripts that are run inside the main context, and they can basically do everything in the page.
You give them a blank check and effectively, it's a cross-site scripting attack on your site, which you allow and then trust that third party not to abuse.
Iframes enable you to step away from that and contain the third parties, making sure that they don't have access to the DOM.
Iframes can also be further constrained using the sandbox attribute.
So sandbox is a feature built on top of iframes that enables you to shut down functionality that the frame would otherwise have.
So it enables you to prevent scripts from running, prevent alerts or other modal dialogs from showing, prevent form submission, plug-ins, access to the top navigation, et cetera.
So it can really constrain what an iframe can do, but the problem is third parties without scripts usually are not very useful.
They cannot really do what they need to do, so you can, for example, allow scripts to run or lift other sandbox limitations.
Each one of the sandbox limitations can be lifted using various allow values on the sandbox attribute.
So sandbox is very useful if you wanna basically constrain what your iframes are allowed to do to just the bare minimum that they need in order to be functional.
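A small sketch of granting a frame only the bare minimum: the helper below builds an iframe tag whose `sandbox` attribute lifts just the listed restrictions. The helper names and the ad URL are hypothetical; the `allow-*` tokens are a subset of those defined for the sandbox attribute.

```typescript
// Capabilities that can be re-enabled inside a sandboxed iframe (subset).
type SandboxAllow =
  | "allow-scripts"
  | "allow-forms"
  | "allow-popups"
  | "allow-modals"
  | "allow-same-origin"
  | "allow-top-navigation";

// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Build an iframe tag whose sandbox lifts only the listed restrictions.
// An empty list yields sandbox="" which applies every restriction.
function sandboxedIframe(src: string, allow: SandboxAllow[] = []): string {
  return `<iframe src="${escapeAttr(src)}" sandbox="${allow.join(" ")}"></iframe>`;
}

// Example: an ad slot that may run scripts but nothing else.
// sandboxedIframe("https://ads.example/slot.html", ["allow-scripts"]);
```

Starting from an empty list and adding tokens one at a time keeps the frame at the minimum it needs to function.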
The main problem there is that not all third parties support being iframed because, for various reasons, they need access to the main page.
For various reasons, they need access to the DOM. Ads often have visibility requirements.
They wanna know whether their ads are actually visible to the user in order to know whether they can charge the advertisers and pay the sites.
And analytics providers also often need to know what the user is doing and where the user is on the page in order to create a report.
SafeFrame is a framework from the IAB, the advertising industry's Interactive Advertising Bureau, that enables ads to be iframed while still getting the info that they need from the DOM. So the SafeFrame script is running in your context and is communicating the data that they need to the various frames using postMessage.
The frames ask it questions, and it returns responses regarding things that happen in the DOM while not giving them full access to the DOM.
So this can be very helpful.
It's a model that makes a lot of sense, to constrain third parties rather than giving them full access.
The main problem there is that not all ad providers support SafeFrame yet, but some of the big ones do.
So if you're integrating ads, you may wanna look into integrating them using SafeFrame and iframes.
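The question-and-answer model can be sketched as follows. This is explicitly not SafeFrame's API: the message shapes, `answerFrameQuery`, `readPageState` and the ad origin are all hypothetical, chosen only to show a host page answering a fixed set of questions over postMessage instead of exposing the DOM.

```typescript
// Questions a frame may ask the host page (hypothetical protocol).
type FrameQuery =
  | { kind: "viewport-size" }
  | { kind: "slot-visible-fraction"; slotId: string };

// Snapshot of the page state the host is willing to share.
interface PageState {
  viewport: { width: number; height: number };
  visibleFraction: Record<string, number>; // slot id -> 0..1
}

type FrameAnswer =
  | { kind: "viewport-size"; width: number; height: number }
  | { kind: "slot-visible-fraction"; slotId: string; fraction: number };

// Answer only the allowed questions; the frame never touches the DOM itself.
function answerFrameQuery(query: FrameQuery, state: PageState): FrameAnswer {
  switch (query.kind) {
    case "viewport-size":
      return { kind: "viewport-size", ...state.viewport };
    case "slot-visible-fraction":
      return {
        kind: "slot-visible-fraction",
        slotId: query.slotId,
        fraction: state.visibleFraction[query.slotId] ?? 0,
      };
  }
}

// Browser wiring (sketch; real code should also validate event.data):
// window.addEventListener("message", (event) => {
//   if (event.origin !== "https://ads.example") return; // trusted origin only
//   const source = event.source as Window | null;
//   source?.postMessage(answerFrameQuery(event.data, readPageState()), event.origin);
// });
```

The design choice is that the host decides which facts leave the page, so a frame can learn whether its slot is visible without being able to read or modify anything else.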
But at the end of the day, all this is great, but it doesn't solve the whole problem.
Third parties, even when iframed, can still download arbitrary amounts of data. They can still run arbitrary amounts of code and therefore, they can hog the CPU.
In most browsers today, that would be the main thread CPU.
So when an iframe runs a lot of code, it can freeze the main UI thread.
It can cause jank on your main content.
So are there any third-party providers in the crowd, people who work for the ad industry? Oh, no one? There's two who raised their hands.
If you are a third party that wants to be a better web citizen, what can you do? How can you be a better citizen besides the obvious, not downloading a lot of resources and working well inside of iframes? So there are a few new web standards that address the needs of third parties and make sure that they can better behave with the site's content and with the browser.
So the default form of touch events, active touch events, is inherently slow because touch events can be canceled, which means that the browser has to wait until the touch event handler has finished running before it can actually act on the touch event. That often results in lag: the user is touching the screen in order to scroll or click on a link, and we have to wait for the touch event handler to finish before the browser can actually act on it.
Passive touch events are the response to that. They basically give the developer a way to say, "I will not cancel this event," and therefore the browser can act on the event as soon as the user has actually touched the screen, running the touch handler in parallel. So this is something that enables browsers to get rid of a lot of jank that is related to touch event handlers.
And analytics providers and ads really like to know what the user is doing, whether they're touching the screen and which areas of the screen they're touching.
And this enables them to do that without runtime performance implications.
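A minimal sketch of declaring a touch handler as passive. The `TouchTarget` interface is an illustrative stand-in for a DOM event target (in a page, `window` or an element), and `trackTouches` is a hypothetical helper name; the `{ passive: true }` listener option is the standard way to make this promise.

```typescript
// Minimal stand-in for a DOM EventTarget, declared so the sketch is
// self-contained.
interface TouchTarget {
  addEventListener(
    type: string,
    listener: (event: unknown) => void,
    options?: { passive?: boolean },
  ): void;
}

// Register a touch handler that promises never to cancel the event,
// so the browser can start scrolling without waiting for it to finish.
function trackTouches(target: TouchTarget, onTouch: (event: unknown) => void): void {
  target.addEventListener("touchstart", onTouch, { passive: true });
}

// Usage in a page (recordTouch is a hypothetical analytics callback):
// trackTouches(window, recordTouch);
```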
The second standard that is exciting on that front is intersection observers.
Intersection observers enable you to get notifications regarding various elements on the screen that are about to become visible.
And again, like I said, a lot of ads need to get notified regarding the visibility of their ads in different parts of the screen. And up until intersection observers, the only way to do that was to take over scroll events and touch events and basically continuously poll the various elements that they care about to see whether they are in the viewport or not.
That resulted again in a lot of jank.
And since a lot of websites have multiple analytics providers, all of them handled those events and all of them queried the DOM, queried the layout again and again to see whether things are visible or not.
Intersection observers enable us to get rid of this polling mode and basically get notifications whenever things are visible.
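As a sketch of the notification model, the function below processes the entries an intersection observer delivers and picks out the elements that are sufficiently visible. `VisibilityEntry` mirrors only the fields of `IntersectionObserverEntry` used here; `visibleIds`, `reportViewable`, the element id and the 0.5 threshold are assumptions for illustration.

```typescript
// The fields of IntersectionObserverEntry this sketch relies on.
interface VisibilityEntry {
  isIntersecting: boolean;
  intersectionRatio: number;
  target: { id: string };
}

// Return the ids of elements that are at least `minRatio` visible.
function visibleIds(entries: VisibilityEntry[], minRatio = 0.5): string[] {
  return entries
    .filter((e) => e.isIntersecting && e.intersectionRatio >= minRatio)
    .map((e) => e.target.id);
}

// Browser wiring (sketch): the browser calls back on visibility changes,
// so no scroll handlers or repeated layout queries are needed.
// const observer = new IntersectionObserver(
//   (entries) => visibleIds(entries).forEach(reportViewable), // hypothetical callback
//   { threshold: 0.5 },
// );
// observer.observe(document.getElementById("ad-slot-1")!);
```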
Okay, so we talked about the problem and ways to mitigate it, but how can we fix the ecosystem? How can we align everyone's interests with those of our users and toward a performant user experience? So the first attempt at civil disobedience with regard to the ad industry was the Do Not Track proposal a few years back. And the proposal suggested that when users opt in to not be tracked, browsers will add the DNT header to outgoing requests.
And then the server will be aware of that user's wishes to opt out of tracking and will respect that. It turns out that if you ask your users explicitly do you want to be tracked, yes or no, the answer would probably be no.
So this is what IE tried when they first added Do Not Track support.
They added that as a question as part of the installation process rather than hiding it in a menu, and that meant that overnight, a lot more people started browsing the Internet with Do Not Track request headers, which meant that overnight, the ad industry, which previously supported that initiative, backed out and decided to be less supportive of it.
So DNT still lives as a toothless header that no one respects and that does nothing in particular. Since there is no technical way to enforce respecting it, it's pretty much meaningless.
And my main takeaway from that effort was that whatever policy we try to enforce, it has to be enforceable by the browser.
It cannot rely on the goodwill of all parties involved. The second technology that is interesting in that respect is content security policy, which we talked about as a mitigation method, but more than anything, it's a mechanism that proves that content owners can apply different policies and the browser can be used to enforce them.
And it also includes a lot of interesting mechanisms such as a report-only mode, which enables you to test policies and see whether or not they would break the site without actually breaking it.
So you get violation reports and you know basically if you can turn on whatever policy you're trying to enforce without taking a risk while doing that.
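A short sketch of how report-only mode is selected: the same policy is sent under a different header name, with a reporting endpoint attached. The `cspHeader` helper and the report URL are assumptions; the two header names and the `report-uri` directive are part of CSP.

```typescript
type CspMode = "enforce" | "report-only";

// Pick the header name for the chosen mode and append a reporting endpoint.
// In report-only mode the browser sends violation reports but blocks nothing.
function cspHeader(policy: string, mode: CspMode, reportUri: string): [string, string] {
  const name =
    mode === "enforce"
      ? "Content-Security-Policy"
      : "Content-Security-Policy-Report-Only";
  return [name, `${policy}; report-uri ${reportUri}`];
}

// Trial run of a policy before enforcing it (hypothetical endpoint):
// cspHeader("media-src 'none'", "report-only", "https://example.com/csp-reports");
```

Once the reports show no unexpected violations, switching the mode to `"enforce"` turns the same policy on for real.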
And another thing that I thought this effort should tackle is the other team phenomenon.
So in many organizations, performance teams act as police officers and enforce performance best practices on other teams that are potentially less interested. That can be done with a performance budget which is enforced through a build process.
And it can work for some things but less so for others, and it would be interesting to have a way to do that in the browser.
And basically, if that other team is introducing features or bloat to your site, you want the browser to blow up in their face, break the site and make them feel bad about themselves.
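For comparison, a build-time performance budget check might look like the sketch below. The category names, byte limits and the `checkBudget` helper are all hypothetical; the point is only that the build compares measured sizes against agreed limits and fails when one is exceeded.

```typescript
interface BudgetViolation {
  category: string;
  actualBytes: number;
  limitBytes: number;
}

// Compare measured sizes against the budget and list every overrun.
// Categories missing from `actual` are treated as zero bytes.
function checkBudget(
  actual: Record<string, number>,
  budget: Record<string, number>,
): BudgetViolation[] {
  const violations: BudgetViolation[] = [];
  for (const [category, limitBytes] of Object.entries(budget)) {
    const actualBytes = actual[category] ?? 0;
    if (actualBytes > limitBytes) {
      violations.push({ category, actualBytes, limitBytes });
    }
  }
  return violations;
}

// A build step could fail when the returned list is non-empty.
```

A check like this only covers what the build can see, which is why dynamically loaded third-party content is where browser-side enforcement would help.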
So all these were things I had in mind when I started talking about finding a standard alternative to AMP when AMP started to pick up steam.
And talking to the Chrome team, talking to people on the AMP team, we started discussing various ways to do that. The name content performance policy floated around, initially proposed by the AMP team, and a few months later I wrote a proposal that included many ideas and many features around that concept, some of them more realistic than others.
But the main intention here was to provide a way for content owners, both first parties and third parties, to declare to the browser that they are performant, in that they're not using some feature or not doing X, where X is something that is performance kryptonite.
Well, in order for that declaration to have any meaning, it has to be enforced by the browser, and the browser has to break the site or break that functionality that they're trying to use if they declared that they will not use it. So we had a few more meetings with the Chrome and AMP team and basically decided to break up that initial slightly too large proposal into four smaller and more reasonable ones.
The first one is feature policy, and its intention is to limit harmful web platform features: features that cause the browser to halt, interrupt the user's workflow or do a little bit of both.
And yeah, so the features that are included as part of that proposal are synchronous scripts, which we already talked about.
They block the HTML parser, they block it on the network until the script has finished downloading and running.
And we want to have a way to avoid them entirely. Synchronous XHR is even worse because it's blocking the main UI thread on the network.
So it provides a bad user experience, as the user cannot scroll, cannot do anything while a synchronous XHR request is in flight. It is something that is mainly used as a hack for analytics scripts.
So it's widely used in third parties.
And there's no real reason to use that because we have APIs such as sendBeacon, the Beacon API, that enable us to do the same thing without the negative performance impact.
So we want a way to shut down that feature.
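A minimal sketch of using `sendBeacon` where an analytics script might otherwise reach for synchronous XHR. `BeaconSender` is a stand-in for the part of `navigator` used here, and `reportAnalytics`, the collection URL and the payload are assumptions for illustration.

```typescript
// Minimal stand-in for the part of `navigator` used here.
interface BeaconSender {
  sendBeacon(url: string, data?: string): boolean;
}

// Queue analytics data for delivery without blocking the main thread.
// Returns false when the browser could not queue the data.
function reportAnalytics(nav: BeaconSender, url: string, payload: object): boolean {
  return nav.sendBeacon(url, JSON.stringify(payload));
}

// Browser wiring (sketch): send when the page is hidden instead of
// issuing a synchronous XHR during unload.
// document.addEventListener("visibilitychange", () => {
//   if (document.visibilityState === "hidden") {
//     reportAnalytics(navigator, "https://analytics.example/collect", { page: location.pathname });
//   }
// });
```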
document.write is also one of those awful features that keeps getting abused, often by ads.
The HTML parser has a lot of exceptions and a lot of special handling around document.write.
And if we can guarantee that a certain site has no document.write calls at all, we can potentially improve the parser speed and make HTML parsing significantly faster.
There are a few other features that are there as part of the proposal, mostly for security reasons because it's a mixed performance and security proposal. The initial syntax that we're discussing looks something like this.
And yes, it's JSON as part of the header, which apparently is soon to be a thing.
But the way that this works is that the policy is applied to the main context of the page and gets inherited by all iframes.
The main page can also enforce stricter policies on various iframes so it can shut down things that are allowed on the main context for certain iframes.
And it will also include a reporting mode similar to CSP that will enable us to test it out before turning it on for reals.
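The slide with the draft syntax is not part of this transcript, so the following is only a guess at what a JSON-in-a-header policy could look like. The header name, the `disable` key and the feature tokens are all assumptions for illustration and should not be read as the proposal's actual syntax.

```typescript
// Hypothetical feature tokens for the three features discussed in the talk:
// synchronous scripts, synchronous XHR and document.write.
type DisabledFeature = "sync-script" | "sync-xhr" | "docwrite";

// Serialize a policy as a JSON header value (assumed shape, see above).
function featurePolicyHeader(disabled: DisabledFeature[]): [string, string] {
  return ["Feature-Policy", JSON.stringify({ disable: disabled })];
}

// Example: a page that declares it uses none of the three features.
// featurePolicyHeader(["sync-script", "sync-xhr", "docwrite"]);
```

Whatever the final syntax, the important property is the one described above: the policy applies to the main context, is inherited by iframes, and can be made stricter for specific frames.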
The second part of the larger CPP proposal is resource size limits because we want to be able to limit the amount of resources and the amount of bytes that third parties are downloading as part of tackling that click fear.
We wanna be sure that we're not wasting our users' bandwidth and data plans.
And a very initial proposal to tackle that is the content size policy, which will operate most likely very similar to feature policy in terms of being applied to the main context and then inherited by iframes with some way to override those limits for specific iframes or enforce limits on specific iframes.
There is no concrete proposal.
The main hurdle there is privacy, because enabling first-party origins to know the size of third-party resources that are being downloaded is great when the first-party origin is a legitimate site, but less so when the first-party origin is a malicious site. So it can be abused to basically give out login info.
If you go to a malicious site and they had access to the resource sizes of third parties, they could sniff out whether you're logged into Facebook, Gmail or whatever other service.
There are various privacy-related attacks around that. So this is the main issue that we'll need to tackle in order to enable content size policy to be something that we can deploy.
The third part of the larger CPP plan is to have some form of CPU and bandwidth priority because today, there is no way to tell the browser this content is important and this content less so. So divide the CPU resources and bandwidth resources accordingly.
Therefore, in many cases, we end up with contention between the first-party content and third-party content. Ilya Grigorik from Google proposed something he called cgroups for the web, which would enable us to define various groups and their share of the CPU and bandwidth and then enforce that on various resources.
We're not yet sure that it will look like that, but that's the main thinking around the subject, having a way to define CPU share and bandwidth share for the main content, for important third parties versus less important third parties.
Basically, giving the browser enough info so it can enforce those priorities. And the fourth part, the harder part, is user experience. We talked about the fact that ads can be intrusive and can be confusing, but it's very hard to figure out a way to actually enforce that using browser-based mechanisms. The definition of annoying is very fuzzy and hard to enforce.
So this is something we still need to put some thought into. And if you have ideas on that front, I'd love to hear them.
And you may have noticed that none of this tackles the privacy aspects of the ad ecosystem.
The main reason for that is that tracking is done on the server, so it would be very difficult to come up with a browser-based mechanism that has some form of enforcement on that front. The browser has no visibility into what the server is doing with the user's data or the user's cookies once those are permitted. So the minute the browser enables third-party cookies, it's on the server to enforce any privacy regulation. So this is out of scope for this effort.
So what do we hope those policies will enable? Which change do we hope these policies will induce? So first, control for site owners, giving site owners a way to control their users' experience, which will hopefully lead to smarter embedders since embedders can have those guarantees as part of the policy that sites send.
Embedders can now provide the same benefits that they provide to the proprietary formats to any site that is guaranteed to be fast enough.
And potentially also ad blockers that can take ad performance into account rather than being based on whitelists and hosts.
They can take the actual performance of an ad into account and let fast ads through while blocking slow ones or unknown ones.
And eventually, the goal is happier users.
(applause) - Thanks so much, Yoav.
So while we set up Josh, we might have time for a question or two.
I know there's a lot in there, and this is definitely perhaps an opportunity to come and chat with Yoav, ask if you've got some specific questions and so on.
Just over here.
- Hi Weiss, that was a very exciting presentation. - Thank you.
- I'm a novice programmer on the front end and I was basically working on .NET and Java Beacon, recently transformed into a front-end developer based on various exciting features that the front end offers. I was working for a Bank of Montreal project last week and I can contradict with a fact that you were mentioning iframe sandbox for content security policy. We were given an iframe where we were asked to code inside the iframe.
And based on certain placeholders, we were asked to update an element value outside the iframe in the document, and I was using, assuming that there's a test element, say in a div, say id equals to text one, I was just using jQuery inside the iframe like dollar text one parent dot window HTML, and then we were updating the outside element value based on the placeholder displayed.
So you were mentioning that you will be giving, I suppose in this kind of scenarios for ad blockers an iframe where security breach is not possible. So how I was able to achieve it, I'm a novice programmer so just correct me if I'm wrong, or is it because exposing to a third party host or is it in a same host that I can clear on? - So I'm not sure I got the question, but you're basically asking how can we enforce CSP policies on the iframes today or-- - Yeah.
- Yeah, so if it's a collaborative iframe, you could just have the iframe include the content security policy headers or you could tell the iframe what the content security policy headers should be.
Otherwise, there is a current proposal that will enforce CSP restrictions on iframes, but it's not yet implemented. And that policy will enable the first party to declare the iframe CSP and then that policy will be sent as part of the request headers so that the iframe provider, assuming it's a third party, will know what the restrictions are.
Does that answer the question? - Yeah, yeah, yeah, I got it.
- Good, thanks very much for that.
We might head on to Josh and then there might be a chance at the end of Josh's presentation for questions for Yoav as well as Josh.
But until then, please thank Yoav Weiss.
- Thank you, thank you.