Designing Web Apps for Performance

So next up, we have Josh Duck.

So Josh is a born-and-bred Aussie from up in Brisbane, but he's been over working at Facebook for quite some time now.

He's worked on React and Relay and a whole lot of that open-source front end technology and so on. At the moment, a lot of what he focuses on is startup time. So the kind of architectures where we're kind of rendering on the client and so on, not specifically only for React but for a lot of different approaches, and even if we're using a more classic approach to architecting our applications.

That startup time is critical obviously for the sense of performance and engagement for our users. So Josh works a lot on that and has some thoughts around that, so please welcome Josh Duck to tell us all about rediscovering the server. (applause)

- All right, hi everyone.

Like John said, I'm Josh and I work at Facebook. So I work on JavaScript performance at Facebook. So we have a large application.

It's pretty huge.

It's an application with a lot of moving parts. So a lot of what we focus on is trying to make it possible to grow that application without limiting what engineers can do.

So we don't want to put constraints on what you can actually build, or slow down that initial page load.

And so as an industry, we all think a lot about performance.

So like John said, a lot of the talks at this conference are focused on performance.

We do things like we have best practices about where in the DOM certain tags should go. Script tags always go at the bottom, right? And we watch JavaScript benchmarks, so we always have, the to-do app is the kind of classic example, and we always say my framework's faster than your framework in this to-do application, and we also, we celebrate new browser features. So things like service worker that promise to solve performance problems for us.

But today, I wanna talk about more than just HTML and more than just JavaScript and more than just the browser.

So I wanna talk about the structure of the web as a platform.

So really, the performance of our applications, it's determined by how the web is structured. And if you kind of take a step back and look at it with fresh eyes, the web is actually a little bit strange.

So we always all carry phones with us, right? If you kind of compare this to the devices we had, say, 10 or 15 years ago, these are essentially supercomputers. So if you have an iPhone, you actually have 10 times the computing power that Deep Blue had when it defeated Garry Kasparov in that famous chess match. This is not Garry Kasparov or Deep Blue, but I like to think that this is a nice dramatic photo and it was probably a little like this.

And browser vendors, they put a lot of time and energy into creating really advanced pieces of software. So they can do things like WebGL, advanced CSS layouts like the grid and Flexbox proposals that we heard about yesterday.

And it can do things like streaming video and all of that kind of stuff.

But the browser, it can't actually do anything on its own. So without a data center that might be located thousands of miles away, the browser really can't do anything.

And so if someone were to propose this as an architecture today, you might think, hey, this is a little bit of a kludge.

If we have this supercomputer, why don't we just do everything on the device? Well, the answer to this is that building a good application, it's not just about raw computing power. My device, it has to be connected to what's happening out there in the world for it to feel alive. If I can't stream a video or chat to my friends on Messenger or connect to the Pokemon Go server, then my device, it just feels dead.

So the challenge that we all face as engineers is how do I get the information from out there and get it to my device when I need it? And so the web chose to answer this by adopting this model where we don't just have the client but we also have the server.

And it's this dependence on the server for application startup that really is the source of a lot of the performance problems that we see today.

But before we kind of get to that and how we can actually go about solving that, I actually want to take us kind of back to the beginnings of the web.

Because we're working in a system where we have these constraints, and it makes sense to try and understand why those constraints exist in the first place. So the web is actually 25 years old.

It's older than iOS and Android and Java and even Linux. And before any of that stuff existed, the web was created by scientists at CERN, specifically Tim Berners-Lee, to allow them to share information.

So way back in the beginning, it was never about how do we do text layout or how do we show images. It was trying to answer the question of how do we get information from different computers and different programs when we need it? And to do this, it had to be more than just an application. So you know, there was the client, which is the browser that we know and love, but there also had to be the web server, which provided the information that, that client needed. And it needed an interface between the two in the form of URLs.

So it was this interface that let the client ask the server for the data it needed.

So it lets the client say, hey, there's a resource out there, I know it's there, just go and get it for me. And so when we look at this architecture, this is obviously how we do it.

But if the web was just a client application and it didn't have the server or the URL part, then you'd be left trying to implement the web on top of some generic protocol like FTP, where there's lots of round-trips to fetch anything. Or you might try and have an archive where you'd download everything and then just pick out the files that you needed on demand.

And these would be super inefficient, like especially back in the early 90s, right, when bandwidth was very limited.

And so having the server and having URIs let us have this model where we could just load the parts that we needed on demand.

And so when we went to a webpage, we'd just download that one page.

And if we needed something else, because the user clicked on a link, then we'd just go and get that too.

And this kind of worked, but we moved on from static files in HTML directories. So we moved on to server rendering, and the reason for this is that we realized that people didn't want to just consume information, they also wanted to kind of give back.

So they wanted to blog and they wanted to shop and they wanted to post photos.

And to do this, the server had to learn to respond to user interaction.

And we couldn't just take the tools and languages and developer experiences that worked at the time for, say, Windows native development and use those.

Has anyone else used Visual Basic? Yeah, this was a lot of fun.

This was my first experience programming and I loved it, like I got to create really cool applications and it's really what got me into this.

But it just doesn't work on the web because on native, the application logic and the UI, they're in the same place, right? They're on the same thread.

But for the web, the application logic is on the server and the UI logic is in the browser and they're thousands of miles away.

So you have that high latency.

And so we had to build something new and what we ended up with was this kind of model that was pretty unique to the web, at least for mainstream development.

We'd build up the context from scratch every single time a request came in, and we'd fetch all of these, the data and all of that kind of stuff, and then we'd render some HTML and give it back to the browser and just say, hey, it's just a file. We had the stateless model, and it's not such an obvious conclusion to how you develop applications.

It's kind of insane to think that we have to hit the database every single time the user interacts with the page.

But it works, and the web actually became really popular on the back of this architecture.

So we had applications that people could use and they could interact with their applications and they could do it from anywhere and from any device and they didn't need to know anything to get started. They had a browser, they just needed to know the URL or the domain and they were good to go.

And for us developers, it made it much easier to launch applications.

So imagine trying to build something like Facebook as a desktop application.

You might send an installation CD to everyone you wanted to join, but that's a lot of CDs. But on the web, launching was just a matter of copying some files and putting them on a Web server and you're good to go.

But we're not actually here to talk about server rendering. We're here to talk about client rendering, and the reason is the web is undergoing this second transformation.

The first was that shift from static files to server rendering, and the second transformation is this move from server rendering to client rendering. And the reason we're seeing these is because we're starting to ask for more from our applications. So it was about 10 years ago that the iPhone came out, and it popularized this idea of delighting users with design.

And so it really raised the bar for what we asked of our applications, for how they looked and how they felt. And for server-rendered applications, it's really hard to be interactive and responsive. If every time I click on a button we have to go and make a request to the server, then we're never going to be fast.

We're limited by that round-trip time, which is limited by the speed of light, and that's not gonna change anytime soon.

So we all know the answer to this, right? It's JavaScript.

10 years ago, this wasn't such an obvious conclusion. So JavaScript was this kind of janky language that you'd use to sprinkle some functionality on your HTML document.

Maybe you can have an image that followed your mouse around the screen, but it was not a real language and not something you'd ever contemplate building an application in.

And this is real JavaScript by the way.

This is stuff that's floating around out there on the Internet.

Imagine trying to debug this without Firebug or Chrome dev tools.

Like the developer experience was just not polished. It was just a really immature ecosystem.

But thankfully, we moved on from this and we learned to be rock stars.

So this is a really embarrassing banner from way back in the day.

Yeah, that's shocking.

But yeah, what jQuery did is it kind of introduced this façade to the DOM and kind of papered over all of the janky bits that were hard to understand and it made us look at JavaScript, the language, with new eyes.

And from there, we kind of had this steady stream of JavaScript libraries that has kind of turned into a torrent.

But that's a topic for another talk.

And just like we discovered the right way to build server-rendered applications with that stateless model, and we codified that into abstractions like Django and Rails, we're learning to solve problems that work for the browser.

So things like data flow and view rendering. And we're codifying that into our own set of abstractions. But this is not the full story because by moving all of that logic to the client, we've introduced a new problem and that's load time. We've found that it's really hard to get performance right for client-rendered applications.

So a couple of years ago, we wanted to see how far we could push JavaScript on the web.

And so we built this prototype.

So it was a replacement for our mobile site. And we used React to render this, so it was all client rendered, all React rendered.

And using that, it was actually a really good developer experience.

So very quickly, we could build something that felt great to use.

But when we tested it in real-world conditions on real-world networks and real-world devices, we were just never happy with performance.

So we could never get it to be quite as fast as that server rendered application that we were comparing against.

And obviously, performance is really important. This is why we do all of this so that we can have great-performing applications and have a great experience.

We're not gonna turn around and ship a product just because we happen to like the developer experience better.

That's serving our needs and not the needs of our users. So this is kind of where we're at today.

So we have these great JavaScript libraries and all of these great tools, and we wanna be able to create great experiences to have that nice interactivity and everything that the user wants. But we're still figuring out how to make it fast. So like I said, the web is 25 years old, but really, we're using it in a brand-new way. And so some people just tell you that this is unsolvable. They'll say, "JavaScript is too slow," or, "React is too slow." But we didn't really think that this was the case. So we were actually shipping JavaScript and React in our native applications, and we were happy with that performance.

And that's because when we initialized JavaScript, we were loading it off the device.

So it wasn't JavaScript that was too slow, it was the constraints of shipping code on the web. And so there's a couple of solutions to how we go about solving this problem.

One is we cache everything really aggressively and we preload it.

And this is like pretending that we're a native application. So in this model, the server doesn't know anything about how our application is structured.

And so if I say I want the about page, it'll just say, here's the app bundle, that's all I know, you go and figure it out for yourself. The problem with this is that native applications, they set this terrible example for us to follow. So this is the top-rated weather app on the iOS App Store. It's a weather app.

It doesn't do that much complex stuff.

And it's 90 megabytes.

So this is not something that we should try and bring to the web.

We don't want this.

The idea of shipping 90 megabytes is kind of insane. We actually ship a megabyte of JavaScript code on Facebook's home page.

So suddenly, shipping two orders of magnitude more is not something we wanna do.

And yeah, we can kind of cache this stuff, but if the very first time I go to Facebook I just get a loading indicator because we're loading all of this JavaScript in advance, then I don't really care if you tell me it's gonna be faster next time because I wanna see what people are up to right now, I don't wanna have to wait.

And so there are some applications that try and do this. So you might recognize this.

Does anyone recognize this? Hands up.

Yeah.

So you recognize this and remember this because it's not a good experience and it doesn't feel like it belongs in the web.

We shouldn't ever recognize a loading indicator. That's not a good model to go for.

And the other problem that we found is that caching code is fundamentally incompatible with shipping on a regular basis.

So if every time we ship a new release we invalidate all of those caches that we worked so hard to build, we're not gonna get any value out of them.

So on our native applications, we actually ship code once every two weeks and then we let those application versions stay around for months or years after that.

And on those platforms, that was actually seen as really fast.

Like every two weeks, wow, that's so fast.

But we kind of started on the web where we're shipping multiple times a day and we're just like, "What are you talking about? This is, of course, how you would develop applications." Because we've actually found that releasing code more often, it led the product to be better and it made it more stable. So by releasing often, we were able to see very, very quickly if the product was behaving how we thought it should behave, if users were using it how we expected them to use it. And for engineers, they didn't have to rush to get their code and their diffs into a release branch so that it would go out to production. They could just ship something when they knew it was good and when they knew it was stable.

And so the other solution to this is universal JavaScript, which is also called isomorphic JavaScript.

And this is where we take those client templates and we render them on the server in advance. And so a lot of people are doing this.

It's actually really easy to plug into an existing client-rendered application for a lot of technologies.

And so this is really popular because we'll get the HTML down to the client and we'll get pixels on the screen really quickly and everyone will be happy because we have that high perceived performance. And so I don't wanna say isomorphic rendering is bad. It definitely has a lot of uses.

For our little friend the spider over here, it's great so that we can have SEO and have that search engine optimization.

But we found that, at least for us, it just really wasn't addressing the performance problems we were seeing. What it was doing was masking them.

It was making us kind of think, "Oh yeah, we're great. We developed this really fast application." But it wasn't fast, we were just kind of putting a facade in front of what the user was seeing.

So the reason we wanted to do client rendering in the first place is because we wanted our application to be interactive.

If I click on the Like button and nothing happens because the event listener isn't wired up, then I haven't given you a Like button at all. What I'm giving you is a photograph of a webpage. And so it's definitely a nice technology demo and I'm gonna give the server a gold star for trying really hard.

But it doesn't really solve that performance problem at all. It's just hiding it.

And so we still need the client to execute code quickly. And so neither of those things really solved our problem, and that's because they didn't address the root cause. And that's that this move to client rendering, it's taken us away from a world where we just loaded the parts we needed to this world where we have to load all of the application code upfront so that we can start executing.

And this is the exact problem that the web had been built to solve.

And the web hasn't changed, it's just that we're kind of using it differently now.

We kind of assumed that in that shift to client rendering, we moved all of the application logic across. We didn't need the server anymore, we'll just make it an API layer or a CDN.

But the server is exactly what let us have that nice incremental loading behavior.

So we wanna bring that back for client rendered applications too.

And to do this, we need to figure out that boundary between the client and the server, because having the right boundary is what lets you move application logic around. And especially for us, what we wanna do is we wanna divide the labor between the client and the server.

We don't wanna have to just pick one of these. And this boundary for server rendering, it was obviously URIs and HTML.

So we need to figure out what it looks like for client rendered applications too.

And to do this, we looked at the strengths of the client and the server.

And so the client is great at controlling cache, right? So we have local storage, IndexedDB.

We can control exactly what gets cached, whether it's resources or data or anything like that, and when we invalidate it, and we have a lot of control there. On the server, we obviously have caching via headers, but we don't have this really fine-grained control. And the client is obviously great at handling interaction because we can immediately update parts of the page when we need to.

And finally, with the help of things like service worker, which is really exciting technology, we can have offline-first experiences.

So we can render the page even if the network is down or while we're waiting for that initial request to be fulfilled.

But the client, it's never gonna be good at data fetching. We're always gonna be limited by that round-trip time. And again, to load code, we're having to touch the network. We're having to get those bytes to come down from the CDN or somewhere like that.

And finally, SEO is much easier to do for the server. And so what we want is to develop an architecture where we can do all of that stuff that the client is good at doing on the client and try and avoid all that stuff that it's bad at doing.

And I'd love to be able to tell you that there's a single way that we do this, it's just a bit we can flip and everything kind of magically works.

But unfortunately, in engineering, things don't often play out nicely like that. So we're gonna have to tackle these one by one. And to do that, I'm gonna show you a few different techniques that we use. And to keep it simple, I'm gonna say let's develop a simple application.

So this is a profile page.

It looks kind of like a Facebook profile page. There's some navigation, some images and videos. And if I asked you to render this on the server, then you might build something that looks like this. So firstly, we just get the user information based on the request that came in and we'd render the header bar.

We'd then check if we're on the feed and if we are, we'll get the story information and render either the images or the video.

And then finally, we can get the comments and render the comment interface and the Like interface. And so if we're running this on the server, it's gonna be a fairly good experience.

Like, we're doing a few roundtrips, but the server can do that well.
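A rough sketch of the server-side flow just described, assuming an in-memory object in place of a real database. The names getUser, getStories, getComments and renderProfilePage are hypothetical and are not taken from the slides:

```javascript
// Hypothetical in-memory data standing in for database lookups.
const db = {
  users: { 1: { id: 1, name: "Ada" } },
  stories: {
    1: [
      { id: 10, kind: "image", src: "cat.jpg" },
      { id: 11, kind: "video", src: "surf.mp4" },
    ],
  },
  comments: { 10: ["Nice!"], 11: [] },
};

// Each getter represents one round trip to a data store that sits close to the server.
const getUser = (userId) => db.users[userId];
const getStories = (userId) => db.stories[userId] || [];
const getComments = (storyId) => db.comments[storyId] || [];

const renderHeader = (user) => `<header>${user.name}</header>`;
const renderMedia = (story) =>
  story.kind === "video"
    ? `<video src="${story.src}"></video>`
    : `<img src="${story.src}">`;
const renderComments = (comments) =>
  `<ul>${comments.map((c) => `<li>${c}</li>`).join("")}</ul>`;

// Build the whole page from scratch for a single request and return plain HTML.
function renderProfilePage(request) {
  const user = getUser(request.userId);
  let html = renderHeader(user);
  if (request.tab === "feed") {
    for (const story of getStories(user.id)) {
      html += renderMedia(story);
      html += renderComments(getComments(story.id));
    }
  }
  return html;
}
```

Every call to a getter here is cheap because the data is nearby, which is the assumption that stops holding once the same code runs in the browser.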

If we were to take this code and run it on the client, which we can do because it's JavaScript, which works out nicely, it's not gonna go as well because firstly, we can't show anything until we've loaded that code upfront. So we're just gonna literally get an empty page. When we've finally loaded, compiled and parsed all of that code, the very first thing we do is we hit this get user block, and that means going back to the network.

And we'll do that and then we can render the header, but then we have to go and get the stories. And then after we do that, we have to go and get the comments.

And then finally, after we've done all of these roundtrips, we can show the page.

So this is not gonna be a good experience if we just take those patterns that work for server rendering and try and apply them to the client. If I map this out on the timeline, it would look something like this.

So the way we structured our code means we're doing a bunch of slow work in series.
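To put numbers on that timeline, here is a small sketch. The millisecond values are invented for illustration only and are not measurements from the talk; the point is simply that work done in series costs the sum of every step:

```javascript
// Hypothetical timings in milliseconds; real values depend on network and device.
const steps = [
  { name: "download, parse, compile and execute JavaScript", ms: 1500 },
  { name: "fetch user", ms: 300 },
  { name: "fetch stories", ms: 300 },
  { name: "fetch comments", ms: 300 },
];

// Steps that run one after another add up.
function serialTotal(stepList) {
  return stepList.reduce((total, step) => total + step.ms, 0);
}

// Lay the steps out end to end, the way they would appear on a timeline.
function timeline(stepList) {
  let start = 0;
  return stepList.map((step) => {
    const row = { name: step.name, start, end: start + step.ms };
    start = row.end;
    return row;
  });
}
```

With these made-up numbers the page cannot be shown until the last step ends, so removing or overlapping any step shortens the whole chain.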

And so we haven't developed for the strengths of the server and the strengths of the client.

So I'm gonna walk you through each of these kind of blocks here, and the first is this huge block of JavaScript preparation.

So we all kind of know that like loading JavaScript is expensive, right? We always have that network time.

So this is why we have things like tree-shaking and we have things like JavaScript minification. But it actually turns out that it's a little bit more nuanced than this because it's not just download time that we have to worry about.

So we actually also have to worry about the parse time, the compilation time and the execution time of that JavaScript.

And so we need to do all of these things before our application can really start.

So if you have a module system like CommonJS or some kind of require system, the body of each of your modules, it's just JavaScript, right? And so you need to execute that before that module exists. And so that means if you have a huge dependency tree, that's a lot of JavaScript you need to execute before you can really even get into application code. And so maybe we could try and do some push from the server. We could embed our scripts in script tags in the HTML, or we could use H2 push to get it down to the client quickly.

But this really doesn't touch the parse and compile cost at all, and it's kind of avoiding the real issue.

We don't want to just make it faster if we can avoid that work.
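To make the earlier point about module bodies concrete, here is a toy module registry in the spirit of CommonJS. It is an illustrative sketch, not Node's real loader, and define, requireModule and the module names are hypothetical:

```javascript
// A toy module registry (not Node's actual implementation).
const definitions = new Map(); // name -> factory function, i.e. the module body
const cache = new Map(); // name -> module objects whose body has already started
const executed = []; // order in which module bodies started running

function define(name, factory) {
  definitions.set(name, factory);
}

function requireModule(name) {
  if (cache.has(name)) return cache.get(name).exports;
  const mod = { exports: {} };
  cache.set(name, mod);
  executed.push(name);
  // The module only "exists" after its body has been executed.
  definitions.get(name)(requireModule, mod);
  return mod.exports;
}

define("leftPad", (req, mod) => {
  mod.exports = (s, n) => String(s).padStart(n, " ");
});
define("header", (req, mod) => {
  const leftPad = req("leftPad");
  mod.exports = (name) => leftPad(name, 6);
});
define("app", (req, mod) => {
  const header = req("header");
  mod.exports = () => header("Josh");
});

// Requiring the entry point runs every body in its dependency tree first.
const app = requireModule("app");
```

Requiring one entry point ends up executing three module bodies before any application code runs, and the same effect scales with the size of the dependency tree.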

I think Tim would approve of this slide.

He made this point that performance experts have to always have this slide somewhere in their deck. And the earliest versions of the web, they didn't ask how can we get all of the pages down to the client as fast as possible? They said, how can we get just the page that the user needs right now? So the problem with JavaScript is that our abstractions, they kind of force us to do more work than we need. So things like require statements and import statements, they're all synchronous.

So that means we have to resolve all of the functions before we can do anything.

So for example if I'm on the Feed tab, I still need render about to be resolved and initialized. Because if we don't and then we, for example, change tabs, we don't have a function to execute.

It's gonna be a runtime error.

So we need to do everything in advance.

And so there are some abstractions like AMD and System.import that let you asynchronously require modules.

But doing this kind of everywhere would be verbose, like especially in JavaScript where we love to have utility functions like our old friend left-pad here. Something as simple as doing some string manipulation might be calling into a different module, so we don't want to make every single function call asynchronous.
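A minimal sketch of why that gets verbose, assuming a hypothetical loadLeftPad loader that stands in for an asynchronous module import. Once a small utility is loaded asynchronously, every caller up the chain has to become asynchronous as well:

```javascript
// A small synchronous utility, in the spirit of left-pad.
const leftPad = (value, width) => String(value).padStart(width, " ");

// With synchronous requires, callers stay plain functions.
function formatRowSync(n) {
  return `[${leftPad(n, 3)}]`;
}

// Hypothetical async loader: pretend the module must be fetched before use.
const loadLeftPad = () => Promise.resolve(leftPad);

// The caller now has to await the module before it can do any work...
async function formatRowAsync(n) {
  const pad = await loadLeftPad();
  return `[${pad(n, 3)}]`;
}

// ...and so does every caller of that caller.
async function formatTableAsync(rows) {
  const lines = await Promise.all(rows.map(formatRowAsync));
  return lines.join("\n");
}
```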

And so what we wanna do is introduce boundaries between the different parts of our application in just the places that it makes sense.

And so we use routing to do this.

And so this gives us the biggest win without kind of getting in our way.

And so there are a lot of great routing libraries out there. We actually ended up building our own and it looks a little bit like this.

So we have this matchRoute function where we have a bunch of routes listed, and each of those is a function call.

And this interface means that we can actually move the require statements that are needed for each of those routes into the branch.

And because we've done that, we can use JavaScript scoping to kind of eliminate one of the branches if we know that we don't need it.

So for example, if we're rendering the feed route, we don't have to load the render about function until a later point in time.
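The slide itself is not reproduced in this transcript, so the following is only a guess at the shape of such a router. The matchRoute signature, the route patterns and the lazyRequire helper are all assumptions made for illustration:

```javascript
// Stand-in for a module loader; records which route modules were actually loaded.
const loaded = [];
const modules = {
  renderFeed: () => (params) => `feed for ${params.user}`,
  renderAbout: () => (params) => `about ${params.user}`,
};
function lazyRequire(name) {
  loaded.push(name);
  return modules[name]();
}

// Hypothetical matchRoute: runs only the branch whose pattern matches the path.
function matchRoute(path, routes) {
  for (const [pattern, handler] of routes) {
    const match = pattern.exec(path);
    if (match) return handler(match.groups);
  }
  return null;
}

function renderPath(path) {
  return matchRoute(path, [
    // The require for each route lives inside its own branch,
    // so a branch that never runs never loads its code.
    [/^\/(?<user>\w+)\/feed$/, (params) => lazyRequire("renderFeed")(params)],
    [/^\/(?<user>\w+)\/about$/, (params) => lazyRequire("renderAbout")(params)],
  ]);
}
```

Calling renderPath("/josh/feed") loads only the feed module; the about module stays untouched until someone navigates there.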

And then we can actually go one step further and generate a bundle for each of these routes automatically.

So we actually have a build step that will walk all of the AST of all of our JavaScript code and it will find all of the references to matchRoute and then all of the references to routes, and then all of the dependencies for those routes so we can build up this big map.

And that means that when someone comes to our site and they say, hey, I'm gonna hit the about page, we can look that up in the map and we can say, well, that maps to the about route, and the about route will need these resources. And so we can just serve them those resources and nothing else.
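As a sketch of what such a map could look like, assume the static analysis has already produced a dependency graph. The graph below is written by hand rather than extracted from an AST, and every module and route name in it is hypothetical:

```javascript
// Hypothetical output of static analysis: each module lists the modules it requires.
const dependencyGraph = {
  feedRoute: ["renderFeed", "header"],
  aboutRoute: ["renderAbout", "header"],
  renderFeed: ["leftPad"],
  renderAbout: [],
  header: ["leftPad"],
  leftPad: [],
};
const routeTable = { "/feed": "feedRoute", "/about": "aboutRoute" };

// Collect every module reachable from an entry point, visiting each module once.
function collectDependencies(entry, graph, seen = new Set()) {
  if (seen.has(entry)) return seen;
  seen.add(entry);
  for (const dep of graph[entry]) collectDependencies(dep, graph, seen);
  return seen;
}

// Build step: precompute the list of resources each route needs.
function buildResourceMap(routes, graph) {
  const map = {};
  for (const [path, route] of Object.entries(routes)) {
    map[path] = [...collectDependencies(route, graph)].sort();
  }
  return map;
}

// Request time: a plain lookup, with no analysis left to do.
const resourceMap = buildResourceMap(routeTable, dependencyGraph);
const resourcesFor = (path) => resourceMap[path] || [];
```

Here resourcesFor("/about") contains the about route, the header and left-pad, and nothing that only the feed needs.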

And by doing this, we can minimize the amount of JavaScript that we need to download and then parse and compile and execute.

But we still have to worry about these data fetching roundtrips.

And so these exist because of those calls to get data. So we have to get the user before we can know which stories we need to fetch, and we have to get the stories before we can know which comments we have to get. And so as engineers, we can actually look at this and we can see that we're defining a series of relationships between these different data types. So we actually have context that the interpreter doesn't because we understand why we have these variables and what they mean, when the interpreter just sees them as meaningless integers and arrays.

But even with this context we have, we can't speed things up.

We can't go and get the comments without knowing what IDs we care about.

And so what we're actually doing is we're asking the client to do something it's bad at doing, which are those data fetching roundtrips.

So ideally, we can move this logic to the server. And now the easy way to do this would be to build an endpoint for our profile page and just move all of that logic across.

We'll call it the profile and stories and comments page. And we actually used to do this quite a lot, but we found that it just doesn't work in the long term and in practice because as you add more products and more pages to your product, and as you add more versions, there's more and more code that you have to keep around in your server application and it just doesn't scale over time.

So we felt that, that abstraction or that boundary between the client and the server, it still wasn't quite right.

And so to help solve this, we built GraphQL. And what GraphQL lets you do is it lets you define the query that you want or the data that you want, and then you can send that to a server that will execute it and do whatever series of imperative steps are required to fetch that data.

And then it will give you this JSON response back at the end.

And if we take this and we put this in our application at the bottom here, it's not too different from what we had before, right? But what we've actually done is we've embedded that context about what comments are we talking about into our code. So here, it's really unambiguous.

We can say the comments are clearly the comments that apply to stories, and stories are the stories that apply to that user.

And so what we've done here is we've replaced a series of imperative JavaScript statements with a GraphQL, sorry, a declarative interface, which is the GraphQL query.

And declarative interfaces, they're great, right? We have a lot of declarative interfaces in the web world. We have HTML, we have CSS, all of this kind of stuff. And they allow us to separate the what and the how. So for our application, the what is the data or the shape of the data that we need.

And the how is how we actually go about fetching that, which is a series of database lookups.

And this allows us to define the query on the client but then execute it on the server.

And so we've implemented this boundary between the client and the server in the form of a GraphQL query. And by doing that, we can actually remove all of those extra roundtrips by just doing one round-trip to fetch that data.
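The difference in round trips can be sketched with a toy model. This is not real GraphQL and does not use any GraphQL library: a plain object stands in for the query, the queryText string only hints at what the equivalent GraphQL might look like, and the field names, callServer and executeQuery are all hypothetical:

```javascript
// Roughly what the query might look like in GraphQL syntax (illustrative only).
const queryText = `
  query ProfilePage {
    user(id: 1) { name stories { id comments { text } } }
  }
`;

const db = {
  users: { 1: { id: 1, name: "Ada", storyIds: [10, 11] } },
  stories: { 10: { id: 10, commentIds: [100] }, 11: { id: 11, commentIds: [] } },
  comments: { 100: { id: 100, text: "Nice!" } },
};

let roundTrips = 0;
// Stand-in for the network: every call from the client costs one round trip.
function callServer(handler, ...args) {
  roundTrips += 1;
  return handler(...args);
}

// Imperative client: each step needs the previous answer, so three round trips.
function loadProfileImperatively(userId) {
  const user = callServer((id) => db.users[id], userId);
  const stories = callServer((ids) => ids.map((id) => db.stories[id]), user.storyIds);
  const commentIds = stories.flatMap((story) => story.commentIds);
  const comments = callServer((ids) => ids.map((id) => db.comments[id]), commentIds);
  return { user, stories, comments };
}

// Declarative "what": the shape of the data the page needs.
const profileQuery = { name: true, stories: { id: true, comments: { text: true } } };

// Server-side "how": resolvers know where related records live.
const resolvers = {
  stories: (user) => user.storyIds.map((id) => db.stories[id]),
  comments: (story) => story.commentIds.map((id) => db.comments[id]),
};

// Walk the requested shape, following relationships next to the data.
function select(record, selection) {
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] =
      sub === true
        ? record[field]
        : resolvers[field](record).map((child) => select(child, sub));
  }
  return result;
}

function executeQuery(query, userId) {
  return { user: select(db.users[userId], query) };
}

// Declarative client: one round trip carries the whole query.
const loadProfileDeclaratively = (userId) => callServer(executeQuery, profileQuery, userId);
```

The dependent lookups still happen, but they happen on the server where they are cheap, and the client pays for a single trip.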

But we can actually do better than that.

We wanna remove all of the roundtrips that are part of our application startup.

So how do we go about doing that? So if we look at the code where we now have that GraphQL query, we can see that we have context again. We can see, okay, obviously, what we're gonna do when we execute this code is we're gonna go to the network. But the client and the interpreter, it doesn't know this. So what it needs to do is it needs to ask the server for the HTML page, load all of the JavaScript, start executing it.

It'll get to that very first statement, that GraphQL query and it'll say, you know what, I have to go back to the server again.

So there's really no way to avoid this, right? We have to have the code loaded so that we can execute it. So what we're actually blocked on here is loading that code. And so this is something that the server is good at doing because obviously, when a request comes in, the server's loaded up all of the application code. And if we, like, happen to refer to a module that we haven't loaded yet, we can literally just get it off disk.

So it's not an expensive operation.

So we wanna run just the data fetching parts of our application on the server.

Sorry, I'm a little lost here.

Oh, we're all good.

So we wanna run just the data fetching parts on the server, and we actually have a way of doing that because we're using these GraphQL statements. And so we actually have a second build step. So it, again, walks our AST and it looks for all of the references to GraphQL queries.

And then we can construct a map of the dependencies to the GraphQL queries that will execute.

So when someone comes in and hits our site, we look at the path they're requesting, we figure out which route that maps to, we figure out which dependencies are required for that route, and then we figure out which GraphQL queries will be executed as part of those dependencies. And so we can take those queries and then execute them on the server in advance.

And then what we'll do is return that as part of the initial HTML response.

So what we tell the client when we're sending them the HTML response is, we say, hey, here's all of the JavaScript that you require, and then when you're ready, when you initialize your application and you're good to go, here's all of the data that you'll need to actually run your application.
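One common way to hand data to the client inside the initial HTML is to inline it as JSON in a script tag. The sketch below assumes that approach; the talk does not say how the data is actually embedded, and the `PreloadedPage` shape, the `renderInitialHtml` function and the `window.__PRELOADED__` global are all hypothetical names.

```typescript
interface PreloadedPage {
  scripts: string[];              // URLs of the JavaScript the route needs
  data: Record<string, unknown>;  // results of the queries run on the server
}

function renderInitialHtml(page: PreloadedPage): string {
  const scriptTags = page.scripts
    .map((src) => `<script src="${src}" async></script>`)
    .join("\n");
  // Escape "<" so a string in the data cannot close the script tag early.
  const json = JSON.stringify(page.data).replace(/</g, "\\u003c");
  return [
    "<!doctype html>",
    "<html><body>",
    '<div id="root"></div>',
    scriptTags,
    `<script>window.__PRELOADED__ = ${json};</script>`,
    "</body></html>",
  ].join("\n");
}

const exampleHtml = renderInitialHtml({
  scripts: ["/static/feed.js"],
  data: { viewer: { name: "Ada" }, note: "</script>" },
});
```

On startup the client would read the global instead of issuing its first request, which is the round trip this technique is meant to remove.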

And so we call this preload mode.

And we actually found that having a parse step to extract all of this information in advance is both powerful because we don't have to do runtime execution.

But it's also really flexible.

So we actually have a really mature PHP stack. We have a variant called Hack, and we run this on all of our web servers.

And because we've already extracted these queries in advance, we don't even need to run JavaScript on the server if we don't want to.

We can actually have all of this functionality where you're executing the queries that are defined in JavaScript without having to run JavaScript. We can do it all in Hack.

And so by doing this, we can eliminate that round-trip that's part of our application startup.

And so obviously there's more techniques.

Things like service worker can massively reduce the amount of JavaScript we might need to load or reduce the overhead of that initial HTML response.

But they've been covered pretty well in this conference so I'm gonna leave it here today.

And so this is the final state of our application. And now we've been using these abstractions for a while now, so our ads code base, we actually have tens of thousands of JavaScript components and we use routing to minimize the amount of overhead and know exactly what we need to load on startup.

And on the news feed, which is the main page you get when you go to Facebook, we use React to render the comments, the chat sidebar and the interface where you create new posts.

And all of the data that we need to render those components, we return as part of that initial HTML payload. And so we know that these ideas work and they massively speed up our page load time, but they only really help us if engineers use them. So in the old days, if someone created a JavaScript application or a JavaScript component and they got performance wrong, then there wasn't a whole heap we could do to kind of speed it up.

So things like loading code and loading data, they're so core to how applications work that you can't really retrofit a solution after the fact. You're kind of stuck with what you got in the first place. And so to help with that, we built a framework called Relay. And so it combines GraphQL and React.

And with our internal setup of Relay, we give you all of the things that I've shown you today. You have routing, you have GraphQL, you have preload mode.

And we found that, that makes it much easier to do the right thing.

So we can have engineers build performant JavaScript applications without having to be JavaScript performance experts.

We've actually had people come up to us and they said, "Hey, we built a new product and it's 20% faster than the old one." And this is pretty awesome, right? But the even better thing is that we didn't know that they were building it in the first place. We didn't have to try and lead them toward the best solution or coach them on like all of this esoteric JavaScript they had to write to get things right.

They just naturally found the right solution. And so this is the ideal that we've been working towards. And we feel good about these abstractions and we feel like they belong on the web. We saw back at the beginning that the web server was created to solve this problem of getting the right information at the right time. No matter how big your website is or how many bundles you have in an application, we can load just the parts we need to get started. And this allows our applications and our websites to grow over time without slowing down the initial page load.

And to do this, we had to understand the strengths of the server and the strengths of the client. So we don't have to choose just one of these. They're different environments, they have different strengths, different weaknesses and they should be used for different things. And to get this right, we also had to nail that interface between the server and the client. Having the right interface is what lets you move application logic around or, even better, break it up and do different parts in different environments.

So routing, it let us load our application without having to load all of the JavaScript upfront. We could defer stuff until later.

And GraphQL, it let us ask for data without having to worry about round trips.

And preload mode, it gave us data without us even having to ask for it in the first place.

So for a client application, this is like magic, right? Who doesn't want just the free data provided to you? And so we really love the development experience that you get with JavaScript in the browser. You can create these really responsive and rich applications that you can access from anywhere. And by using the server in smart ways, we're able to let the client do more of that stuff that it's great at doing.

So the next time you're creating a web application or sorry, client-rendered application, I want you to think about more than just the HTML and JavaScript in your code base.

I want you to think about this platform where you get these two different but powerful environments that you can use.

So this is really rediscovering what the server's there to do.

It's there to empower the client and give the client what it needs to get started.

Because together, the client and the server, they form this platform, and it's a platform that gives us as engineers the tools we need to create fast applications. Thank you.

(applause) Thanks, thanks.

- Thanks so much, Josh.

We might have time for a question or two before we break for lunch.

Somebody got something for Josh or Yoav or (mumbles)? All right, Rob over here.

Sorry.

Oh.

- Hello, that was really interesting.

Thanks for doing the talk.

A question about the, so you've got what the user interface looks like sitting in the JavaScript itself.

So you've got initially nothing on the page and then basically JavaScript's gonna come vertically in the display.

Am I correct with that, just to start with? - Sorry, I missed the first part, what you said then. - Okay, got it.

So when you initially hit, say, the about HTML page, for example, is there anything rendered on it initially? Is there any server-side provided stuff like a frame, like some of the stuff mentioned yesterday? - Good question.

So we actually do a mix.

Traditionally, we've used a lot of PHP for server side rendering, and then we'll tend to put React code on top of that.

So we will augment the frame that's generated from PHP. We're actually moving away from that because you tend to end up with the worst of both worlds. You have maybe like a couple of seconds of PHP execution and then you have a couple of seconds of React execution after that.

So what you want to do is kind of like try and parallelize those so that you're doing them both at the same time, and having client-first execution actually helps with that.

So the server is just streaming in, say, like data or dependencies to the client, but it's able to execute as fast as possible. - Okay.

Some other questions for Josh? Oh, here we are.

That was Neil (mumbles), yup.

That was probably Bill.

(chuckles) - Hi Josh.

I'm interested in the way you split up the code to just download what you need at the beginning. - Yes.

- How much of that is a manual process or how much of that is now completely automated by your-- - Completely automated.

So we've had a build system for a while that basically analyzes how people use Facebook, and it will check what resources are loaded as part of the page load, as part of the page lifecycle. And then we use machine learning.

It's less fancy than it sounds but it will basically generate the optimal package configuration across all of Facebook.

And so what we do when you come to Facebook, we actually determine which dependencies you need, and then we figure out which bundles are required to fulfill those dependencies.

- Any of those in the client or is it always download a new bundle when it goes to the site? - We use the browser cache where possible.

So we do that methodology of just like hashing based on the contents.

And so if you download a resource and you ever need it again, hypothetically, the browser cache will fulfill that.
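Hashing based on the contents usually means the file name (or URL) is derived from the bytes of the file, so a changed file gets a new URL and an unchanged file keeps hitting the browser cache. Below is a minimal sketch of that idea. It uses a small non-cryptographic FNV-1a hash purely to stay self-contained; a real build would more likely use a cryptographic digest, and the names `fnv1a` and `hashedFileName` are assumptions for this example.

```typescript
// 32-bit FNV-1a over UTF-16 code units; a stand-in for a real content digest.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Left-pad to eight hex digits so names have a fixed width.
  return ("0000000" + hash.toString(16)).slice(-8);
}

// The URL changes only when the contents change.
function hashedFileName(name: string, contents: string): string {
  return `${name}.${fnv1a(contents)}.js`;
}

const before = hashedFileName("feed", "console.log('v1');");
const unchanged = hashedFileName("feed", "console.log('v1');");
const after = hashedFileName("feed", "console.log('v2');");
```

Here `before` and `unchanged` are the same name, so the browser can reuse its cached copy, while `after` differs and forces a fresh download.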

- So if you went to a page that then had, reused half the same components, would it download those components again in the new bundle or would it get a bundle that's just the bits it was missing? - No.

It's a one-to-one mapping between, every module only exists in one bundle.

And so that's why we wanna actually get the optimal bundling configuration because if you happen to require something because you have one module that you care about out of say like 20, that's not a great hit rate, right? You're downloading 19 modules that you don't need. - Right, so one page can load multiple modules-- - Yeah, we also do the kind of async loading. So we just load the subset of the page that's changed and leave the chrome static.

So as you browse around within a given session, you have the optimal kind of experience.
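To make the one-module-one-bundle rule and the hit rate concrete, here is a small sketch. The module and bundle names are invented, and the `bundleOf` table stands in for whatever configuration the build system generates.

```typescript
// Every module lives in exactly one bundle (hypothetical configuration).
const bundleOf: Record<string, string> = {
  Composer: "b1",
  Chat: "b1",
  Comments: "b2",
  AdsManager: "b3",
};

// Which bundles must be downloaded to satisfy a set of needed modules?
function bundlesFor(modules: string[]): string[] {
  const bundles = new Set<string>();
  for (const moduleName of modules) {
    const bundle = bundleOf[moduleName];
    if (bundle !== undefined) bundles.add(bundle);
  }
  return Array.from(bundles).sort();
}

// Fraction of a bundle's modules that the page actually needed.
function hitRate(needed: string[], bundle: string): number {
  const inBundle = Object.keys(bundleOf).filter((m) => bundleOf[m] === bundle);
  if (inBundle.length === 0) return 0;
  const used = inBundle.filter((m) => needed.indexOf(m) !== -1);
  return used.length / inBundle.length;
}

const neededBundles = bundlesFor(["Composer", "Comments"]);
const composerOnlyRate = hitRate(["Composer"], "b1");
```

In the example from the answer, needing one module out of a bundle of 20 would be a hit rate of 1/20, which is the kind of waste the generated configuration tries to minimize.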

- Cool, thanks. - [Josh] Cool.

- Anymore? Come on, there's gotta be more questions for Josh. - We've got one over in the right here.

- Oh right over here, oh.

Is it Jeremy? Yeah.

Sorry, I can almost see, sorry.

- Great talk.

I'm wondering about with GraphQL with error handling that for example, the stories endpoint had an error, how do you return that error in the response? - So this isn't really defined by GraphQL.

I think it's application-specific.

One of two things will happen.

One is that you could invalidate the entire request. Generally, we only do that if it's a catastrophic failure, like the server is overextended and we can't fulfill the request at all.

Typically, what we'll do is we'll null out the parts of the tree that we can't request for you. And so what you can actually do with GraphQL is you can request really anything.

You could say, for the story, give me the videos and the images and the 3D content and the audio and the, I don't know, whatever else, whatever content types.

And if it just happens to not exist, it's just like, well, you requested something that doesn't exist, there's just no key on the response that corresponds to that.

And so we can use the same thing when a story that should exist, we can't actually fulfill that. We can just omit it and it's like you're still gonna get a good experience because we'll show you the rest of the page or the rest of the application that we can actually give you.
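The partial-failure behavior described in this answer can be sketched as follows. This mirrors the answer as given (failed parts become null, things that do not exist get no key) rather than any particular GraphQL server, and `resolveFields` and the field names are hypothetical.

```typescript
type Resolver = () => unknown;

// Resolve each requested field independently so one failure
// does not invalidate the rest of the response.
function resolveFields(
  resolvers: Record<string, Resolver>,
  requested: string[]
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of requested) {
    const resolver = resolvers[field];
    if (resolver === undefined) continue; // nothing exists: no key at all
    try {
      result[field] = resolver();
    } catch (_err) {
      result[field] = null; // could not fulfill this part: null it out
    }
  }
  return result;
}

const story = resolveFields(
  {
    title: () => "Hello",
    video: () => {
      throw new Error("video service unavailable");
    },
  },
  ["title", "video", "audio"]
);
```

Here `story` keeps its `title`, has `video` set to null, and has no `audio` key, so the client can still render everything that did come back.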

- Any more out there? So I've got one, I guess.

Obviously, over the last 18 months or so or maybe less, last 12 months, the idea of that progressive web app, obviously service worker, whether you're using that to cache or whether you kind of build your own caching with IndexedDB, enabling or encouraging and allowing users to install on the home screen.

Are these things that you and your team have been thinking about, exploring? - We're really excited about service worker. Like, there's great potential there.

So I kind of alluded to this before, but like if we can load from service worker and have that offline-first experience, like, yes, you get that offline experience but like the better part is that you're actually starting from time zero.

It's like you get an extra 500 or 1000 milliseconds to actually do whatever JavaScript execution you want. And like if you're looking at your total page time being three or four seconds, then like getting an extra second is like Christmas, right? You know, free time is awesome.

So that's super exciting.

The thing about the app shell model is we're less excited about that because people don't come to Facebook to look at that blue bar that is at the top of the page. Like, as pretty as it is, it's like they come to actually see the content that people are posting.

So like we still have to optimize that code path to get that content down as soon as possible, and that's always gonna come from the server because the client can't cache content that it doesn't know about. - I guess then the idea of background sync that obviously Marcos was talking about yesterday as well, which is not really there yet but is certainly scoped out, so the idea that would you think to sync in the background-- - Yeah, so-- - Which is how Facebook works as a native-- - Exactly, yeah.

So that's pretty exciting.

So there's two parts of it.

Background sync, I think the thing that's specced out at the moment is for uploads so you can upload a media object without having to keep your browser open.

As to having push notifications when we say oh your friend posted some new content and we send that down to you, that's pretty exciting, but there's some limitations at the moment. The browser vendors are really being quite conservative about what they let you do with push notifications because if you imagine a hundred different sites all sending you push notifications and constantly waking up your service workers, it would be a bad experience and it would drain your battery.

So they're starting off by being very conservative and then opening it up as they kind of see how people are using it.

But that's totally the way that we're gonna be going in the future, is that like we will be pushing new content so that if we know that you're gonna load Facebook, we'll be able to show you content straight away and then incrementally load more stuff from the server after that.

- Right.

And I guess, or even just caching the most recent content that someone's had and they reopen off and just caching that-- - Yeah.

- Pulling that in before you even have fetched new content-- - Yeah, and that's a great difference, especially when you have chronological ordering, which Facebook famously does not have chronological ordering.

So you've got to be a little bit cautious about, do you want to show something from yesterday if they've already seen it and they didn't like it. So there's lots of kind of business concerns or product concerns that go into whether you wanna do that. - But you are definitely, like, these are the things you-- - Totally, yeah.

We're experimenting with service worker.

It's super exciting stuff.

- Fantastic, all right.

Any more questions from folks? Come on, give me one more.

I challenge you for one more.

(Josh chuckles) Looking at the clock, they're getting hungry. Right up the front here, and then we'll break, I promise. And you'll see what we've got in store for lunch. That's right, is we (mumbles) reputation's, you know, "There's gonna be great food we're gonna get today." But anyway, final question.

- I was just wondering if any of those static parsing libraries are available at all, or is it all just behind closed doors? - So we use Babel.

So it's amazing and it's awesome.

So I believe that's what we use for implementing most of this.

Our dependency management system is very, very coupled to our code base, and it wouldn't really make sense for us to open source it.

Like, I think, webpack does a pretty good attempt at solving all of that kind of stuff.

- Seb's working on something though, I was talking to him at dinner last night.

- Seb's working on, like, 10 different things. - Right, something around like NPM style or replacement NPM. - That's a little bit different.

So that's less about mapping dependencies and more about solving the problem of like if you have an NPM client that will do arbitrary JavaScript execution or sorry, arbitrary anything execution, you probably don't want to do that inside Facebook's corporate network.

(laughs) So we try and avoid that.

- Fair enough.

And hang on, I think I've derailed the answer to the question, or have we got there? - So we used Babel.

We're starting to use more and more open source stuff and also like bringing it inwards.

Like I think traditionally, we've been fairly, like, we'll push out open source stuff but like, we won't take a lot back.

But with Babel and that kind of stuff, it's been super useful.

And there's also like, what is it, Esprima and this jscodeshift.

And so that's more manual-based refactoring and AST exploration, but we're starting to use more of the open source stuff and having more collaboration with the open-source community.

- Thank you.

- Awesome, all right.

Let's, once again, thank Josh and Yoav.

Great session this morning.

$%^&Josh Duck.srt.txt#$%^

Opening title - So next up, we have Josh Duck.

So Josh is born and bred Aussie up in Brisbane, but he's been over working in Facebook for quite some time now.

He's worked on React and Relay and a whole lot of that open-source front end technology and so on. At the moment, a lot of what he focuses on is startup time. So the kind of architectures where we're kind of rendering the client and so on, not specifically only for React but for a lot of different approaches, and even if we're using a more classic approach to architecting our applications.

That startup time is critical obviously for the sense of performance and engagement for our users. So Josh works a lot on that and has some thoughts around that, so please welcome, Josh Duck, to tell us all about rediscovering the server. (applause) - All right, hi everyone.

Like John said, I'm Josh and I work at Facebook. So I work on JavaScript performance at Facebook. So we have a large application.

It's pretty huge.

It's an application with a lot of moving parts. So a lot of what we focus on is try to make it possible to grow that application without current limiting what engineers can do.

So we don't want to put constraints on what you can actually build, or slowing down that initial page load.

And so as an industry, we all think a lot about outperformance.

So like John said, a lot of the talks at this conference are focused on performance.

We do things like we have best practices about where in the dom certain tags should go. Script tags always go at the bottom, right? And we watch JavaScript benchmarks, so we always have, the to-do app is the kind of classic example, and we always say my framework's faster than your framework in this to-do application, and we also, we celebrate new browser features. So things like service work that promise to solve performance problems for us.

But today, I wanna talk about more than just HTML and more than just JavaScript and more than just the browser.

So I wanna talk about the structure of the web as a platform.

So really, the performance of our applications, it's determined by how the web is structured. And if you kind of take a step back and look at it with fresh eyes, the web is actually a little bit strange.

So we always all carry phones with us, right? If you kind of compare this to the devices we had, say, 10 or 15 years ago, these are essentially supercomputers. So if you have an iPhone, you actually have 10 times the computing power that Deep Blue had when it defeated Gary Kasparov in that famous chess match. This is not Gary Kasparov or Deep Blue, but I like to think that this is a nice dramatic photo and it was probably a little like this.

And browser vendors, they put a lot of time and energy into creating really advanced pieces of software. So they can do things like WebGL, advanced CSS layouts like the grid and Flexbox proposals that we heard about yesterday.

And it can do things like streaming video and all of that kind of stuff.

But the browser, it can't actually do anything on its own. So without a data center that might be located thousands of miles away, the browser really can't do anything.

And so if someone were to propose this as an architecture today, you might think, hey, this is a little bit of a clutch.

If we have this supercomputer, why don't we just do everything on the device? Well, the answer to this is that building a good application, it's not just about raw computing power. My device, it has to be connected to what's happening out there in the world for it to feel alive. If I can't stream a video or chat to my friends on Messenger or connect to the Pokemon Go server, then my device, it just feels dead.

So the challenge that we all face as engineers is how do I get the information from out there and get it to my device when I need it? And so the web chose to answer this by adopting this model where we don't just have the client but we also have the server.

And it's this dependence on the server for application startup that really is a lot of the, the source of a lot of the performance problems that we see today.

But before we kind of get to that and how can actually go about solving that, I actually want to take us kind of back to the beginnings of the web.

Because we're working in a system where we have these constraints, and it makes sense to try and understand why those constraints exist in the first place. So the web is actually 25 years old.

It's older than iOS and Android and Java and even Linux. And before any of that stuff existed, the web was created by scientists at CERN, specifically Tim Berners-Lee, to allow them to share information.

So way back in the beginning, it was never about how do we do text layout or how do we show images. It was trying to answer the question of how do we get information from different computers and different programs when we need it? And to do this, it had to be more than just an application. So you know, there was the client, which is the browser that we know and love, but there also had to be the web server, which provided the information that, that client needed. And it needed an interface between the two in the form of URLs.

So it was this interface that let the client ask the server for the data it needed.

So let's say hey, there's a resource out there, I know it's there, just go and get it for me. And so we look at this architecture, this is obviously how we do it.

But if the web was just a client application and it didn't have the server or the URL part, then you'd be left trying to implement the web on top of some generic protocol like FTP, where there's lots of round-trips to fetch anything. Or you might try and have an archive where you'd download everything and then just pick out the files that you needed on demand.

And these would be super inefficient, like especially back in the early 90s, right, when bandwidth was very limited.

And so having the server and having URIs let us have this model where we could just load the parts that we needed on demand.

And so when we went to our webpage, we just download that one page.

And if we needed something else, because the user clicked on a link, then we just go and get that too.

And this kind of worked, but we moved on from static files in HTML directories. So we moved on to server rendering, and the reason for this is that we realized that people didn't want to just consume information, they also wanted to kind of give back.

So they want to blog and they wanted to shop and they wanted to post photos.

And to do this, the server had to learn to respond to user interaction.

And we couldn't just take the tools and languages and developer experiences that worked at the time for, say, Windows native development and use those.

Has anyone else used Visual Basic? Yeah, this was a lot of fun.

This was my first experience programming and I loved it, like I got to create really cool applications and it's really what got me into this.

But it just doesn't work on the web because on native, the application logic and the UI, they're in the same place, right? They're on the same thread.

But for the web, the application logic is on the server and the UI logic is in the browser and they're thousands of miles away.

So you have that high latency.

And so we had to build something new and what we ended up with was this kind of model that was pretty unique to the web, at least for mainstream development.

We'd build up the context from scratch every single time a request came in, and we'd fetch all of these, the data and all of that kind of stuff, and then we'd render some HTML and give it back to the browser and just say, hey, it's just a file. We had the stateless model, and it's not such an obvious conclusion to how you develop applications.

It's kind of insane to think that we have to hit the database every single time the user interacts with the page.

But it works, and the web actually became really popular on the back of this architecture.

So we had applications that people could use and they could interact with their applications and they could do it from anywhere and from any device and they didn't need to know anything to get started. They had a browser, they just needed to know the URL or the domain and they were good to go.

And for us developers, it made it much easier to launch applications.

So imagine trying to build something like Facebook as a desktop application.

You might send an installation CD to everyone you wanted to join, but that's a lot of CDs. But on the web, launching was just a matter of copying some files and putting them on a Web server and you're good to go.

But we're not actually here to talk about server rendering. We're here to talk about client rendering, and the reason is the web is undergoing this second transformation.

The first was that shift from static files to server rendering, and the second transformation is this move from server rendering to client rendering. And the reason we're seeing these is because we're starting to ask for more from our applications. So it was about 10 years ago that the iPhone came out, and it popularized this idea of delighting users with design.

And so it really raised the bar for what we asked of our applications, for how they looked and how they felt. And for server-rendered applications, it's really hard to be interactive and responsive if every time I clicked on a button, we have to go and make a request to the server, then we're never going to be fast.

We're limited by that round-trip time, which is limited by the speed of light, and that's not gonna change anytime soon.

So we all know the answer to this, right? It's JavaScript.

10 years ago, this wasn't such an obvious conclusion. So JavaScript is this kind of janky language that you'd use to sprinkle some functionality on your HTML document.

Maybe you can have an image that followed your mouse around the screen, but it was not a real language and not something you'd ever contemplate building an application in.

And this is real JavaScript by the way.

This is stuff that's floating around out there on the Internet.

Imagine trying to debug this without Firebug or Chrome dev tools.

Like the developer experience was just not polished. It was just a really immature ecosystem.

But thankfully, we moved on from this and we learned to be rock stars.

So this is a really embarrassing banner from way back in the day.

Yeah, that's shocking.

But yeah, what jQuery did is it kind of introduced this façade to the dom and kind of papered over all of the janky bits that were hard to understand and it made us look at JavaScript, the language, with new eyes.

And from there, we kind of had this steady stream of JavaScript libraries that has kind of turned into a torrent.

But that's a topic for another talk.

And just like we discovered the right way to build server-rendered applications with that stateless model, and we codified that into abstractions like Django and Rails, we're learning to solve problems that work for the browser.

So things like data flow and view rendering. And we're codifying that into our own set of abstractions. But this is not the full story because by moving all of that logic to the client, we've introduced a new problem and that's load time. First that it's really hard to get performance right through client-rendered applications.

So a couple of years ago, we wanted to see how far we could push JavaScript on the web.

And so we built this prototype.

So it was a replacement for our mobile site. And we used React to render this, so it was all client rendered, all React rendered.

And using that, it was actually a really good developer experience.

So very quickly, we could build something that felt great to use.

But when we tested it in real-world conditions on real-world networks and real-world devices, we were just never happy with performance.

So we could never get it to be quite as fast as that server rendered application that we were comparing against.

And obviously, performance is really important. This is why we do all of this so that we can have great-performing applications and have a great experience.

We're not gonna turn around and ship a product just because we happen to like the developer experience better.

That's serving our needs and not the needs of our users. So this is kind of where we're at today.

So we have these great JavaScript libraries and all of these great tools, and we wanna be able to create great experiences to have that nice interactivity and everything that the user wants. But we're still figuring out how to make it fast. So like I said, the web is 25 years old, but really, we're using it in a brand-new way. And so some people just tell you that this is unsolvable. They'll say, "JavaScript is too slow," or, "React is too slow." But we didn't really think that this was the case. So we were actually shipping JavaScript and React in our native applications, and we were happy with that performance.

And that's because when we initialized JavaScript, we were loading it off the device.

So it wasn't JavaScript that was too slow, it was the constraints of shipping code on the web. And so there's a couple of solutions to how we go about solving this problem.

One is we cache everything really aggressively and we preload it.

And this is like pretending that we're a native application. So in this model, the server doesn't know anything about how our application is structured.

And so if I say I want the about page, it'll just say, here's the app bundle, that's all I know, you go and figure it out for yourself. The problem with this is that native applications, they set this terrible example for us to follow. So this is the top-rated weather app on the iOS App Store. It's a weather app.

It doesn't do that much complex stuff.

And it's 90 megabytes.

So this is not something that we should try and bring to the web.

We don't want this.

The idea of shipping 90 megabytes is kind of insane. We actually ship a megabyte of JavaScript code on Facebook's home page.

So suddenly, shipping two orders of magnitude more is not something we wanna do.

And yeah, we can kind of cache this stuff, but if the very first time I go to Facebook I just get a loading indicator because we're loading all of this JavaScript in advance, then I don't really care if you tell me it's gonna be faster next time because I wanna see what people are up to right now, I don't wanna have to wait.

And so there are some applications that try and do this. So you might recognize this.

Does anyone recognize this? Hands up.

Yeah.

So you recognize this and remember this because it's not a good experience and it doesn't feel like it belongs in the web.

We shouldn't ever recognize a loading indicator. That's not a good model to go for.

And the other problem that we found is that caching code is fundamentally incompatible with shipping on a regular basis.

So if every time we ship a new release, we invalidate all of those caches that we worked so hard to build.

We're not gonna get any value out of them.

So on our native applications, we actually ship code once every two weeks and then we let those application versions stay around for months or years after that.

And on those platforms, that was actually seen as really fast.

Like every two weeks, wow, that's so fast.

But we kind of started on the web, where we're shipping multiple times a day, and we're just like, "What are you talking about? This is, of course, how you would develop applications." Because we've actually found that releasing code more often led the product to be better and made it more stable. So by releasing often, we were able to see very, very quickly if the product was behaving how we thought it should behave, if users were using it how we expected them to use it. And engineers didn't feel they had to rush to get their code and their diffs into a release branch so that it would go out to production. They could just ship something when they knew it was good and when they knew it was stable.

And so the other solution to this is universal JavaScript, which is isomorphic JavaScript.

And this is where we take those client templates and we render them on the server in advance. And so a lot of people are doing this.

It's actually really easy to plug into an existing client-rendered application for a lot of technologies.

And so this is really popular because we'll get the HTML down to the client and we'll get pixels on the screen really quickly and everyone will be happy because we have that high perceived performance. And so I don't wanna say isomorphic rendering is bad. It definitely has a lot of uses.

For our little friend the spider over here, it's great, so that we can have SEO, that search engine optimization.

But we found that, at least for us, it just really wasn't addressing the performance problems we were seeing. What it was doing was masking them.

It was making us kind of think, "Oh yeah, we're great. We developed this really fast application." But it wasn't fast, we were just kind of putting a facade in front of what the user was seeing.

So the reason we wanted to do client rendering in the first place is because we wanted our application to be interactive.

If I click on the Like button and nothing happens because the event listener isn't wired up, then I haven't given you a Like button at all. What I'm giving you is a photograph of a webpage. And so it's definitely a nice technology demo and I'm gonna give the server a gold star for trying really hard.

But it doesn't really solve that performance problem at all. It's just hiding it.

And so we still need the client to execute code quickly. And so neither of those things really solved our problem, and that's because they didn't address the root cause. And that's that this move to client rendering, it's taken us away from a world where we just loaded the parts we needed to this world where we have to load all of the application code upfront so that we can start executing.

And this is the exact problem that the web had been built to solve.

And the web hasn't changed, it's just that we're kind of using it differently now.

We kind of assumed that in that shift to client rendering, we moved all of the application logic across. We didn't need the server anymore, we'll just make it an API layer or a CDN.

But the server is exactly what let us have that nice incremental loading behavior.

So we wanna bring that back for client rendered applications too.

And to do this, we need to figure out that boundary between the client and the server, because having the right boundary is what lets you move application logic around. And especially for us, what we wanna do is we wanna divide the labor between the client and the server.

We don't wanna have to just pick one of these. And this boundary for server rendering, it was obviously URIs and HTML.

So we need to figure out what it looks like for client rendered applications too.

And to do this, we looked at the strengths of the client and the server.

And so the client is great at controlling cache, right? So we have localStorage and IndexedDB.

We can control exactly what gets cached, whether it's resources or data or anything like that, and when we invalidate it, so we have a lot of control there. On the server, we obviously have caching via headers, but we don't have this really fine-grained control. And the client is obviously great at handling interaction because we can immediately update parts of the page when we need to.

And finally, with the help of things like service worker, which is really exciting technology, we can have offline-first experiences.

So we can render the page even if the network is down or while we're waiting for that initial request to be fulfilled.

But the client, it's never gonna be good at data fetching. We're always gonna be limited by that round-trip time. And again, to load code, we're having to touch the network. We're having to get those bytes to come down from the CDN or somewhere like that.

And finally, SEO is much easier to do for the server. And so what we want is to develop an architecture where we can do all of that stuff that the client is good at doing on the client and try and avoid all that stuff that it's bad at doing.

And I'd love to be able to tell you that there's a single way that we do this, it's just a bit we can flip and everything kind of magically works.

But unfortunately, in engineering, things don't often play out nicely like that. So we're gonna have to tackle these one by one. And to do that, I'm gonna show you a few different techniques that we use. And to keep it simple, I'm gonna say let's develop a simple application.

So this is a profile page.

It looks kind of like a Facebook profile page. There's some navigation, some images and videos. And if I asked you to render this on the server, then you might build something that looks like this. So firstly, we just get the user information based on the request that came in and we'd render the header bar.

We'd then check if we're on the feed and if we are, we'll get the story information and render either the images or the video.

And then finally, we can get the comments and render the comment interface and the Like interface. And so if we're running this on the server, it's gonna be a fairly good experience.

Like, we're doing a few roundtrips, but the server can do that well.

If we were to take this code and run it on the client, which we can do because it's JavaScript, which works out nicely, it's not gonna go as well because firstly, we can't show anything until we've loaded that code upfront. So we're just gonna literally get an empty page. When we've finally loaded, parsed and compiled all of that code, the very first thing we do is we hit this get-user block, and that means going back to the network.

And we'll do that and then we can render the header, but then we have to go and get the stories. And then after we do that, we have to go and get the comments.

And then finally, after we've done all of these roundtrips, we can show the page.
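The slide code is not captured in the transcript, so the following is only a minimal sketch of the pattern being described. The data, the type names and the `roundTrip` helper are all assumptions made up for illustration; each call to `roundTrip` stands in for one network request from the browser.

```typescript
// Hypothetical data and helper names; each call to roundTrip() stands in for
// one network request from the browser to the server.
type User = { id: number; name: string; storyIds: number[] };
type Story = { id: number; text: string; commentIds: number[] };
type StoryComment = { id: number; text: string };

const users: Record<number, User> = {
  1: { id: 1, name: "Ada", storyIds: [10, 11] },
};
const stories: Record<number, Story> = {
  10: { id: 10, text: "Holiday photos", commentIds: [100] },
  11: { id: 11, text: "New video", commentIds: [101, 102] },
};
const comments: Record<number, StoryComment> = {
  100: { id: 100, text: "Nice!" },
  101: { id: 101, text: "Great" },
  102: { id: 102, text: "Wow" },
};

let roundTrips = 0;

// Counts a simulated network round trip, then performs the lookup.
function roundTrip<T>(lookup: () => T): T {
  roundTrips += 1;
  return lookup();
}

function renderProfile(userId: number): string {
  // 1. Nothing can be shown until the user record arrives.
  const user = roundTrip(() => users[userId]);
  // 2. The story IDs are only known once the user has loaded.
  const userStories = roundTrip(() => user.storyIds.map((id) => stories[id]));
  // 3. The comment IDs are only known once the stories have loaded.
  const commentIds: number[] = [];
  for (const story of userStories) {
    for (const id of story.commentIds) commentIds.push(id);
  }
  const storyComments = roundTrip(() => commentIds.map((id) => comments[id]));
  return user.name + ": " + userStories.length + " stories, " + storyComments.length + " comments";
}

const profileSummary = renderProfile(1);
```

In this sketch the three lookups cannot overlap, because each one needs IDs returned by the previous one. On a server those are cheap local calls; in a browser each one is a full network round trip.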

So this is not gonna be a good experience if we just take those patterns that work for server rendering and try and apply them to the client. If I map this out on the timeline, it would look something like this.

So the way we structured our code means we're doing a bunch of slow work in series.

And so we haven't developed for the strengths of the server and the strengths of the client.

So I'm gonna walk you through each of these kind of blocks here, and the first is this huge block of JavaScript preparation.

So we all kind of know that like loading JavaScript is expensive, right? We always have that network time.

So this is why we have things like tree-shaking and we have things like JavaScript minification. But it actually turns out that it's a little bit more nuanced than this because it's not just download time that we have to worry about.

So we actually also have to worry about the parse time, the compile time and the execution time of that JavaScript.

And so we need to do all of these things before our application can really start.

So if you have a module system like CommonJS or some kind of require system, the body of each of your modules, it's just JavaScript, right? And so you need to execute that before that module exists. And so that means if you have a huge dependency tree, that's a lot of JavaScript you need to execute before you can really even get into application code. And so maybe we could try and do some push from the server. We could embed our scripts in script tags in the HTML, or we could use H2 push to get them down to the client quickly.
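To make the point about module bodies concrete, here is a toy CommonJS-style registry. It is an assumption written for illustration, not a real module loader, and the module names are invented. It records which module bodies have to run before the application can start.

```typescript
// A toy CommonJS-style registry (illustrative only). A module's body, here
// its factory function, runs the first time the module is required.
type ModuleFactory = (req: (id: string) => unknown) => unknown;

const factories: Record<string, ModuleFactory> = {};
const moduleCache: Record<string, unknown> = {};
const executed: string[] = [];

function defineModule(id: string, factory: ModuleFactory): void {
  factories[id] = factory;
}

function requireModule(id: string): unknown {
  if (!(id in moduleCache)) {
    executed.push(id); // record that this module body had to run
    moduleCache[id] = factories[id](requireModule);
  }
  return moduleCache[id];
}

defineModule("leftPad", () => (text: string, width: number) => {
  let padded = text;
  while (padded.length < width) padded = " " + padded;
  return padded;
});
defineModule("renderFeed", (req) => {
  req("leftPad");
  return () => "feed";
});
defineModule("renderAbout", (req) => {
  req("leftPad");
  return () => "about";
});
// The entry module requires both renderers at the top level, so both bodies
// run even if only the feed is ever shown.
defineModule("app", (req) => {
  const renderFeed = req("renderFeed") as () => string;
  const renderAbout = req("renderAbout") as () => string;
  return { renderFeed, renderAbout };
});

requireModule("app");
const executedOrder = executed.join(",");
```

Requiring the entry module runs every body in its dependency tree once, in dependency order, before any rendering happens.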

But this really doesn't touch the parse and compile cost at all, and it's kind of avoiding the real issue.

We don't want to just make it faster if we can avoid that work.

I think Tim would approve of this slide.

He made this point that performance experts have to always have this slide somewhere in their deck. And the earliest versions of the web, they didn't ask how can we get all of the pages down to the client as fast as possible? They said, how can we get just the page that the user needs right now? So the problem with JavaScript is that our abstractions, they kind of force us to do more work than we need. So things like require statements and import statements, they're all synchronous.

So that means we have to resolve all of the functions before we can do anything.

So for example, if I'm on the Feed tab, I still need renderAbout to be resolved and initialized. Because if we don't and then we, for example, change tabs, we don't have a function to execute.

It's gonna be a runtime error.

So we need to do everything in advance.

And so there are some abstractions like AMD and System.import that let you asynchronously require modules.

But doing this kind of everywhere would be verbose, especially in JavaScript, where we love to have utility functions like our old friend left-pad here. Something as simple as doing some string manipulation might be calling into a different module, so we don't want to make every single function call asynchronous.
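A small sketch of why asynchronous requires are awkward for utilities. The `loadLeftPad` function is a hypothetical stand-in for an AMD-style or `System.import`-style call and simply resolves immediately; the other names are also invented for the example.

```typescript
// Synchronous utility: trivial to call from anywhere.
// Assumes `fill` is a single character.
function leftPad(text: string, width: number, fill: string = " "): string {
  let padded = text;
  while (padded.length < width) padded = fill + padded;
  return padded;
}

// Hypothetical asynchronous loader standing in for an async module import.
function loadLeftPad(): Promise<typeof leftPad> {
  return Promise.resolve(leftPad);
}

// Synchronous call site: one line, returns a string.
function formatCountSync(count: number): string {
  return leftPad(String(count), 5, "0");
}

// Asynchronous call site: the function, and every caller above it, now has
// to deal with a Promise instead of a plain string.
async function formatCountAsync(count: number): Promise<string> {
  const pad = await loadLeftPad();
  return pad(String(count), 5, "0");
}

const syncLabel = formatCountSync(42);
const asyncLabel = formatCountAsync(42); // a Promise, not a string
```

The asynchronous version spreads upward through the call chain, which is why it only pays off at a few coarse boundaries rather than at every function call.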

And so what we wanna do is introduce boundaries between the different parts of our application in just the places that it makes sense.

And so we use routing to do this.

And so this gives us the biggest win without kind of getting in our way.

And so there are a lot of great routing libraries out there. We actually ended up building our own and it looks a little bit like this.

So we have this matchRoute function where we have a bunch of routes listed, and each of those is a function call.

And this interface means that we can actually move the require statements that are needed for each of those routes into the branch.

And because we've done that, we can use JavaScript scoping to kind of eliminate one of the branches if we know that we don't need it.

So for example, if we're rendering the feed route, we don't have to load the renderAbout function until a later point in time.
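The internal router is not public and its slide is not in the transcript, so this is only a guess at the shape of the idea. The `matchRoute` signature, the route paths and the `requireRenderer` helper (a simulated `require`) are all hypothetical.

```typescript
// Hypothetical stand-ins: lazyModules plays the role of the module system,
// and matchRoute is a guess at the shape of the API described in the talk.
const loadedModules: string[] = [];

const lazyModules: Record<string, () => () => string> = {
  renderFeed: () => () => "<feed/>",
  renderAbout: () => () => "<about/>",
};

// Simulated require(): running a module body is recorded in loadedModules.
function requireRenderer(id: string): () => string {
  loadedModules.push(id);
  return lazyModules[id]();
}

type Routes = Record<string, () => string>;

function matchRoute(path: string, routes: Routes): string {
  if (!Object.prototype.hasOwnProperty.call(routes, path)) {
    throw new Error("No route for " + path);
  }
  return routes[path]();
}

function renderPage(path: string): string {
  return matchRoute(path, {
    // Each require lives inside its own branch, so it only runs on a match.
    "/feed": () => requireRenderer("renderFeed")(),
    "/about": () => requireRenderer("renderAbout")(),
  });
}

const feedHtml = renderPage("/feed");
```

After rendering the feed route in this sketch, only `renderFeed` has been loaded; the about branch is never entered, so its require never runs.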

And then we can actually go one step further and generate a bundle for each of these routes automatically.

So we actually have a build step that will walk the AST of all of our JavaScript code and it will find all of the references to matchRoute and then all of the references to routes, and then all of the dependencies for those routes so we can build up this big map.

And that means that when someone comes to our site and they say, hey, I'm gonna hit the about page, we can look that up in the map and we can say, well, that maps to the about route, and the about route will need these resources. And so we can just serve them those resources and nothing else.
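A rough sketch of the map that such a build step might produce. It assumes the static analysis has already been done and starts from a hand-written dependency graph; the real step walks the AST, which is not shown here, and every module and route name below is invented.

```typescript
// Assumed output of static analysis: which module each route's branch
// requires, and which modules each module requires. All names are made up.
const routeEntry: Record<string, string> = {
  "/feed": "renderFeed",
  "/about": "renderAbout",
};
const moduleDeps: Record<string, string[]> = {
  renderFeed: ["Story", "leftPad"],
  renderAbout: ["UserInfo", "leftPad"],
  Story: ["Comments"],
  Comments: [],
  UserInfo: [],
  leftPad: [],
};

// Collects a module and everything it transitively depends on.
function collectDeps(entry: string, seen: string[] = []): string[] {
  if (seen.indexOf(entry) !== -1) return seen;
  seen.push(entry);
  for (const dep of moduleDeps[entry] || []) collectDeps(dep, seen);
  return seen;
}

// Build the route -> resources map once, at build time.
const routeResources: Record<string, string[]> = {};
for (const route of Object.keys(routeEntry)) {
  routeResources[route] = collectDeps(routeEntry[route]).sort();
}

// At request time the server only needs a lookup.
function resourcesForPath(path: string): string[] {
  return routeResources[path] || [];
}

const aboutResources = resourcesForPath("/about");
```

Because the map is computed ahead of time, serving a request for the about page is a lookup, and nothing that only the feed needs is sent.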

And by doing this, we can minimize the amount of JavaScript that we need to download and then parse and compile and execute.

But we still have to worry about these data fetching roundtrips.

And so these exist because of those calls to get data. So we have to get the user before we can know which stories we need to fetch, and we have to get the stories before we can know which comments we have to get. And so as engineers, we can actually look at this and we can see that we're defining a series of relationships between these different data types. So we actually have context that the interpreter doesn't because we understand why we have these variables and what they mean, when the interpreter just sees them as meaningless integers and arrays.

But even with this context we have, we can't speed things up.

We can't go and get the comments without knowing what IDs we care about.

And so what we're actually doing is we're asking the client to do something it's bad at doing, which are those data fetching roundtrips.

So ideally, we can move this logic to the server. And now the easy way to do this would be to build an endpoint for our profile page and just move all of that logic across.

We'll call it the profile and stories and comments page. And we actually used to do this quite a lot, but we found that it just doesn't work in the long term and in practice, because as you add more products, more pages to your product, and as you add more versions, there's more and more code that you have to keep around in your server application and it just doesn't scale over time.

So we felt that, that abstraction or that boundary between the client and the server, it still wasn't quite right.

And so to help solve this, we built GraphQL. And what GraphQL lets you do is it lets you define the query that you want or the data that you want, and then you can send that to a server that will execute it and do whatever series of imperative steps are required to fetch that data.

And then it will give you this JSON response back at the end.

And if we take this and we put this in our application at the bottom here, it's not too different from what we had before, right? But what we've actually done is we've embedded that context about what comments are we talking about into our code. So here, it's really unambiguous.

We can say the comments are clearly the comments that apply to stories, and stories are the stories that apply to that user.
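The query on the slide is not in the transcript. As an illustration only, a query for this profile page might look something like the following; the field names are assumptions, not Facebook's actual schema.

```typescript
// Field names are illustrative assumptions, not a real schema.
const profileQuery = `
  query ProfilePage($userId: ID!) {
    user(id: $userId) {
      name
      stories {
        text
        comments {
          text
        }
      }
    }
  }
`;

// The JSON response mirrors the shape of the query, so the nesting itself
// says which comments belong to which story and which stories to which user.
type ProfileResponse = {
  user: {
    name: string;
    stories: { text: string; comments: { text: string }[] }[];
  };
};

// A hand-written example of what the server might send back.
const exampleResponse: ProfileResponse = {
  user: {
    name: "Ada",
    stories: [
      { text: "Holiday photos", comments: [{ text: "Nice!" }] },
      { text: "New video", comments: [{ text: "Great" }, { text: "Wow" }] },
    ],
  },
};

function countComments(response: ProfileResponse): number {
  let total = 0;
  for (const story of response.user.stories) total += story.comments.length;
  return total;
}

const totalComments = countComments(exampleResponse);
```

The client states the shape it wants once, and the user, story and comment lookups that produce it all happen on the server side of a single request.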

And so what we've done here is we've replaced a series of imperative JavaScript statements with a GraphQL, sorry, a declarative interface, which is the GraphQL query.

And declarative interfaces, they're great, right? We have a lot of declarative interfaces in the web world. We have HTML, we have CSS, all of this kind of stuff. And they'll allow us to separate the what and the how. So for our application, the what is the data or the shape of the data that we need.

And the how is how we actually go about fetching that, which is a series of database lookups.

And this allows us to define the query on the client but then execute it on the server.

And so we've implemented this boundary between the client and the server in the form of a GraphQL query. And by doing that, we can actually remove all of those extra roundtrips by just doing one round-trip to fetch that data.

But we can actually do better than that.

We wanna remove all of the roundtrips that are part of our application startup.

So how do we go about doing that? So if we look at the code where we now have that GraphQL query, we can see that we have context again. We can see, okay, obviously, what we're gonna do when we execute this code is we're gonna go to the network. But the client and the interpreter, it doesn't know this. So what it needs to do is it needs to ask the server for the HTML page, load all of the JavaScript, start executing it.

It'll get to that very first statement, that GraphQL query and it'll say, you know what, I have to go back to the server again.

So there's really no way to avoid this, right? We have to have the code loaded so that we can execute it. So what we're actually blocked on here is loading that code. And so this is something that the server is good at doing because obviously, when a request comes in, the server's loaded up all of the application code. And if we, like, happen to refer to a module that we haven't loaded yet, we can literally just get it off disk.

So it's not an expensive operation.

So we wanna run just the data fetching parts of our application on the server.

Sorry, I'm a little lost here.

Oh, we're all good.

So we wanna run just the data fetching parts on the server, and we actually have a way of doing that because we're using these GraphQL statements. And so we actually have a second build step. So it, again, walks our AST and it looks for all of the references to GraphQL queries.

And then we can construct a map of the dependencies to the GraphQL queries that will execute.

So when someone comes in and hits our site, we look at the path they're requesting, we figure out which route that maps to, we figure out which dependencies are required for that route, and then we figure out which GraphQL queries will be executed as part of those dependencies. And so we can take those queries and then execute them on the server in advance.

And then what we'll do is return that as part of the initial HTML response.

So what we tell the client when we're sending them the HTML response is, hey, here's all of the JavaScript that you require, and then when you're ready, when you initialize your application and you're good to go, here's all of the data that you'll need to actually run your application.
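A minimal sketch of that idea, under stated assumptions: `routeQueries` stands for the build step's output, `executeQuery` is a hypothetical stand-in for a GraphQL server call, and `window.__PRELOADED__` is an invented global name rather than anything Relay defines.

```typescript
// Assumed build output: the GraphQL queries each route will run.
const routeQueries: Record<string, string[]> = {
  "/feed": ["query Feed { user { name stories { text } } }"],
  "/about": ["query About { user { name } }"],
};

// Hypothetical stand-in for executing a query on the server.
function executeQuery(query: string): unknown {
  return { query: query, data: { user: { name: "Ada" } } };
}

// Escape "<" so that data containing "</script>" cannot end the script tag.
function serializeForScript(value: unknown): string {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

// bundleUrls are assumed to be trusted, server-generated paths.
function renderInitialHtml(path: string, bundleUrls: string[]): string {
  const preloaded = (routeQueries[path] || []).map((q) => executeQuery(q));
  const scripts = bundleUrls
    .map((url) => '<script src="' + url + '"></script>')
    .join("");
  return (
    '<!doctype html><html><body><div id="root"></div>' +
    "<script>window.__PRELOADED__ = " + serializeForScript(preloaded) + ";</script>" +
    scripts +
    "</body></html>"
  );
}

const initialHtml = renderInitialHtml("/about", ["/bundles/about.js"]);
```

When the client code starts, it can read the preloaded results instead of issuing its first query over the network.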

And so we call this preload mode.

And we actually found that having a parse step to extract all of this information in advance is both powerful because we don't have to do runtime execution.

But it's also really flexible.

So we actually have a really mature PHP stack. We have a variant called Hack, and we run this on all of our web servers.

And because we've already extracted these queries in advance, we don't even need to run JavaScript on the server if we don't want to.

We can actually have all of this functionality where you're executing the queries that are defined in JavaScript without having to run JavaScript. We can do it all in Hack.

And so by doing this, we can eliminate that round-trip that's part of our application startup.

And so obviously there's more techniques.

Things like service worker can massively reduce the amount of JavaScript we might need to load or reduce the overhead of that initial HTML response.

But they've been covered pretty well in this conference so I'm gonna leave it here today.

And so this is the final state of our application. And now we've been using these abstractions for a while now, so in our ads code base, we actually have tens of thousands of JavaScript components and we use routing to minimize the amount of overhead and know exactly what we need to load on startup.

And on the news feed, which is the main page you get when you go to Facebook, we use React to render the comments, the chat sidebar and the interface where you create new posts.

And all of the data that we need to render those components, we return as part of that initial HTML payload. And so we know that these ideas work and they massively speed up our page load time, but they only really help us if engineers use them. So in the old days, if someone created a JavaScript application or a JavaScript component and they got performance wrong, then there wasn't a whole heap we could do to kind of speed it up.

So things like loading code and loading data, they're so core to how applications work that you can't really retrofit a solution after the fact. You're kind of stuck with what you got in the first place. And so to help with that, we built a framework called Relay. And so it combines GraphQL and React.

And with our internal setup of Relay, we give you all of the things that I've shown you today. You have routing, you have GraphQL, you have preload mode.

And we found that, that makes it much easier to do the right thing.

So we can have engineers build performant JavaScript applications without having to be JavaScript performance experts.

We've actually had people come up to us and they said, "Hey, we built a new product and it's 20% faster than the old one." And this is pretty awesome, right? But the even better thing is that we didn't know that they were building it in the first place. We didn't have to try and lead them toward the best solution or coach them on like all of this esoteric JavaScript they had to write to get things right.

They just naturally found the right solution. And so this is the ideal that we've been working towards. And we feel good about these abstractions and we feel like they belong on the web. We saw back at the beginning that the web server was created to solve this problem of getting the right information at the right time. No matter how big your website is or how many bundles you have in an application, we can load just the parts we need to get started. And this allows our applications and our websites to grow over time without slowing down the initial page load.

And to do this, we had to understand the strengths of the server and the strengths of the client. So we don't have to choose just one of these. They're different environments, they have different strengths, different weaknesses and they should be used for different things. And to get this right, we also had to nail that interface between the server and the client. Having the right interface is what lets you move application logic around or, even better, break it up and do different parts in different environments.

So routing, it let us load our application without having to load all of the JavaScript upfront. We could defer stuff until later.

And GraphQL, it let us ask for data without having to worry about round trips.

And preload mode, it gave us data without us even having to ask for it in the first place.

So for a client application, this is like magic, right? Who doesn't want just the free data provided to you? And so we really love the development experience that you get with JavaScript in the browser. You can create these really responsive and rich applications that you can access from anywhere. And by using the server in smart ways, we're able to let the client do more of that stuff that it's great at doing.

So the next time you're creating a web application or sorry, client-rendered application, I want you to think about more than just the HTML and JavaScript in your code base.

I want you to think of this as a platform where you get these two different but powerful environments that you can use.

So this is really rediscovering what the server's there to do.

It's there to empower the client and give the client what it needs to get started.

Because together, the client and the server, they form this platform, and it's a platform that gives us as engineers the tools we need to create fast applications. Thank you.

(applause) Thanks, thanks.

- Thanks so much, Josh.

We might have time for a question or two before we break for lunch.

Somebody got something for Josh or Yoav or (mumbles)? All right, Rob over here.

Sorry.

Oh.

- Hello, that was really interesting.

Thanks for doing the talk.

A question about the, so you've got what the user interface looks like sitting in the JavaScript itself.

So you've got initially nothing on the page and then basically JavaScript's gonna come vertically in the display.

Am I correct with that, just to start with?

- Sorry, I missed the first part, what you said then.

- Okay, got it.

So when you initially hit, say, the about HTML page, for example, is there anything rendered on it initially? Is there any server-side provided stuff like a frame, like some of the stuff mentioned yesterday?

- Good question.

So we actually do a mix.

Traditionally, we've used a lot of PHP for server side rendering, and then we'll tend to put React code on top of that.

So we will augment the frame that's generated from PHP. We're actually moving away from that because you tend to end up with the worst of both worlds. You have maybe like a couple of seconds of PHP execution and then you have a couple of seconds of React execution after that.

So what you want to do is kind of like try and parallelize those so that you're doing them both at the same time, and having client-first execution actually helps with that.

So the server is just streaming in, say, like data or dependencies to the client, but it's able to execute as fast as possible. - Okay.

Some other questions for Josh? Oh, here we are.

That was Neil (mumbles), yup.

That was probably Bill.

(chuckles) - Hi Josh.

I'm interested in the way you split up the code to just download what you need at the beginning. - Yes.

- How much of that is a manual process or how much of that is now completely automated by your-- - Completely automated.

So we've had a build system for a while that basically analyzes how people use Facebook, and it will check what resources are loaded as part of the page load, as part of the page lifecycle. And then we use machine learning.

It's less fancy than it sounds but it will basically generate the optimal package configuration across all of Facebook.

And so what we do when you come to Facebook, we actually determine which dependencies you need, and then we figure out which bundles are required to fulfill those dependencies.

- Are any of those cached in the client, or does it always download a new bundle when it goes to the site?

- We use the browser cache where possible.

So we do that methodology of just like hashing based on the contents.

And so if you download a resource and you ever need it again, hypothetically, the browser cache will fulfill that.
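A small sketch of content-based file naming. FNV-1a is used here only as a short, dependency-free stand-in; a real build would more likely use a cryptographic hash. The file names and bundle contents are hypothetical.

```typescript
// FNV-1a over UTF-16 code units, returned as 8 hex digits. This is a
// stand-in chosen for brevity, not the hash any particular build uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// The URL changes only when the contents change, so an unchanged bundle
// keeps its URL across releases and can stay in the browser cache.
function hashedFileName(base: string, contents: string): string {
  return base + "." + fnv1a(contents) + ".js";
}

const releaseOne = hashedFileName("feed", "function renderFeed() {}");
const releaseTwo = hashedFileName("feed", "function renderFeed() {}");
const releaseThree = hashedFileName("feed", "function renderFeed() { /* changed */ }");
```

In this sketch `releaseOne` and `releaseTwo` are the same name, so shipping a release that did not touch the bundle does not invalidate it, while `releaseThree` gets a new name.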

- So if you went to a page that then reused half the same components, would it download those components again in the new bundle, or would it get a bundle with just the bits it was missing?

- No.

It's a one-to-one mapping between, every module only exists in one bundle.

And so that's why we wanna actually get the optimal bundling configuration, because if you happen to require something because you have one module that you care about out of, say, 20, that's not a great hit rate, right? You're downloading 19 modules that you don't need.

- Right, so one page can load multiple modules--

- Yeah, we also do the kind of async loading. So we just load the subset of the page that's changed and leave the chrome static.

So as you browse around within a given session, you have the optimal kind of experience.
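The hit-rate arithmetic from that answer, worked through with a made-up bundle layout in which every module lives in exactly one bundle. The module and bundle names are hypothetical.

```typescript
// Hypothetical layout: 20 modules in bundleA, one module in bundleB.
const bundleOf: Record<string, string> = {};
for (let i = 1; i <= 20; i++) bundleOf["module" + i] = "bundleA";
bundleOf["profileHeader"] = "bundleB";

// The distinct bundles needed to satisfy a list of modules.
function bundlesFor(needed: string[]): string[] {
  const result: string[] = [];
  for (const moduleId of needed) {
    const bundle = bundleOf[moduleId];
    if (result.indexOf(bundle) === -1) result.push(bundle);
  }
  return result;
}

// Fraction of downloaded modules that were actually needed.
// Assumes `needed` lists distinct modules that exist in bundleOf.
function hitRate(needed: string[]): number {
  const bundles = bundlesFor(needed);
  let downloaded = 0;
  for (const moduleId of Object.keys(bundleOf)) {
    if (bundles.indexOf(bundleOf[moduleId]) !== -1) downloaded += 1;
  }
  return needed.length / downloaded;
}

const oneOfTwenty = hitRate(["module7"]);
```

Needing one module out of a 20-module bundle gives a hit rate of 1/20, which is the case described above: 19 modules downloaded without being used.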

- Cool, thanks. - [Josh] Cool.

- Any more? Come on, there's gotta be more questions for Josh.

- We've got one over on the right here.

- Oh right over here, oh.

Is it Jeremy? Yeah.

Sorry, I can almost see, sorry.

- Great talk.

I'm wondering about error handling with GraphQL. If, for example, the stories endpoint had an error, how do you return that error in the response?

- So this isn't really defined by GraphQL.

I think it's application-specific.

One of two things will happen.

One is that you could invalidate the entire request. Generally, we only do that if it's a catastrophic failure, like the server is overextended and we can't fulfill the request at all.

Typically, what we'll do is we'll null out the parts of the tree that we can't request for you. And so what you can actually do with GraphQL is you can request really anything.

You could say, for the story, give me the videos and the images and the 3D content and the audio and the, I don't know, whatever else, whatever content types.

And if it just happens to not exist, it's just like, well, you requested something that doesn't exist, there's just no key on the response that corresponds to that.

And so we can use the same thing when there's a story that should exist but we can't actually fulfill it. We can just omit it, and you're still gonna get a good experience because we'll show you the rest of the page or the rest of the application that we can actually give you.
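A minimal sketch of a client coping with a partially fulfilled response. The response shape and names are assumptions for illustration, with `null` marking the part of the tree the server could not fetch.

```typescript
// Assumed response shape: a story the server could not fulfill is null.
type StoryData = { text: string };
type FeedResponse = {
  user: { name: string; stories: (StoryData | null)[] };
};

function renderFeedLines(response: FeedResponse): string[] {
  const lines = ["Profile: " + response.user.name];
  for (const story of response.user.stories) {
    if (story === null) continue; // skip what the server could not provide
    lines.push("Story: " + story.text);
  }
  return lines;
}

const partialResponse: FeedResponse = {
  user: {
    name: "Ada",
    stories: [{ text: "Holiday photos" }, null, { text: "New video" }],
  },
};

const renderedLines = renderFeedLines(partialResponse);
```

The page still renders the profile header and the two stories that arrived, rather than failing the whole request because one story is missing.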

- Any more out there? So I've got one, I guess.

Obviously, over the last 18 months or so or maybe less, last 12 months, there's the idea of the progressive web app: obviously service worker, whether you're using that to cache or whether you're kind of building your own caching with IndexedDB, and enabling or encouraging and allowing users to install on the home screen.

Are these things that you and your team have been thinking about, exploring? - We're really excited about service worker. Like, there's great potential there.

So I kind of alluded to this before, but like if we can load from service worker and have that offline-first experience, like, yes, you get that offline experience but like the better part is that you're actually starting from time zero.

It's like you get an extra 500 or 1000 milliseconds to actually do whatever JavaScript execution you want. And like if you're looking at your total page time being three or four seconds, then like getting an extra second is like Christmas, right? You know, free time is awesome.

So that's super exciting.

The thing about the app shell model is we're less excited about that because people don't come to Facebook to look at that blue bar that is at the top of the page. Like, as pretty as it is, it's like they come to actually see the content that people are posting.

So like we still have to optimize that code path to get that content down as soon as possible, and that's always gonna come from the server because the client can't cache content that it doesn't know about.

- I guess then the idea of background sync that obviously Marcos was talking about yesterday as well, which is not really there yet but is certainly scoped out, so would you think to sync in the background--

- Yeah, so--

- Which is how Facebook works as a native--

- Exactly, yeah.

So that's pretty exciting.

So there's two parts of it.

Background sync, I think the thing that's specced out at the moment is for uploads so you can upload a media object without having to keep your browser open.

As to having push notifications when we say oh your friend posted some new content and we send that down to you, that's pretty exciting, but there's some limitations at the moment. The browser vendors are really being quite conservative about what they let you do with push notifications because if you imagine a hundred different sites all sending you push notifications and constantly waking up your service workers, it would be a bad experience and it would drain your battery.

So they're starting off by being very conservative and then opening it up as they kind of see how people are using it.

But that's totally the way that we're gonna be going in the future, is that like we will be pushing new content so that if we know that you're gonna load Facebook, we'll be able to show you content straight away and then incrementally load more stuff from the server after that.

- Right.

And I guess, or even just caching the most recent content that someone's had, so when they reopen, just caching that-- - Yeah.

- Pulling that in before you even have fetched new content-- - Yeah, and that's a great difference, especially when you have chronological ordering, which Facebook famously does not have chronological ordering.

So you've got to be a little bit cautious about, do you want to show something from yesterday if they've already seen it and they didn't like it. So there's lots of kind of business concerns or product concerns that go into whether you wanna do that. - But you are definitely, like, these are the things you-- - Totally, yeah.

We're experimenting with service worker.

It's super exciting stuff.

- Fantastic, all right.

Any more questions from folks? Come on, give me one more.

I challenge you for one more.

(Josh chuckles) Looking at the clock, they're getting hungry. Right up the front here, and then we'll break, I promise. And you'll see what we've got in store for lunch. That's right, is we (mumbles) reputation's, you know, "There's gonna be great food we're gonna get today." But anyway, final question.

- I was just wondering if any of those static parsing libraries are available at all, or is it all just behind closed doors? - So we use Babel.

So it's amazing and it's awesome.

So I believe that's what we use for implementing most of this.

Our dependency management system is very, very coupled to our code base, and it wouldn't really make sense for us to open source it.

Like, I think, webpack does a pretty good attempt at solving all of that kind of stuff.

- Seb's working on something though, I was talking to him at dinner last night.

- Seb's working on, like, 10 different things.

- Right, something around like NPM style or a replacement NPM.

- That's a little bit different.

So that's less about mapping dependencies and more about solving the problem of like if you have an NPM client that will do arbitrary JavaScript execution or sorry, arbitrary anything execution, you probably don't want to do that inside Facebook's corporate network.

(laughs) So we try and avoid that.

- Fair enough.

And hang on, I think I've derailed the answer to the question, or have we got there?

- So we used Babel.

We're starting to use more and more open source stuff and also like bringing it inwards.

Like I think traditionally, we've been fairly, like, we'll push out open source stuff but like, we won't take a lot back.

But with Babel and that kind of stuff, it's been super useful.

And there's also like, what is it, Esprima and jscodeshift.

And so that's more manual-based refactoring and AST exploration, but we're starting to use more of the open source stuff and having more collaboration with the open-source community.

- Thank you.

- Awesome, all right.

Let's, once again, thank Josh and Yoav.

Great session this morning.