Compiling for the Web with WebAssembly
(upbeat music) - Alright, nice to be here.
Having a good day so far? You're not out in the boiling heat outside, it's good to see! (chuckles) Alright, so a quick round of questions.
Who's written C or C++ as part of their day job or a hobby? Quite a few, okay, excellent.
Anyone written any Rust? A few people, oh, this is a great crowd! Okay, well, today I'm just gonna try and let you leave the room with a bit of a mental picture of what this WebAssembly thing is, how you should start using it, why you should use it, and the kind of tools that are out there to do stuff with. I've got a couple of demos, stuff that I've made, a couple of things that other people have made. And we'll talk a little bit about the future, where WebAssembly is heading. What's shipped today is what's called a minimum viable product, so it's kind of the smallest thing all the browsers could ship that everyone could agree on. So the future is stuff that people are still arguing over.
Okay, so what is WebAssembly? That's probably the first question.
It's a new binary format for executable code. So, can you imagine, like you have your laptop or your desktop workstation? All the applications you run on that, like your web browser or your editor or an IDE, whatever, they're compiled down to native code.
And that code's binary so it's like X86 instructions or could be ARM instructions on another device like an iPad Pro or a Chromebook.
So what WebAssembly is is those kinds of instructions. So the actual adds, multiplies, loads and stores, it's kind of an instruction set for an imaginary computer. So the instruction set is designed to look like a virtual machine, a stack-based virtual machine.
So does anyone know what a virtual machine is? Quite a few, awesome! (chuckles) Okay, so when you have virtual machines, basically, it's like this imaginary computer and you compile your code to the instruction set for this imaginary computer and then the WebAssembly runtime takes that instruction set, converts it to real instructions using a JIT or Just-In-Time compiler and then your programme runs really fast.
So that's kind of the basic little foundational piece. Now, this is also a specification that's being developed in the W3C, so people all know what the W3C is, yes? Awesome, World Wide Web Consortium.
Yeah, so that's basically the forum with which all web standards are kind of blessed and then eventually end up in browsers or end up dumped on the floor, depending on the spec. In this case, the WebAssembly spec was designed in what's called a community group.
So, I've done stuff with community groups.
Has anyone here participated? Anyone know what they are? One hand, no, nobody! Cool.
You know what they are? Anyway, so for the people that don't know, when you develop a specification, something like say the HTML spec, it goes through what's called a working group which is a formal group.
They all sit around the table, they throw paper planes at each other and argue over minutiae of the spec.
Now, it's a really horrible process to engage in, it's very boring and dry and technical.
And so the W3C came along with this idea of a community group and anyone in this room can start a community group.
So I could say, "Damon, you're seated in the front. Right, you and I should write the next 3D graphic spec thing for the web community group." And we can just participate.
We don't have to sign up anything, anyone can join in and that's the kind of environment that WebAssembly was developed in.
So we had people from Apple, from (chuckles) Microsoft, from Mozilla, everywhere, all in there, arguing, doing all sorts of stuff and then when the spec was eventually developed, it graduated into a working group for a final, proper formal specification.
But the beautiful thing about this stuff and the reason you are all here is because it's out. It's actually something that's available now in all major browsers and this is an amazing achievement. It's very hard to imagine that all four browser vendors or four of the major browser vendors got the same thing out at the same time and spoke to each other.
That's unprecedented in W3C history.
So when we say it's in all major browsers, it's in Chrome, it's in Firefox, it's in Edge, it's in Safari.
But more importantly, as web developers, if you're sitting here thinking, "Should I use this technology?" The overwhelming answer is yes, 'cause it's out there and more importantly, Chrome on mobile and iOS Safari both support it as well, so you're talking about two billion portable devices out in the field that run this stuff.
So you need to seriously consider it if you wanna build things.
Now what it actually is, it's a compilation target for other languages.
And so that's what WebAssembly has targeted. So it's meant for writing C or C++ or Rust, and compiling those strongly typed languages into this binary format and the primary reason for doing that is high performance.
Now, it's the logical successor to asm.js and Native Client. Does anyone know what these technologies are? Few people, okay, I'll give you a little quick run through of the beginning.
Actually, I'll tell you an anecdote of what really happened in my life in about 2003, 2004.
Anyway, fast forward a few years and there's a crazy guy.
Well, maybe he's not crazy, he might be a nice guy. Alon Zakai from Mozilla who built a compiler to do exactly that.
It's called Emscripten.
So it was a complete disaster.
And this worked great for Firefox, absolutely brilliantly. But the other browser vendors weren't so keen on it. And so, in Chrome's case with V8, we were like, "It's nice and all but we don't believe that that's the future, necessarily.
We're gonna build this thing called Native Client." And probably one or two people have maybe used Native Client in this room if you're lucky.
But what Native Client was was basically X86 machine code in a file.
But that X86 machine code had special instructions to guard the memory region so that pointers and things couldn't go out in the memory and corrupt things and cause all sorts of attack vectors so Native Client ran a bit slower than true X86 code, roughly 5% slower.
And so after we had Native Client, people started shipping laptops and things with ARM chips in them so we built this thing called Portable Native Client.
Now, Portable Native Client was almost like an independent language for different architectures. For the technical people, it's LLVM bitcode. So it's basically halfway through the compiler step. Like the compiler goes from source code to intermediate code to machine code.
So we would ship this intermediate code as a bundle and then the client browser would actually convert it to final machine code.
Now, that took us years to ship and the main reason is it was a really slow way to do things, to actually get started, because the time to go from the intermediate code to nicely optimised machine code was quite slow, and that part of the compiler, again, was then vulnerable because it could have memory leaks and things like that.
So it had to run inside a NaCl sandbox, so you had this sandboxed compiler. I mean, it's just an architectural nightmare and funnily enough, no other browser vendor wanted to touch it with a bar of soap. (chuckles) So, the teams that built Native Client and Portable Native Client and the teams that built asm.js got together and started working on WebAssembly and that's what we have today.
So what is WebAssembly? It's not web. (chuckles) Okay? This is kind of very ironic that they built this thing, they designed this virtual architecture, if you like, this instruction set, but there's absolutely nothing tying this to the web. It's literally just a binary format for executing programmes. It has no web APIs, it can be hosted anywhere. So for example, Node.js supports WebAssembly so you can write C or C++ and run it in Node.js. Who here used to write cgi-bin programmes? Heaps of people, awesome! We used to write those in C! And so you could compile them and stick 'em in cgi-bin and magic stuff would happen on the website. And that all went away, ASP.NET came along and Java and all these things and Node.js 'cause it's easier, and now we've come full circle 'cause now you can take those old programmes of yours, compile them into WebAssembly and run 'em, courtesy of Node.js.
It's marvellous, isn't it? Everything old is new again.
And the other thing is, it's not an assembly of donuts, and it's not an assembly language either.
So when you say WebAssembly, you think of assembly languages and assembly languages are like little words, add blah-blah-blah, register one, register two or whatever. Well, it's a binary format.
There's actually no assembly involved anywhere in WebAssembly.
There are text formats so you can kind of decompile your binary into something that you can read. It looks vaguely like assembly language but it's really not assembly.
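To make "it's a binary format" concrete, here is a minimal sketch that assembles a tiny module by hand: one function that adds two 32-bit integers, exported as "add". The helper names (section, name) are made up for this illustration, and real toolchains such as Emscripten emit far more than this. The sketch relies on every length being below 128, where the format's LEB128 length encoding is a single byte.

```python
# Hand-assemble a minimal WebAssembly module: (i32, i32) -> i32, exported as "add".
# Helper names are hypothetical; this is an illustration, not a toolchain.

I32 = 0x7F  # type code for a 32-bit integer


def section(section_id: int, payload: bytes) -> bytes:
    """Encode one section as: id byte, payload length, payload."""
    assert len(payload) < 128  # single-byte LEB128 only in this sketch
    return bytes([section_id, len(payload)]) + payload


def name(text: str) -> bytes:
    """Encode a name as: byte length, UTF-8 bytes."""
    encoded = text.encode("utf-8")
    assert len(encoded) < 128
    return bytes([len(encoded)]) + encoded


# Every module starts with the magic bytes "\0asm" and a version number (1).
header = b"\x00asm" + (1).to_bytes(4, "little")

# Type section: one function type taking two i32s and returning one i32.
type_section = section(1, bytes([1, 0x60, 2, I32, I32, 1, I32]))

# Function section: one function, using type index 0.
function_section = section(3, bytes([1, 0]))

# Export section: export function index 0 under the name "add".
export_section = section(7, bytes([1]) + name("add") + bytes([0x00, 0]))

# Code section: no locals; local.get 0, local.get 1, i32.add, end.
body = bytes([0, 0x20, 0, 0x20, 1, 0x6A, 0x0B])
code_section = section(10, bytes([1, len(body)]) + body)

module_bytes = (
    header + type_section + function_section + export_section + code_section
)
print(module_bytes.hex(" "))
```

The whole module is a few dozen bytes. The text format is just a readable rendering of the same structure.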
On the WebAssembly side, it's heap.
It's heap and stack and everything.
Then underneath, you have the actual run time, the virtual machine itself.
What it has in here is a few interesting things. Now, on the right, you can see there's this thing called the execution stack.
In a normal programme like on your laptop or workstation or whatever, there's a stack that contains all the function calls and they're called frames, that's where the frame pointer is. So when you have a debugger or when you have a malware scanner, or anything like that, quite often you walk through the memory, you look at all the function calls.
In fact, if you're in DevTools or something, you can see your stack frames.
That's invisible to the WebAssembly programme. Now, that's for a specific reason.
If you can see the function call stack, you can change it and manipulate it and that's a brilliant malware vector.
And so one of the safety features of WebAssembly was disallowing you even seeing that.
So that's kinda hidden in the VM.
So this is a kind of isolation layer.
So when you compile a C or C++ to WebAssembly, you'll have something like this.
You'll have the size of the memory, you have the addressable stack and the heap, and what's interesting here is that everything's 32-bit at the moment.
So since that block of memory is a fixed size, all of the offsets inside the WebAssembly executable itself are just indexes.
They're not addresses.
Another really good use case is supporting image formats that aren't supported in a particular browser. A great example of this: there's a guy, Kenneth Christiansen, who's in Europe, and he actually wrote a WebP to BMP converter that runs inside a service worker, and ran it in Firefox. Since Firefox didn't really support WebP at the time, you can literally throw a WebAssembly module inside the service worker, the web page can just transparently fetch the image it thinks it wants, and you can send WebP over the wire, save a whole lot of network bandwidth and do it all magically in the service worker, all hidden under the hood.
So that's a great use case.
Probably a bigger use case and this is far more relevant, probably to everyone in this room is the existing C and C++ codebases that exist in the world. Like if you go out to GitHub and places like that, there would literally be millions of man hours, of code just sitting out there ready to be pulled into the browser.
And if you think about this for a while, you can kind of (mumbles) and go, "Hang on, actually, that makes sense." Well, my company builds a mobile app, like a native iOS or Android app or whatever.
And it's like, "Hang on, we can actually take that logic and cross-compile it and run it in the browser and that would be really cool." And a really good example of this, we showed AutoCAD running in the browser.
AutoCAD is from this company Autodesk and that code was written 30, 35 years ago. It's a really old codebase with millions of, millions of (chuckles) hours probably poured into it. Now, they managed to take their AutoCAD code, 30-something-year-old code, cross-compile it, and run it in WebAssembly in the browser, and so they have AutoCAD on the web now.
And one of the things they said was really cool about this was that they could fix bugs, so if they fix bugs in the desktop application, they just recompile it for the web and it's one codebase running everywhere.
It's just a great portability story.
And so, it's not just C, C++ and Rust.
These are the kind of main languages that are supported now but a lot of other languages are coming along. The official Go release, version 1.11, has a Wasm port, there's Kotlin, there's people who have ported Mono, so there's all the .NET languages, C#, et cetera. There's a project Microsoft has called Blazor where you write in C#, you do your whole user interface in this kind of C# template in HTML thing and bang, you have this .NET in the browser thing. It's amazing.
Other things that kinda get to be weird is blockchain. Okay, so these people came along to find something that is really cool to run in the browser.
But there's always crypto people that are wanting to do fancy things.
And as it turns out, because the geeks that designed WebAssembly were real language heads and designed it to be mathematically provable and safe, the blockchain people came along and said, "Hey, that's what we could use for some of the stuff." There's actually a project that lets you run WebAssembly programmes on top of the Ethereum smart contract thing. There's also a few projects that I've seen where they're doing distributed computations. So the company's basically about selling compute time to you, and so you compile the programme you wanna run, you say how much money you wanna pay to have that programme execute and do the work, and they send it out over the internet somewhere and you don't know what machine it's gonna land on, so they just compile to native code, execute it, send the results back to you and collect the money. So some very interesting places it's kind of turning up now.
And so when you wanna start playing with this stuff, it's really, really basic.
You literally install an SDK on your computer called Emscripten, the Emscripten SDK.
And it just takes something like hello.c, like we'll say a programme hello.c that does something. You compile it with emcc, which is the C compiler for Emscripten, and you do a -o for the output, then hello.js! And it's that simple.
You can even see a flaw with this slide. (chuckles) What's wrong with this slide, people? Nope, well what's the .js for? (chuckles) We're compiling WebAssembly, right? What actually happens here is the compiler outputs two files.
It outputs a .js file and it outputs a .wasm file. And what the .js file is is all the glue logic to load the module for you, to marshal all the variables in and out of the functions, to do all the magic that actually makes the module run. And so from your point of view as a web developer, you just literally do script src equals hello.js and the magic happens under the hood, and you never know what was going on underneath. If you wanted to write this stuff yourself, you could. You can actually explicitly write this stuff: WebAssembly.instantiate, pass in the module bytes. You can do all this magic but we highly recommend you just use Emscripten because things change, APIs change, bugs are fixed, it's a lot easier to let Emscripten do the work for you.
And very soon, Wasm modules will be part of the ES module spec.
Now, a few little details about the (mumbles). WebAssembly modules can only access things via their imports and exports.
And so what that means is that, A, you need to define your API to the module at compile time.
And the converse is the same.
The name "Module" itself is kind of a nomenclature thing.
For the WebAssembly modules, Module is a global.
Module has the exports as a property and then the functions hang off the end.
So it's not like an RPC mechanism where you throw out a call and then something comes back across the wire. They're actually first class citizens working closely with each other, hence you get nicely interleaved stack frames. You can see that in DevTools, too.
So just a few of the bits of terminology to remember. Every WebAssembly binary is known as a module. And also, at the moment, modules can only be roughly two gigabytes in terms of memory consumption. The actual binary can be 512 meg in Chrome, at least. The heap can be two gig, but you can have multiple modules on a page, so if you have a 64-bit machine, you could theoretically have a bunch of two gig blocks and chew all your RAM up if you really wanna do that. (chuckles)
Okay, so the actual format itself, there are three really important design decisions that were made in doing the format, so the first thing is streaming.
And what that means is that the format is designed to be streamed over the network and the streaming aspect of the design means that you can compile to machine code as it's coming over the network.
So if I'm doing something big, if I'm running like a Unity 3D game, and I've got 10 megabytes say of WebAssembly to get over the wire, it can come streaming over to me.
I can be compiling to machine code as it's arriving. So the moment, the network has finished, it's pretty much ready to run.
So that's one of the top features.
Now, in order for streaming to work, the validation has to be single pass as well. That's the design of the validation, because the instructions have to be checked. You wanna make sure that things don't go out of the bounds of the memory, et cetera.
And so it's validated on the fly.
So it streams in, validated in one pass, spits out machine code.
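As a rough illustration of what "validated in one pass" can look like, the sketch below walks an instruction stream exactly once and tracks only the types sitting on the operand stack. It is a simplification under stated assumptions: the SIGNATURES table and the validate function are hypothetical names, only four instructions are modelled, and real validation also covers control flow, locals, function calls and more.

```python
# Single-pass operand-type validation for a tiny, made-up subset of instructions.
# Each entry maps an instruction to (types it pops, types it pushes).
SIGNATURES = {
    "i32.const": ([], ["i32"]),
    "f32.const": ([], ["f32"]),
    "i32.add": (["i32", "i32"], ["i32"]),
    "i32.load": (["i32"], ["i32"]),
}


def validate(instructions, result_types):
    """Return True if the stream type-checks and leaves result_types on the stack.

    `instructions` can be any iterable, including a generator fed from the
    network, because each instruction is inspected once and never revisited.
    """
    type_stack = []
    for op in instructions:
        params, results = SIGNATURES[op]
        # Pop operands in reverse order and check each against the signature.
        for expected in reversed(params):
            if not type_stack or type_stack.pop() != expected:
                return False
        type_stack.extend(results)
    return type_stack == list(result_types)


def streamed(ops):
    """Stand-in for instructions arriving one at a time."""
    for op in ops:
        yield op


print(validate(streamed(["i32.const", "i32.const", "i32.add"]), ["i32"]))
print(validate(streamed(["i32.const", "i32.add"]), ["i32"]))
```

Because nothing is revisited, a compiler can emit machine code for each instruction as soon as it has been checked.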
And then the third thing, of course, which is almost obvious but it deserves to be said, is that it's designed for efficient compilation to machine code.
So they really thought about the format so that it's very fast to convert to machine code. In fact, since the first release of WebAssembly, both Firefox and Chrome have implemented what's called a tiered compiler.
So what a tiered compiler does is it does a really fast compile but not very optimised, initially.
And then it spins up a background thread that compiles, does all the fancy optimizations and hot swaps. And the difference that this provides is something on the order of five to 10 times.
So Unity's 3D engine used to take say 11 seconds to start up on a MacBook Pro and that dropped down to less than two seconds. So you can literally get your game up and running quickly, or your media codec or whatever it is that you're doing. So I've kinda touched on this a little bit earlier, that the modules themselves, they define the functions that they're exporting.
They define the globals that are global to the WebAssembly module.
Now, in WebAssembly, the amount of memory that's gonna be in your heap is defined in 64K blocks.
So one memory block is 64K, and that was to be compatible with Windows. What you can do is you can say, "I want my WebAssembly module to consume 100 meg," and that's it, or you can say, "I want it to be able to grow the memory." In which case, if I try and allocate more memory, it will just make the memory bigger.
And so you would think naively that that's the right way to build them.
Just let everything grow and everything will be happy. In actual fact, that's not necessarily the case. And the reason is that when you grow the memory, you first have to allocate the bigger block and then copy stuff across.
So at the moment, the limit to the memory is two gig. So that means if you've consumed one gig of memory, and it goes to try and enlarge it, it will allocate a block bigger than a gig and suddenly you've got more than two gig and the whole thing will crash. (chuckles) Instead, you could've just declared, "I wanna use one and a half gig," and everything will work perfectly.
So you have to trade off growing the memory against not growing the memory.
And also, growing the memory has a small performance hit as well.
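The arithmetic behind that trade-off can be sketched in a few lines. This follows the talk's description that growing means allocating a bigger block and copying, so both blocks exist for a moment. The function names are made up for the illustration, "meg" and "gig" are read as binary units, and how any given engine actually grows memory may differ.

```python
# Page arithmetic for WebAssembly memory, following the talk's description.
PAGE_SIZE = 64 * 1024     # one memory block (page) is 64K
LIMIT = 2 * 1024**3       # the two-gig ceiling mentioned in the talk
MIB = 1024**2
GIB = 1024**3


def pages_for(num_bytes: int) -> int:
    """Smallest number of 64K pages that holds num_bytes (ceiling division)."""
    return -(-num_bytes // PAGE_SIZE)


def peak_bytes_during_grow(current_pages: int, extra_pages: int) -> int:
    """Assumes grow = allocate the bigger block, then copy, so the old and
    new blocks are both alive for a moment."""
    old_size = current_pages * PAGE_SIZE
    new_size = (current_pages + extra_pages) * PAGE_SIZE
    return old_size + new_size


print(pages_for(100 * MIB))                               # pages for "100 meg"
one_gig_pages = pages_for(1 * GIB)
print(peak_bytes_during_grow(one_gig_pages, 1) > LIMIT)   # growing from one gig
print(pages_for(GIB + GIB // 2) * PAGE_SIZE <= LIMIT)     # fixed 1.5 gig up front
```

Under these assumptions, growing a one-gig memory by even a single page briefly needs more than two gig, while declaring one and a half gig up front stays under the limit.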
Now, we talked briefly about the instruction set. It's a simple stack-based instruction set.
Now, the reason that they chose this stack-based instruction set-- I mean, there are two reasons.
One reason is it's a lot easier to do certain compiler optimizations that way. But the second more important reason is that when you design a virtual machine, like the Dalvik virtual machine or the Java virtual machine, Dalvik was Android's original virtual machine and that was a register-based virtual machine. And the JVM was a stack-based virtual machine. Now the difference is that you compile and you pre-allocate your registers for the register-based virtual machines.
So your register allocation thing is being done at compile stage.
On the stack-based one, you do that in the JIT. Now, today's microprocessors have different numbers of registers and the future ones could have different numbers of registers, et cetera.
So if you baked the number of registers into your design now, you don't future-proof it. And so that's a really good reason to go for the stack-based machine.
So what you do is if you wanna pass a string into a WebAssembly module, you write it into a portion of this typed array and then you pass the offset of that.
And so that looks like a pointer index to the programme in the WebAssembly module so you can easily push strings in and out.
It's just this simple interface was deliberate to make it fast to compile, et cetera.
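Here is a minimal sketch of that string hand-off, with a Python bytearray standing in for the typed array that backs the module's memory. The write_string and read_string helpers are hypothetical, and the NUL terminator is an assumption borrowed from C string conventions. In practice the Emscripten glue code does this marshalling for you.

```python
# Passing a string by writing bytes into linear memory and handing over an offset.
PAGE_SIZE = 64 * 1024
memory = bytearray(PAGE_SIZE)  # stand-in for the module's linear memory


def write_string(memory: bytearray, offset: int, text: str) -> int:
    """Copy text as NUL-terminated UTF-8 into memory and return its offset.

    The returned offset is what the module would treat as a pointer.
    """
    data = text.encode("utf-8") + b"\x00"
    end = offset + len(data)
    if offset < 0 or end > len(memory):
        raise IndexError("string does not fit in linear memory")
    memory[offset:end] = data
    return offset


def read_string(memory: bytearray, offset: int) -> str:
    """Read a NUL-terminated UTF-8 string starting at offset."""
    end = memory.index(0, offset)  # position of the terminating NUL byte
    return memory[offset:end].decode("utf-8")


pointer = write_string(memory, 1024, "hello")
print(pointer, read_string(memory, pointer))
```

The only thing that crosses the boundary is a number, which is why the interface stays simple and fast.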
So let's have a little look at what the instructions look like.
Here on the left there's a stack.
That stack basically contains two numbers, and so if we have this 32-bit add instruction, what happens is it takes the two numbers, adds them and replaces the bottom piece of the stack with the result.
And that's it.
Nice, simple computation.
Here's another example of loading from a fixed offset in memory.
So there's two instructions. The i32.const is saying the constant number 1,024, and then the load is an indirect load from that value, and that value happens to be sitting on the stack. So what happens is, when the const instruction gets executed, it throws 1,024 onto the stack, and then when the load instruction gets executed, it takes that, uses it as an offset and pulls the 1234567 out of linear memory.
As a 32-bit value.
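The two slide examples can be replayed with a toy interpreter. This is a sketch for illustration only: the run function and the tuple encoding of instructions are made up here, and a real engine compiles to machine code rather than interpreting like this. Values are stored little-endian, as WebAssembly's linear memory is.

```python
import struct

# A toy stack machine covering the three instructions from the examples above.


def run(program, memory):
    """Execute (opcode, *immediates) tuples and return the final stack."""
    stack = []
    for op, *args in program:
        if op == "i32.const":
            stack.append(args[0])                        # push the constant
        elif op == "i32.add":
            right = stack.pop()
            left = stack.pop()
            stack.append((left + right) & 0xFFFFFFFF)    # wrap to 32 bits
        elif op == "i32.load":
            offset = stack.pop()                         # offset comes off the stack
            if offset < 0 or offset + 4 > len(memory):
                raise IndexError("out-of-bounds memory access")
            (value,) = struct.unpack_from("<I", memory, offset)
            stack.append(value)                          # push the 32-bit value
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


memory = bytearray(64 * 1024)
struct.pack_into("<I", memory, 1024, 1234567)  # put 1234567 at offset 1024

print(run([("i32.const", 2), ("i32.const", 3), ("i32.add",)], memory))  # [5]
print(run([("i32.const", 1024), ("i32.load",)], memory))                # [1234567]
```

Note that the load never sees a machine address, only an index into the module's own block of memory, and the bounds check keeps it there.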
Enough of my talking.
You probably wanna see what kinds of things you can build, right? (chuckles) Yeah, of course! I was getting sick of my voice, too! Alright.
I'm gonna start with SVG because this is my favourite. Well, I wrote it so it's gotta be my favourite. (laughs) Alright, so what this is is a little SVG graphic that I've bashed together.
Now, the balls here, the grains of sand are falling down and doing all their magic.
Now what's important to know here, this is a web application, this is a web page. Those balls are all circle nodes so they're all with a little gradient on them.
Now how does the motion work? You're probably wondering.
Well, what I did was I took a C physics engine, so this is a physics engine called Chipmunk2D and it's used in a lot of top-tier titles. I cross-compiled it into a WebAssembly module and then I call into the physics engine to say where should I put the grain of sand? So basically I modelled the shape of the glass and I throw in all the balls, say where they are and the physics engine moves them around for me. So it literally kicks off on a requestAnimationFrame. And you can even flip it because it's an actual, physical model.
It just flips and all the grains of sand go where they want to.
So things like this, you could do that for gaming or any kind of other fun stuff.
I'll just put you on another demo.
So here is, this is a great demo.
I didn't write this.
But originally when I started playing with WebAssembly, we were looking at this thing going, "Well, you know, this is great." You can write an in-browser video editor, like I could just literally do this.
But what this particular application actually does is it does real time video processing in the browser courtesy of WebAssembly.
So it's literally got all this going.
Here's some edge detection happening, super edge inversion, so all these things, gaussian blur is probably boring.
(chuckles) Moss, this is the alien look, leaf look. (chuckles) Ghost, what does that do? Oh, okay.
Let's just click to there for just a moment. I just wanted to show you something else that I wrote that just out of curiosity, just so that you can see how portable this stuff is. Who here watches YouTube? Oh, I'm not connected to the internet.
Oh, dammit! (chuckles) Excuse me, I'll just hotspot and then I'll talk. So who watches YouTube? Anybody? Lots of people? Good.
The reason I asked you that is that YouTube is this wonderful thing.
Who here has unlimited internet? Not me.
(audience laughing) Come on, get on. Yeah, so the thing is that YouTube, I can't remember the statistics but the number of bytes being sent across the wire for YouTube are outrageously high. Just massive amounts of data.
And so of course, one thing that companies like Google and Apple and everybody have been working on is a new generation of video codecs. And so there's a new video codec called AV1 which is kind of gonna be the future of video on the web, and the people at YouTube love that because it really uses up a lot less bandwidth than other video formats like H.264, et cetera. And so I got the AV1 codec and I compiled it into a WebAssembly module because I figured let's see if you can actually run it in the browser.
And so this file, you can see it's 20 megabytes. And so this 20-megabyte file is like 10 minutes of video. It's a little bit chunky because it's all running on the main thread.
But this is literally the AV1 video decoder source code compiled into a WebAssembly module running in Safari. And this runs in all the browsers.
So when I talked about playing with new media formats and running 'em in browsers that don't support them, you can do this kind of thing.
And so this is a really great kind of application. It's quite amazing when you think about it. Okay, so, I will just go back to you for a minute.
Okay, now that you've seen what you can build, you're all dying to get started, right? Who's gonna try it when they leave? (chuckles) More than one, good.
I mean, thank you! (laughs) First thing you need to do is download the Emscripten SDK. This short link will get you there.
Or you can go to WebAssembly.org so really, the best reference place to work with the stuff is WebAssembly.org 'cause that has links to tutorials, to documentation, to where you get the SDK, et cetera, et cetera.
So it's a great source of stuff.
Now, few bits of advice.
If you really wanna do good things, run your WebAssembly modules inside Web Workers, if you can.
Because WebAssembly typically is for really heavy computational workloads so if you can shove 'em in a Worker, even better. So like the AutoCAD case that I talked about earlier, they did basically exactly that.
When you're compiling, use -O3 for the fastest performance, alright, so that just optimises the code even better. So there's -O2, -O3, et cetera but if you're not debugging, then -O3.
Now, Chrome 70, which has just hit stable a few days ago, includes threads support.
And what that means is that if you have a multi-threaded application, you can actually compile it to run and it will run in Chrome 70 right now. So for things like 3D games, they quite often use threading, Google Earth wants threading because they parallel load the tiles that cover the globe.
So there are a lot of applications that actually require threading support to work properly. In fact, I'll just give you another little tiny demo just so that you can see what that looks like. So here is an example of a really simple threaded application.
This is ASCII, multi-threaded ASCII, where we have four threads running.
Each thread is responsible for two of the characters and so you have no idea what order they're gonna be coming up in because the threads are running in parallel and are completely non-deterministic.
So it's kind of bit of fun, I didn't write that, by the way. (chuckles) Anyway.
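The shape of that demo can be sketched with ordinary Python threads: four workers, each responsible for two characters, all appending to one shared output. This is only an analogy and not the demo's code. The character set and repeat count are arbitrary choices, and CPython interleaves threads rather than running them truly in parallel, but the point carries over: the output order is not something you can predict.

```python
import threading

# Four threads, two characters each, writing to one shared output list.
CHARS = "ABCDEFGH"
REPEATS = 5
output = []
lock = threading.Lock()


def worker(chars: str, repeats: int) -> None:
    """Emit this thread's characters a few times each."""
    for _ in range(repeats):
        for ch in chars:
            with lock:            # protect the shared list
                output.append(ch)


threads = [
    threading.Thread(target=worker, args=(CHARS[i:i + 2], REPEATS))
    for i in range(0, len(CHARS), 2)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# The order can differ from run to run; only the totals are fixed.
print("".join(output))
```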
Then another little tip.
When you're debugging, if you open up Chrome DevTools when you're running a WebAssembly module, that instantly turns on an interpreter, rather than running the native code.
The reason for that is because it thinks you're trying to debug, so if you're gonna be single-stepping and doing all those kinds of things, it has to go into an interpreted mode. And so quite often when you're working with it, you go, "Ah, yeah, this looks good." Load it up, why is it running so slow? It's 'cause you forgot to close DevTools. So if you're testing the performance of your WebAssembly application, close DevTools. Right now, you can only single-step instructions in the binary format, which is pretty primitive, so a lot of people just do console.log style debugging, (mumbles) debugging.
But Chrome 71 will come with source-map support so you'll be able to single-step from your source C++ and actually see what's going on. So that'll be a fantastic improvement that's been a long time coming.
Now, the future of WebAssembly.
I said earlier this is the minimum viable product, so I'll just give you a bit of a taste of what's coming in the future. Now that we have threading support, we can use all those cores that are in our CPU. So CPUs have multiple cores, now you can use them with multi-threading.
The other big piece of silicon in the chip that's in your pocket is what's called a SIMD core. So about 50% of an ARM chip is SIMD instruction support. Intel also have a lot of area dedicated to that. And what SIMD is for is things like faster image processing, so it's really good for video. We all have RGB images or RGBA images. In a normal CPU when you're manipulating that image or doing an effect or something, you have to do the R, then the G, then the B, and then the alpha, so it's four instructions to do everything. A SIMD instruction is single instruction, multiple data, so one instruction does all four channels in parallel, so you effectively get like a four-times speed-up. And so that's coming very soon to WebAssembly as well. The fifth data type is coming as well.
This is a really important one.
It's what we call anyref.
So the anyref type is a type so that you can pass in an opaque type to the WebAssembly module, it goes, "Oh, okay, this is the thing I'm gonna manipulate," and then it can say, "Oh, I'm throwing it away now," so that it can be garbage-collected.
So you could basically pass your DOM node pointers around and it all just works like magic.
So that's coming relatively soon.
The other thing that's coming which will be a great boon for performance is a thing called Host Bindings.
So right now, if I wanna manipulate my DOM tree, I'll go, create DOM node, (mumbles), do all this kind of stuff, and I can do that from within WebAssembly.
But under the hood, the browser has a C++ implementation of that.
So right now we have Go and the Mono Runtime and things like that that are running on top of WebAssembly but they run their own garbage collector so there's no actual native primitive and it actually slows it down quite a bit.
Because of some complexities like the fact the execution stack is hidden means you can't look for pointers on the stack and things like that. There are technical limitations and so the WebAssembly group is adding support for garbage-collected languages which will bring better performance to Go and C Sharp and potentially Java and languages like that. And so that's pretty much what I wanted to cover today and hope you learned something (chuckles) and weren't too bored.
You can come and play with the video processor if you like.
It's always good to experiment with things like that but I'm open for a few questions, if anyone would like to ask something.
(audience clapping) (upbeat music)