Preact: Into the Void(0)
(synthesiser intro) - Everybody, so, is everybody here actively caffeinated, 'cause this is gonna be a little bit thick, but in a fun way, I assure you.
And also, I have a laser pointer, just in case, which I just found out, which is pretty awesome. All right, by the way, that is not loading anything. I just put that in there to scare the techs, 'cause I find that hilarious. (audience laughs)
Also, it's just a really cool spinner.
I think it's from Gerbil. I have no idea where it's from. Okay, so like John said, my name is Jason Miller. I'm from the woods in Canada, which for those of you who haven't been to Canada is about an hour west of Toronto. I am developit on Twitter, no, I'm developit on GitHub and _developit on Twitter. I'm pretty sure the person who has the non-underscored version is Australian, so if you're here, we should totally talk.
(laughing) I have no ulterior motive. Okay, but so more importantly, I'm a serial library author, and that's self-diagnosed. They're still working on figuring out what the definition for that would be.
What that means us I'm fairly obsessed with this idea of constraints.
I started building stuff for the mobile web around the time the mobile web became something that you and I would all use, so, you know, the introduction of iOS and then subsequently Android, and even a little bit before then.
So Windows Mobile, the whole Palm Pre story. Never forget. (laughing)
I've created a bunch of UI frameworks kind of along the way to the point where it's actually become a fairly serious problem for me, because I have to go around and clean up after myself and delete all traces of these things from the internet such that I can't be held accountable for my past mistakes. So, the reason why I like constraints is I find constraints to be very interesting challenges. This is incredibly relevant for me because I have something called Attention Deficit Hyperactivity Disorder and I can't say that any faster than that.
Basically, this means that in order for me to be able to focus on something to the best of my ability, I need something I can really sink my teeth into. I need something that I can essentially like, dive into, lock myself in a tiny box for six days, and then, you know, come out of it with some sort of an interesting solution to a problem. I especially like challenges where the assumption is that something is impossible.
Oftentimes, it turns out that it is, but it's fun. (audience laughs) It's fun to investigate, especially in a browser context. No offence to the browser. Okay.
So, over the last couple of years, I've been particularly interested in one challenge, one constraint, which is file size, or you may see this referred to as bundle size. I like to think of bundle size of file size as a fairly quick essential optimization problem. We want to kind of maximise the productivity that we get out of the libraries that we use and the code that we write, but we also want to minimise the size of the code that we deliver to our users.
A way I like to describe this is we wanna stand on the shoulders of giants, but then compile those shoulders out. (laughing) Apologies to all giants. So I'm here today to talk to you guys about Preact. Preact, like John said, is a tiny little library that has a similar API to React.
Sometimes I say same API; sometimes I say similar API. It depends on the day of the week.
Also a very difficult thing to track.
So, how many people here have heard of Preact? I know this is super self-aggrandize...
Oh, that's way more than I would have expected. I was thinking like six people, all front row. My wife. (laughing)
Yeah! Hi, mom. (laughing)
Okay, so I'm actually going to explain today more about how Preact works than how Preact is implemented or why you should use Preact, more about the kind of design decisions that go into Preact to help it stay small and focused and fast. If you're new to Preact, you're probably wondering what on earth does this look like.
I usually wear it as a shirt, but here it is as a slide. This is the entire library with some weird white space stuff on the right.
Line wrapping in a browser. Who knows.
But this is the whole thing.
I've spent two years working on this.
Sometimes I cry. (laughing)
I always tell people that aliens are going to come to earth, and they're just going to see me screaming at this, and they won't need to understand my pain.
They'll just feel it.
So we're actually going to spend 45 minutes talking about this slide.
How exciting is that? In particular, we're going to talk about that, laser pointer, but that is really important once we figure out what this is.
But neither of those two things will make any sense until we've covered the yellow bit there.
Because it's a Preact presentation, we have to have some purple.
I have no idea what that's highlighting.
I never actually thought to check.
For those of you who are worried, I do have actual slides here.
There are real slides in addition to this code slide. The thing is, we can't talk about any of these fantastic things that have been highlighted here until we've talked about the elephant in the room, which is JSX.
You saw Zero's presentation.
JSX is probably not a thing that people are afraid of anymore, but it's still something that people maybe don't understand as well as they could, and don't feel bad if that's you.
There's no reason to feel bad there.
JSX is really simple, but it's the kind of thing where you don't have to understand it to use it. I'm going to try and break that down.
We're going to learn JSX in the next like, five minutes. My goal by the end of this talk is to have everyone here rewriting their webpack configurations in JSX by the end of the talk and then writing them back not using JSX having realised that is a terrible idea. (laughing) However, it is a documented terrible idea.
This is the webpack documentation on how to write your documentation or your configuration in JSX. You can also see there's a Preact cameo in there, which is kind of hilarious.
And actually interestingly, you know, all that stuff there is implemented in Preact. Like the actual rendering, even the sidebar on the webpack documentation is implemented in Preact. Most of the page even at this point is.
So it's like this weird of full-circle thing that now I get to put into a presentation, which is fun. So what is JSX? JSX is an XML-like expression that is compiled to a function call.
That slide should update momentarily. Yeah. So, and this is like my Merriam-Webster definition slide, I had to have one in here. So, it's, you know, we wanna write the thing on the left and we want to see the thing on the right.
So we wanna write one, compile it, and then get the other one.
To me, the value of JSX has always been that if you had to write the thing on the right, you'd still keep your job.
You wouldn't quit. You'd be okay.
It might not have as many as angle brackets, but it's pretty serviceable.
I've actually written whole apps like that. There's lot of people who do. I'll show you that in a bit.
But that's really it.
That's like, kind of, the basis of JSX, is a little compiling of angle brackets.
Other than that, there's really only two things that we care about.
The first is you can always jump back into regular old JavaScript land, the JavaScript of today, and you can do that using these curly braces. So you can see in the example on the left, we've got curly braces around that one literal and curly braces around the reference to the world variable. And they just kind of come through verbatim of the right, so they're not affected by the JSX compilation. Similarly, if you use an uppercase character as the first character in a tag name in JSX or a nodeName, it's going to come through as a variable reference instead of a string in the resulting, you know, compiled code.
This becomes really important, because that's how we leave function references in JSX, and that's how components work.
So we have our XML expressions and we have our references coming through after. So, what is h? What is this h function that has been randomly inserted in the place of all of our tags? That's actually up to us to decide.
So we get to define what that is.
We get to define A, it's name, but B, what it does. You'll often see h referred to as createElement or react.createElement.
That's just the Babel default.
That's just literally the name of any function to call, it doesn't really matter. We get to define what that behaviour is at runtime. All we have to do is define a function that adheres to this signature.
So, it need to accept nodeName, which is gonna be like div or that variable reference. Not gonna be a string or a function.
It needs to accept attributes or null if there aren't any. And that's gonna be like id equals food, for example. The rest of the things that were contained inside of this JSX tag when we wrote it, those will come through as the rest of the arguments, which we're gonna call children.
So you might be sitting here saying, "Yeah, Jason, I existed during the year 2000, "so I know that this is something "that might have been called HyperScript at some point." And, you know, props to you, that's a cool reference. It's one that I only got like later on.
So super props there.
And by and large, that wouldn't be entirely incorrect. So I've literally copied and pasted this text box, so I know it's exactly the same, but you can see the left and the right are very similar, nay, the same.
So in that regard, it's the same.
You can transpile JSX to this h function, or you can write Hyperscript, and you'll get the same result.
Where they're not quite the same though, is Hyperscript had this extension where the tag name can be kind of this CSS selector shorthand and it will expand those into attributes on the element that it creates. Hyperscript creates DOM elements.
And then in the JSX world, this isn't something that Hyperscript doesn't do, but it's just not commonplace there. We can leave a variable reference as a tag name. So it might be a function, might be a string, who knows, and then we often use non-string or even non-scalar values for prop values.
Doable in Hyperscript, but not common.
So, not quite the same there, but interestingly, JSX's compile target is what you would consider to be a subset of HyperScript, which ends up being important.
So it's worth having a really solid understanding of this, because the JSX concept is not only simple, it's also very direct.
We can take a look at the Babel REPL, and I would encourage you to do this if you're just kind of getting into JSX or haven't heard of it before, and we can, you know, we can write some angle brackets on the left, and we can compile them to some h function calls on the right. And it's pretty easy to get to the point, after a few minutes with this, where you can see JSX and imagine exactly what the Hyperscript corollary would be and exactly the same thing the other way around. So you can see one and imagine the other.
Also, I wanted to stress, there's nothing about JSX that says it has to produce DOM.
They're not inherently related in any way.
You can define a JSX function that returns XML strings, objects, more function calls.
I have no idea why you'd wanna do that, but you could do it, the possibilities are endless. So, one example of this would be the webpack config that I showed, that crazy idea.
That is a JSX h function that creates objects. Another more concrete and awesome example would be, I'm working with a SOAP API right now, and I know everybody cringes when they hear SOAP. I actually really like SOAP. I think it's pretty cool.
It's like a precursor to GraphQL.
But SOAP can get kind of gnarly on the client, because you send up serialising and deserializing a lot of XML, and XML white space, and encoding, and all this kind of business.
JSX makes it super, super easy because JSX is a way to write XML in your JavaScript, and then if you just define an h function that serialises stuff to a string for you and takes care of all the encoding, and all of the sudden, working with SOAP in JavaScript kind of feels native.
It feels like E4X, which died.
Okay. So, yeah. (laughing)
But it feels better than that.
Okay, so that's JSX out of the way.
The next bit sort of on our journey is Virtual DOM. Again, like JSX, it is a simple primitive, but it is a useful primitive to understand fully. So, what's Virtual DOM? Virtual DOM is just nested objects that define a DOM tree structure.
That's it. There's no behaviour.
There's no fancy stuff, no special sauce.
It's just nested objects.
I actually like to think of Virtual DOM as a big configuration object that we would pass to a theoretical DOM builder function, and it would return a DOM that matches that big configuration object.
So it's just kind of a simple creation process. So, how do we get from JSX to Virtual DOM? We know that we can get from JSX to HyperScript by transforming it uses a compiler like Babel or Buble. Yeah, whatever. There's lots.
I'm curious what the next one's gonna be called. I feel like we've run out of B names, but who knows. So we know we can get from one to the other. How do we get to Virtual DOM? Well, as it turns out all we need to do is define an h function that when invoked returns Virtual DOM objects.
That way, we can get all the way from JSX to VDOM objects. So, to clarify, what we wanna do is we wanna write the angle brackets.
We wanna transpile that to these h function calls, and then when invoked, we want that h function to return objects that looks like this, that have this sort of DOM-like structure.
So, what's that function? What is this h, this concrete h function that we need to write that does this job? I'm happy to say this is my favourite kind of function, which is a useless one-liner.
I write them all the time. They're kind of my standby.
All this does is it takes the arguments that we pass it, you know, makes children into an array, but that's giving it too much credit, and it returns them as three properties on a plain JavaScript object.
You'll sometimes see this object referred to as a vnode. This one doesn't inherit.
It isn't a constructor or anything like that. But it counts as a vnode, just because of its structure or Virtual DOM element, sometimes you'll hear. Same sort of deal.
In reality, we probably want to do some normalisation in this function.
You might wanna strip out any empty entries in the children array, or if there's infinitely nested arrays inside of children, flatten them down to just their internal values. But that's just to make our own lives easier. In reality, you could build a Virtual DOM library using this, so let's do that.
The whole point of Virtual DOM is we wanna realise these objects into real DOM elements so that we're not just working with something that is a data format.
Data formats are cool, but you're not going to build a user interface with them.
So, rendering is fairly straightforward, kind of like the JSX, kind of like the Virtual DOM. All we have to do is we take our object that we passed ourselves up there with the scroll bar, and we invoke the DOM method to create an element with that given name.
So we have a nodeName of div.
We call ddocument.createElement(div), we get back a div, and we assign that to node. Second thing we do is we loop over the attributes object, which in this case, has one property, id, with a value of foo.
And we call setAttribute(id, foo). Simple.
You can see we're kind of creating that DOM builder function that I said was theoretical.
Theoretical no longer.
Then what we do is we loop over the children property, which is an array.
You see that there on the right.
For each child in the children array, we invoke render again.
So we're setting up recursion here.
We pass it the virtual child, which is hello or the nodeName, be our object there, and that we're gonna get back a DOM node.
And that DOM node, we can just append into the current DOM node.
So we set up recursion, and then using recursion, we set up a hierarchy.
There's one extra little tidbit here that we have to write. You notice the first entry in children, the first child there, is just a string? Technically a string is a valid vnode, and what that means is we have to special case it. So what we do is we just check if the vnode's type is string.
If it is, we bail out with the DOM text node. Simple as that.
I did gloss over one thing here that's fairly important. I've been calling these things attributes, and anybody who writes React is probably like making that crazy screaming sound in their head that happens. I think that's a Silence of the Lambs thing. I don't know. I've never even seen that movie. So I've been calling these attributes.
That's technically incorrect.
More often than not, we want the latter, which is properties.
We might express them as attributes in our JSX, but in reality, we wanna set them as properties, 'cause it's a little faster, a little cheaper, a little more direct.
It's certainly, from a JavaScript context, it's more the API we tend to work with.
The thing is, both of these are at least somewhat incorrect. The problem is there's attributes that you can't set as properties, and there's properties that you can't set as attributes. You can't set non-string values as attribute values. So we need both and the best way to do that that I've found kind of two years on is through an in check. Essentially, if you're unfamiliar with the in operator, it just checks to see whether a given property exists on an object, including searching to the prototype. This is super useful for us, because let's take our example where we had an attributes object with an id property whose value was foo.
We're gonna come through here, we're gonna hit the in check, and we're going to be saying if id in node, which we know is going to be true, because nodes have an id property.
It's null by default.
Then we're going to set it as a property.
So we're essentially asking the DOM, do you support this property, rather than trying to have built-in behaviour. Preact is three kilobytes, so it can't really have its own behaviour.
There's no room for that. It can't have white lists.
It can't have mappings or any of this kind of fancy stuff. It just has to ask the DOM for everything.
And so in another case, let's say we had a data attribute, data-foo, it would fail the in check and we would just fall back to setAttribute. So it actually ends up working out quite nicely. Interestingly, with the advent of custom elements and web components, web components have this state they're in before they get initialised.
I think it's called uninitialized state.
And there's no definition there for the component. It's just kind of a dumb element.
In this case, the in check will fail for any custom properties it's added, because they don't exist yet. And so we set them as attributes.
This is really useful, because when the custom element comes and initialises, it can loop over the attributes that have been set on it and initialise with the correct data.
We just have to serialise it.
Once initialised, though, custom elements often define getter/setter pairs for the properties they support on the elements that they've decorated.
Now the in check passes, so now we can start setting complex types on these elements.
This has really interesting implications.
Honestly, it's still kind of in flux.
There's lots of people looking into the implications of this, but I think that this is kind of a key, a key insight.
So that's that. We wrote a bunch of code just now. We wrote a render function. We wrote a JSX reviver.
We gotta test this stuff at least a bit. So, here's a test. We're gonna define a Virtual DOM tree, and at this point, once we've assigned the variable here, we're just gonna have nested objects, not actual DOM elements.
We're going to have a title tag represented in JSX and a list represented in Virtual DOM.
We can pass that to a render function to realise or actualize that tree.
We get back an actual DOM element that has the recursion and the nesting, and then we can just pass that into body. So, does it work? Yes.
Also, the network works, which is good.
I'm always like, is it gonna white screen? No. Okay, good. So, yes, it works.
Let's get that out to production.
Obviously it's pretty important.
We've gotta get this list out there.
People want the list. Okay. I have a lot of list apps.
It's my thing. Okay.
So we just build a simple Virtual DOM renderer. The problem is we built a horrifyingly bad Virtual DOM renderer.
It's terrible.
Any time data changes, we're going to want to update our user interface, and the only way for us to do that is to tear everything down and rebuild it from scratch. That's going to take a very long time, especially using the Imperative DOM APIs.
At that point, it probably would have been easier for us to just use inner HTML, and arguably, better. So let's not do that. Instead, we need to use diffing. So the diff algorithms used by Virtual DOM frameworks are subject to a fair bit of debate and mystery. I think some of that is warranted, because this is complex stuff, but it's also unfortunate, because in order to really understand this technology as a substrate, right, if you're building an application on top of React of Preact or whatever, you know, you should really understand the fundamentals of how that thing is working so you can make sure you're using it correctly, so you're avoiding foot guns or pitfalls, whatever term you wanna use. You need to really know what it's doing underneath the hood, at least at a high level. So the fundamental premise of Virtual DOM is, what if, instead of rendering from the top down, creating each element anew as we went down, as we descended through that tree, we gave ourselves a reference to the DOM now, to the way the DOM looks now, and as we descended down through, we tried to reuse those things.
So we try and morph the DOM into the DOM we want rather than blowing it away and recreating it. So that's kind of the premise of Virtual DOM. Here are the APIs that we're working with.
On the left is a Virtual DOM. We just built that. That's our stuff, easy to understand.
On the right is the actual DOM APIs we're gonna be hitting. And you can see there's a lot of strong relationships here. nodeName, pretty much the same.
Attributes.
Okay, we've got a getItem, a setItem, but by and large, it's just, you know, map. Not super interesting on its own.
It's also iterable on both sides.
Children, the name changes.
There is actual a DOM children property.
It just doesn't include text nodes, so it's not super useful for us.
But aside from that, they're pretty much the same. It's just gonna be a list of things.
So hopefully, you can look at this and see how we would compare the left to the right and flush out the changes on the right.
So, the process for doing that is three steps. First step is we need to diff the type.
So in order to get kind of a root in the DOM, we need to create an element, and the element types are immutable.
So we have to initialise the element with a type. We can never change it after that.
So we have to do that first.
The second thing is we need to diff the children. So we've gotta compare the children in the Virtual DOM tree and the real DOM tree, figure out what the differences are, and then flush those out. And the third one is attributes, more accurately referred to as props.
We just compare with the two.
That's just a bidirectional object comparison, not super difficult.
And then importantly, for each of the children, when we're doing that comparison, we're going to invoke the diff again.
So we're going to recurse down through the diff using recursion.
If you're, you know, if you're an astute learner, I'm not, but if you're an astute learner, you realise that I've kind of swapped things here. In the previous example, we diffed props and when we diffed children.
As it turns out, that's not really the right way to do things in the DOM.
There are a couple of cases in the DOM where the children of an element define the behaviour of its properties. One that I like to use for this is the select dropdown, the select element.
If you have a select element with no children and you set its value, it's going to try to go and look up the display text that it needs to render, but it doesn't have any children, so there's no display text for it to look up. It's gonna try and search for option, children. There's nothing there and you'll just get a blank select dropdown.
Even if you then go and add children, still blank. If you flip that on its head, though, and you render the children into the select dropdown and then set its value property or attribute, it'll look up the display text and show appropriately. There's a couple of cases where that's true. There are no cases where the opposite is true. So we just flip the two.
I never went back and corrected the first one, 'cause it gives me the opportunity to talk about it. So step one here, we need a diff type.
So, okay, it's immutable, so we have to do it first. And there's kind of two sides to this story. The first side is the easier one, so I like to talk about it first.
We're gonna check that the node's a component, and this is just, there'll be a property on the element that indicates yes, I'm a component, or in the case of Virtual DOM, if the nodeName property that we gave ourselves is a reference to a function, that's a component. Otherwise, it's a string, so it's just an element. In the case where it's an element on both sides of the fence, we have our lives kind of easy. All we do is we check if the type in the Virtual DOM element is the same as the type in the real DOM element. So, is it a div and a div? If it is, we have no work to do.
We're going to update this element in place so it's a no-op. If the type is different, though, we have to essentially blow away the existing DOM element and recreate a new one, because we can't change the type of a DOM element. It's set in stone once you create it.
So in that case, we just use replaceChild and blow it out and replace it.
On the other side of the tree, slightly more complex, but I'm going to kind of try and simplify it to get us through here. I'll still kind of root it in reality.
So, if the node is owned by a component, and let's pretend that that's signalled by the fact that there is an _component property on the node, to steal how Preact works, all we're gonna do is we're gonna either create an instanceof that constructor reference, _component or the nodeName, or we're going to update one if there already was one, and that's just, you know, there'll be an instance property hanging off of the DOM element that points to, you know, the old constructor value. We just reuse that.
So, either way, we're going to get an instanceof that constructor.
In both cases, all we do is we call render on that component.
Thankfully, render just returns more Virtual DOM. We go from our component reference, which is abstract Virtual DOM.
We call render, we get back concrete Virtual DOM. This is really easy, all we do is we take the results of render, which is a vnode.
We know how to deal with vnodes. We pass it back to diff. All of this kind of occurs within the context of a single DOM element.
It's kind of a summery of one of the key features of this react component model. You can have multiple components that are owned by a single DOM element or that are associated with one single DOM element. So you can compose within the confines of one DOM element. You don't have to have six elements stacked in order to have six components stacked.
That's something that's highly valuale.
This is how high order components work and react. I feel like this is a decent way to illustrate it. It's that diff process, recusing within the context of one node.
Okay, so that's type.
Now we've gotta nail it with the right type. Second step is diff and children.
This one's pretty straight forward.
It's a little but more bidirectional.
All we have to do is we loop over each child in the DOM, so we're looping over child nodes.
For each, we're gonna put it into a bucket. If here's a keyed property on the child, we're going to put it into an unkeyed bucket. This is just an ordered list. Chuck it on, it's an array.
If here is a keyed property, we're gonna put it into the keyed map.
This is essentially a JavaScript object where the keys are whatever the value of the key property was on that element.
The values are the elements themselves.
It's pretty straight forward, we'll end up with this kind of ordered list of unkeyed and a map of keyed. Those are our two buckets.
The next thing we have to do is we iterate over all of our virtual children, that children array that we had as a property of our vnode. We have to find a match in one of those buckets, so if the vnode has no key we're gonna go to our unkeyed ordered list and we're gonna find the first instance of an element with that type in the list.
So our vnode has a nodeName of div, we're gonna find the first div in the list. We're gonna come back form the operation with either a div or no match and that's fine. Save the old for if it has a key, we're going to find the corresponding value in the map for that key.
There might not be a match and that's okay. We're gonna take whatever we got from that operation and pass it into the function we just wrote, that diff function for type. What that's gonna do is take the element we passed it, or null if we didn't find a match, and the vnode that we pass it and either destroy and recreate the element 'cause the type was wrong, create a new element because there was no element there, or just return the element we gave it because it was a dead match.
That works nice and easy.
We're gonna come out of that operation with the right element instead of nothing or the wrong element or possibly the right element. Now we know it's the right element.
Now we have to do with that is insert it at the corerect index in the DOM, which happens to be the index that we're at in this loop over virtual children. So we'd use insert before to insert that in the DOM. Third step is the easiest and my favourite. If there's anything left in the lists, we just drop them on the floor.
That means that they're no longer used in the render so we just get rid of them, empty them out. I kind of glazed over keys there and it's something I want to touch on because keys are really, really important and they are often very poorly understood.
Basically the job of keys is to attribute meaningful order to an otherwise uniformed set of elements.
What I mean by this is if you have five divs in a Virtual DOM renderer, and then you render seven divs as a second render for that same component, there's nothing in Virtual DOM that tells it, "Oh, I added two in the middle." Or at the beginning or wherever.
It's just always gonna add and remove at the end. It's dumb, it starts at the beginning and it goes to the end, so if the length of the list is different it's gonna add or remove at the end. Keys let us kind of turn this on its head.
It's easiest to see this as an example, so here's the example.
Let's say we have a component and in the first render we have three elements.
One, two, and three. These are list items.
In the second render for the same component, we have two list items.
You and I can look at this and see clearly the second item has been removed, right? There's no
These are just text nodes, so it's not gonna read the text and be sentient.
What it's gonna do, is it's actually gonna hit number one and that's a no opp, still an
It's gonna hit the second
"You changed the text content of "that
It just did two operations even though it could have done one and saved itself some time. That's because Virtual DOM always adds and removes at the end for a uniformed list of elements. If we add keys into the mix though, now we're back in control of that type of diff. You can see now, we've got key 1, 2, and 3 in our list items there.
In the second render we have keys 1 and 3.
Key equals 2 is gone.
Now the Virtual DOM library can see that, so it's gonna run through its diff.
It will hit item one. No change. Same diff.
It will hit item one, no change. That's a no opp. It's gonna hit item two, and it will see "Oh, I've got a key mismatch.
"That had key equals 2 before, now it has key equals 3. "I have to destroy it. I have to recreate it."
It's gonna get rid of that.
Then it's gonna move onto the next one.
It knows now that the next one's gonna be a position two because it just removed it and as it turns out, it's a dead match for three. It doesn't have to do any work.
Now it's a no opp for the third item.
All we had to do was delete the second one and let the DOM shift things automatically and keys let us establish that. So that's keys. Okay.
The third step here, and probably the easiest one, is diffing attributes or properties.
All we do is we loop over what the DOM looks like now, so loop over all the existing attributes or properties in the DOM and if they don't exist in our Virtual DOM attributes in this new render work that we're doing, we just delete them.
Set them to undefined, delete them, call it a new attribute, however.
Then for the new stuff, we loop over the new attributes which is that DOM attributes that we have in our Virtual DOM.
If the value is different from the DOM or if the Dom didn't have that value originally, we just set it. Thankfully create and update are the same, which is a set operation, so that works really well. As of this point, we have now solved all performance problems in all web applications, period. Thank you and good night. No. (laughing)
Okay, so the one percent clash.
"Get off the stage!" (laughing) All we have done is we've taken our performance problems and we've moved them into a library.
We've kicked them over the wall, onto the shoulders of Open Source, which is a good thing to do because now we can collaborate on those performance problems, but it also centralises the problem on people like me or the React team.
What I wanted to talk about a little bit was some of the performance journey I've been on since starting the Preact project.
The first thing that I've encountered is a lot of people saying the DOM is slow.
I think this is the sub heading underneath Twitter.com, especially if you're following Ken Wheeler. No offence Ken. (laughing)
So, I'm not here to refute this.
This is true in certain ways and that's good. The DOM is definitely slower than immediate mode drawing API.
It's slower than canvas. It's designed to be.
It does a lot more.
Canvas and the DOM or any drawing surface and the DOM are kind of an apples to oranges comparison and that's because the DOM has a lot of features that we forget about. The DOM has accessibility built in.
You can slap aria attributes on an element or use semantic mark up with labels containing forum elements. I think that's the current best practise.
You're gonna get tab nabbing ability and screen reader support and all these things essentially for free which is amazing.
We essentially default to an accessible web as long as we're all doing things more or less correctly. That's awesome! Second thing is the DOM is extensible by default. You can't even opt out of this, which is amazing. An example I like to use for this is I just switched to Windows fairly recently and I kind of missed my imoji picker from the Mac. But it was really easy for me to solve this problem because I only enter imojis when I'm posting garbage on Twitter, follow me on Twitter.
(laughing) I use a lot of imojis there. All I had to do was I added the imoji one browser extension to my browser and that automatically adds imoji input to every input box on the web because every website is using the DOM to render input boxes.
It's kind of the substraight for which all this stuff works and really importantly, Twitter, where I'm doing the imoji posting, doesn't know about my imoji one extension. It's just using an input field in the text area. Imoji one doesn't care that I'm using this on Twitter versus Google Plus.
It just kind of injects itself into every input. It doesn't need to have specific logic for an application. That's really important.
That's what gives us kind of a decentralised model. The last thing is that the DOM is framework agnostic. You can use angular and Preact side by side. People do this all the time.
You can use React and web components side by side. It works great.
You can kind of mix all these things together because they're all using the same substraight. The DOM is kind of the linguifranka of the web. It's that base layer API for which we all communicate as developers.
I would argue that the DOM is not slow, our DOM is slow. Or your DOM, to be more offensive. (laughing) So the speed of the DOM is highly dependent on how you use it.
Since Preact is a DOM library, I'd like to share with you a couple lessons that I've learned while optimising it for the DOM over the past couple of years.
The first of those is deceptively silly.
That is to use text nodes for text.
I can't do a show of hands for this because most people don't realise when they're not doing this and there's nothing wrong with that. We work with an API called the DOM that exists on multiple layers in the abstraction stack. You can create a text node or you can use an inner HTML or text content or any number of these APIs and they're all doing roughly the same thing underneath the hood, likely using the same primitives, but there are varying levels in this abstraction. I'm here to encourage you to use text nodes because that's the lowest level of the abstraction in terms of what we can access in JavaScript. It's easy to see why text nodes are faster 'cause they do less.
They don't normalise adjacent text.
They don't remove elements when you set them. They just kind of show text inside of another element. They're singular in their purpose.
This Missy Elliott joke is also a description of a text node API.
You can create a text node and you can see it some text. You can insert a text node into the DOM optionally before another element, same as the elements API. Then you can update its nodeValue, which is the text that it is rendering.
You can see, there's no parts there.
There's no HTML parts incurred and there's nothing here that's going to delete surrounding elements or do any normalisation. This is kind of bare bones, I'm specifically avoiding saying bare metal 'cause I've been made fun of for that. (laughing) There's no such thing as bare metal JavaScript. It's easy to see why it's fast though 'cause it's doing one job very effectively. But it's easy to also overlook how much faster it is. This is a performance benchmark of text content versus updating nodeValue.
I believe the bench mark is create an element and then add some text to it and then update that text a bunch.
Very common thing for us to do on the web.
Think of like an SPF metre of counter or whatever. Oftentimes they're updating text in place.
We do this all the time. A clock.
It's 50% faster to roll with nodeValue here. This is not the kind of optimization that we can overlook. It is a low level optimization, so you just might be able to hope that your library of choice is doing this underneath the hood and oftentimes, they are.
But it's something you need to be cognoscente of if you're working with the DOM directly. The cost of text content is high.
It's not unjustified, but it's high.
Second lesson is just don't use DOM getters. Don't use them. I say this knowing that you can't actually avoid DOM getters, they're everywhere. But at least know when you're using them.
Here's how to know when you're using them.
You can use get own property descriptor to ask a given object for the property descriptor for a given property. Here is a kind of a strange conative dissonance. I figured that the text prototype would contain a nodeType property, given that if you ask a text node for its nodeType, it returns three. You would think it would be a property with the value three, that is not the case.
All nodes in the type, regardless of their type, inherit from node and there's actually a getter for nodeType defined on node.
Essentially, you are doing what boils down to, more or less, a function call when you are asking for nodeType.
The thing is we ask for nodeType a lot, especially in Preact.
You think about comparing Virtual DOM to DOM, you are constantly trying to figure out whether something is an element or a text node. That's the first step of every operation, so it matters whether this is fast or not.
Here's a couple different ways to check to see if something is a text node.
First one is the obvious one.
Check if its nodeType's three.
I say slow here, this is obviously all relative, but in regards to the speed between these two, checking for the existence of .splitText, which is just a function that exists on the prototype of text, is much faster.
So if you're working with text you may want to consider duc typing here.
You could even take it a little bit further by using explicit check, and I'll cover why that's better in a second.
But here's the benchmark.
Preact used to use instanceof text to check for things and it's fast, but the thing is it doesn't work across document contexts if you create a text node in an iframe and then pass it to the parent frame, it's a different text constructor so this check fails. Note type is not bad.
It's been optimised a lot, but it's still a getter. You can see wholeText is just a regular old getter. That one's rather slow, same deal for nodeName. But .splitText and it's brethren attribute, which we don't care about in this case, it's really fast. It's just a property access and property access in JavaScript are very fast.
If you have a chance to duc type for this, give it a go. So you know, there's the list.
Third one is avoid live nodeLists.
This is also very difficult, but it's also a case where you just need to be cognoscente when you're doing it. So live nodeLists are kind of like scary arrays that self update when you mutate the DOM.
We've all heard stories of document done all, it's still there whether you want it to be or not. But interestingly there are other live nodeLists that are incedibly commonly used.
There are live nodeLists that are used in Preact, even though I care about this stuff deeply. One that you may be familiar with is child nodes. Here's an example.
We want to delete all the children out of a parent. We want to remove those children from the parent. The common way of doing this would be to do a backwards loop over child nodes 'cause we know that the embassy's are going to shift as we remove things 'cause it's a live nodeList.
Do a backwards loop and for each entry in the loop, remove from the parent.
Seems legit, seems fine.
Problem is we're constantly asking for child nodes. So that's gonna have a performance impact and more importantly, in every iteration of the loop we're accessing some arbitrary offset within the live nodeList that is changing as we are iterating over it. So there isn't a great way for a JavaScript engine to optimise this.
It's not a frozen object.
We can't make assumptions about it.
It's just some pointer to a crazy thing.
A better way to go here would be to use a DOM getter. I'm already contradicting myself.
LastChild, I get to do this twice. (laughing) lastChild is a DOM getter, but in comparison it's the lesser of these two evils.
We invoke the DOM getter until it's exhausted. LastChild will always return whatever the lastChild in a parent is until there isn't one, at which time it will return null.
All we do is we keep sending childs, whatever that is, and then calling remove child. We don't have to worry about crazy embassy's changing, all this kind of stuff. We just check it until it's null and we're gonna store the value instead of accessing it twice because performance. This is much faster.
This benchmark doesn't do a good job of describing that because there is a boat load of setup overhead. I have to create 1,000 elements before deleting them and delete is less expensive than create, so just keep that in mind.
But lastChild, this kind of simpler approach that is actually less lines of code interestingly, is consistently faster, so keep that in mind. Now I can hear all of you groaning, thinking, "Jason, this si ridiculous.
"I don't do my own DOM manipulation.
"I have people for that." (laughing) And I will say to that, good on you.
You're at a good point in your career.
(laughing) So to you, I have some generalised measurement tips. A little bit more abstract, not specifically about the DOM, more about JavaScript.
The first of these is dead obvious.
Some of you might live in Chrome dev tools at this point, I certainly do.
My favourite view is the timeline though and this is not only because it's a tiny copy of Sam Siccone that you can ask to run a performance test on a website at any time, including when it is 3 a.m. in San Francisco, but also because it is the best tab from Chrome dev tools to show in a performance slide because it is pretty. In addition to that though, it's also really useful. This is a flame chart essentially and so the example I'm showing here is Preact showing a benchmark called VBMon, which is essentially like a giant excel style spreadsheet. It just keeps rendering over and over again and sending a bunch of textNodes 'cause that's really important. I can look at this and see how long does it take to do a top level render, full top to bottom render of this excel spreadsheet thing, giant table. I can also see how consistently I'm hitting that benchmark. 3.81 there, lines look all roughly the same. This is not an exact science, but it's a really, really useful metric to have. It's especially useful because you can do this in production.
You don't need to have your code instrumented to use this tool.
You can mark, you can add markings and that kind of stuff and make your life easier, but you can do this to any production website in the wild, including YouTube. The next one would be IRHydra, by Mr. Aleph, whose name I can finally pronounce.
IRHydra is a way more low level tool, but also incredibly useful.
It helps us answer the question, when my JavaScript runs using actual types that I pasted in production like settings, how is the JavaScript engine going to optimise my code? The JavaScript engines are not interpreters anymore, they are essentially compilers, just in time compilers. There's different pipelines that things fall down through, and this is a tool that in the context of Chrome, will show you how it is going to compile your JavaScript code.
It will show you, for any given code path in your application that the generated code that comes out of this from the compiler is going to contain deopts. Deopts are essentially when it gets type information from your code and it generates optimised source code for a given code path, there's gonna be situations that it can't account for.
Let's see a function that normally accepts two arguments of the type string and then all of a sudden at run time we pass it one argument as a number.
It didn't account for that.
The bite code that it generated didn't account for that. It's gonna have a builtin bail out to essentially say, nope! You're going back on optimised land.
Then it may reoptimize again in the future, which would be fairly expensive, but it's essentially gonna kick you out.
That's what a deopt is.
This will show you how and why your code deopted. It will mark a property access and show you this property access is more effective so it is going to deopt. It will mark, I don't even know which one this is, but this looks like a soft deopt, so if it hits this point in the code you're gonna have essentially a performance breakdown there. This is really, really useful especially if you're doing low level profiling.
If you're on OS10, this is a git.io URL that is a description of how to run Chrome with like 17 different command line flags.
That will generate a dump file that you can upload to IRHydra and analyse the source.
Same sort of deal on Windows.
Same flag, just a different way to run it.
But if you're curious, go to that URL.
The last tool is ESBench and this is totally a plug. ESBench was the first app I built with Preact, which is probably unsurprising given Preact's focus on performance.
It's also very simple.
It is an interface in which you can access Babel and Benchmark.js.
It takes the performance tests that you write, it runs them through Babel, and then it runs the compiled source for that through Bechmark.js which is a lovely benchmarking tool for JavaScript. I did not write my own benchmark tool.
I really shouldn't, it's like crypto.
This is a really useful tool though because if you are considering changing a function or you want like three different versions of a function and you want to know which of these is fastest, ESBench is a way for you to just run them all, pass them some data, and get back a graph telling you which if them is fastest.
That's super useful.
It's also really important to note, when I'm talking about micro benchmarks, not specific DDS bench, not even specific o JavaScript, it's very, very easy to test nothing.
As it turns out, nothing is super fast.
It's like the fastest thing you can do.
It's really easy to get duped by this and be like "Oh man! I found the fastest implementation function ever! "It's just an empty body." (laughing) If that's your job, awesome.
But in a real world setting, I would encourage you to try to pass the functions that you're profiling real data.
Come up with some sort of function that generates variable real data that may ne different every time it generates in some strange way to try and fool the JavaScript engine.
Because the JavaScript engine is smart and it will outsmart you and your benchmark. If you do something that is static ally analyzable, it will statically analyse it and then you're gonna get a code path that is nothing like you're gonna see in the wild and you're testing something that is essentially wrong.
The other thing is if you have a function that has a return value, do something with the return value. Let's take for example the h function that we saw earlier. When I optimise that in Preact, I think that's actually what this is, when I was optimising the Preact I actually wrote a simple DOM stream renderer and I used that in the test to make sure that when I optimise h, I'm not doing it at the expense of anything that's using h's output. When you're trying to benchmark, you need to be holistic about stuff.
Okay, so couple lessons. The first is be explicit.
I touched on this earlier, but it's really important. If you want to do something in JavaScript and you can tell the JavaScript engine more about what you want to do, do that.
I this case, if we want to do an existence check for the property props, you could do the first option, but you're also checking is it not zero, not an empty string, null, undefined, false, etc. There's a lot of those. That would work.
If you actually wanted to check if it was true V, then this would be very logical.
A better option if you want to do an existence check is to check if the property when accessed is not undefined. You know essentially you want to check if obj.props is defined, so check if obj.props is not undefined.
Circular logic.
But this is way faster 'cause it doesn't have to cast it and check if the value is true.
It's doing less work because we told the engine more about what we wanted to do. Another one of the inline helpers, my point here is just to avoid these.
We used to have a function at Preact called hook. You essentially pass it an arbitrary object, an arbitrary method name, and some arbitrary arguments. You think about hoe JavaScript engines work, identifying patterns in your code and creating hot code paths for them, this is horrible for that.
It's just arbitrary behaviour lumped into a line function. It deopts almost immediately and is a source of slowness. We removed this and just turned it into method polls and it was a 10% speedup over all in the library, which is pretty awesome.
Also, just a little but easier to understand. You don't have to have the weird A, B, up to two arguments thing.
Another one would be short circuiting. I have a quote here. The cheapest function call is the one you never make. This is true for any language.
This quote is by me. I am quoting myself on a slide. (laughing) An example of this in Preact would be if you have an element in the DOM that has one text child, such as an element that contains text and you have a new Virtual DOM renderer that contains one text child.
It's gonna short circuit the entire rest of the diff, blast that text update out to the DOM, and early return. That's a huge win because it avoids recursion, it avoids function calls and checks.
It avoids calling these getter methods that we talked about. That's a huge win.
It's probably the single largest optimization that has ever gone into Preact. That's kind of like the easy pass.
If you're doing optimization, you should always have this in the back of your mind.
So all this is to kind of make the point, we should be trying to make decisions based on data. We need to collect better data through better benchmarks and better performance tools, techniques, and tests and then act on that data by setting better goals for ourselves and for our code bases and our coworkers even. Because remember, we all have a shared goal of creating a faster and more widely accessible web because that's how we provide better experiences to frustrated users who are stuck looking at that. Thanks. (clapping)
(synthesiser outro)