Keep Your Errors Close
(peppy music) - Oh dear, alright, hello.
Well I guess I'm not gonna introduce myself. Hi, so yeah, today I wanna talk about errors, because frankly it's something we all deal with a lot. They're rather, alarming and annoying and all the other things, so we really wanna try and get on top of them as best we can.
So yeah, I know there on the trap it says Dan Cabbage, but there will be absolutely no FP in this talk, I promise, cool.
So, question time, I wanna kick this off with a question. Can you please raise your hand if you've seen this at all recently? Or seen this at all recently? Or my favourite, seen this at all recently? Anyway, so yeah, these suck.
I'll get into exactly why they're bad, but for the moment, just we all know they're kind of awful. And unfortunately, the worst bit is, the error message doesn't even tell you anything particularly useful. It exploded somewhere, it's not really, when you get this, it's not a useful pointer to where the thing actually went wrong.
So a really quick example, and all the examples I have in this are as micro as possible, they may be oversimplified, but they all illustrate the points I'm trying to get to.
So for example, we've got a really simple and silly function called frankSummary that basically defines frank as undefined, and then immediately goes and tries to use frank's name.
This isn't gonna work out so great, and what we've got is we have the bug, such as it is in this particular example, up on the line here, and then basically, immediately explodes below.
So if we think of our code base as effectively a whole bunch of interrelated modules or functions or whatever little units, they're basically things calling things calling things calling or whatever, or making use of, however you wanna think of it, and so what we've got here is effectively a little self-contained thing.
This is a module that in itself is well it's never gonna work to start with, but it's basically self- contained in the fact that it's got the bug, and it's also got the bit that goes kaboom. So we've got other examples, where you don't really need to dig into the, I've broke my 12 lines at most rule in this one, I'm very sorry, but we've got three functions, we've got a summary, which prints out user details, this is like frankSummary before, we've got something that just goes through and maps the over list users, and something that does some ajax-y, whatever, you can fill in the blanks here, but basically something that makes use of these functions. So for example, we fetch some users, and then over the list of things we go and get the stuff we want from the API to spit out a user, and then we go and give it to the listUsers function, which then, listUsers function is gonna make use of the users to use the summary, and so we use the summary function, and then we'll go user.name, and well there's kind of a bit of a problem here.
The problem is I've got which isn't exactly returning the new user, it's just sort of declaring an object, and then leaving it on the floor and kinda just sorta walking off, so our bug is down here, but our actual explosion, our undefined property something something something is all the way at the top here, so effectively what we have is that these are across the same file, but you can think of across these different units as functions or modules or whatever, so we've got this module that depends on a module, this function that depends on a function that depends on a function, and the bug is here, the bug is here, and the actual explosion is all the way over there, so the real reason why these unrelated errors kind of suck to deal with is because of this distance in my opinion, so what we, in order to have a nice error experience, which, it's never gonna be nice, but nicer, it's always like an incremental, asymptotic kind of thing, but our goal here, our first goal I would argue is kind of proximity, or accuracy if we wanna try and give it a different name, where we've got the source of the error as close as, the error part as close as possible to the source of the error, so we can act on it, as opposed to going on a bug hunt, and doing, and following the trail of bread crumbs that may lead into a wall or a whatever, it's much nicer when you say the error is here, here is also the bug, or it's right next to it, or something similar.
So second question time, cause I love asking questions, who's written any tests that actually didn't catch the, catch bugs? Basically you've written a test, and later on you've found a bug in that code? I mean, I do it all the bloody time, so this is also a pretty common case.
Now let's think about unit tests.
Again, very simplified thing, but we've got a function here that just adds some numbers up, and we've got a test that is good for proximity, but it's, it's not covering all the bases.
We're testing, when we sew up numbers, when we got one, two and three, fantastic.
What happens when we get an empty list? We didn't think of that in the original test. Congratulations, kaboom, because we forgot to give an argument to reduce, like at the end, very simple example, but it's very easy to have a bunch of edge cases, where you've got code coverage on it except bang.
So the idea is what we wanna try and do with out error mitigating techniques, is to try and get as much coverage as possible, and I don't mean just in the test coverage kind of sense, I mean as in trying to discover as many errors as possible. So third question time, third and final, I promise. So who has a test that takes forever to run. I'm going to put up both my hands and a foot. (laughing) It's, the problem is not only this distance, but it's also this distance, where the person typing, and the thing going wrong, you want as short a time as possible, so argue where I'd say the third goal here is actually speed.
So that's the problem, and ways we would like to, things we would like to do about it.
So how do we start to address this? We're never ever gonna totally fix this, although pretty sure we all have a job, but we've got a bit of a toolbox we can deal with.
A catalogue of tools.
So let's step through them.
So first we've got tests.
We've already sorta kind of covered this, in a way, so the test where you effectively like double entry bookkeeping, you write the thing, and then you write the thing that asserts how it behaves, and if the two line up, congratulations you're off to the races, otherwise you go and fix one or the other. And so, the accuracy of tests, I'd argue, or the proximity depends on the kind of test.
Unit tests tend to be pretty accurate, in the sense of you've got a bit of code, and a test that is intentionally targeting that bit of code, and if something breaks, well, it's in this function, I mean the functions shouldn't be super long, hopefully. But it's reasonably targeted, as opposed to, a ran test here and something else broke.
It's, it can still happen, it's just unit tests are more accurate, as opposed to say integration tests, where you make a browser where, make a browser, hit an endpoint, and then it runs a swathe of code, and something wrong has happened in there, but in the same way, the visible error might be caused by something deep in it, so we've got an incredible variance in how accurate tests are, in terms of the proximity, like finding errors, bugs and errors being close, sorry the actual source and the error report being close to each other.
So coverage, lots of unit tests will get you decent coverage, that's also an unfortunate thing. You need to write scads of tests to get decent coverage. You need to write lots of unit tests, and you can't write enough integration tests to cover the state space that integrate, that you need to deal with as a Cartesian product of product of procut of product going on.
When you've got an integration test, you're not going to cover all the parts, so that's kind of a bit of a crapshoot, you gotta try some happy path ones, and some sad paths and hope you've got enough representative cases, but with unit tests you do need to write a bunch of tests, a bunch of tests to make sure you've covered a lot of the bases, which is part of kinda why tests are also annoying to deal with, we've really, it's really good that we've gotten to the point where we have as an industry, where writing tests is something that people do, for something like, it's definitely something there's been a very gradual thing, but it's now a case of yeah, we write tests for pretty much most things or we try really hard.
Anyway, so unit tests can be fast to run, which is good, like Garin Bernhardt famously wants his test to finish by the time he's finished printing the key, which saves the file, integration tests can be really slow to run, that hour long CI build running on the server, standing up pretend browsers, doing visual regression tests, again incredible variance, so tests can also be really slow to write and/or maintain, if you wanna maintain that coverage, the ability to discover those errors.
I said last question before, but who's felt reticent to actually make larger changes to a code base on account of you'd need to update a scad load of tests when you go and do it? This is me yesterday and the day before and, I started a new job recently so, before that I was on holiday, so I was footloose and fancy free.
Anyway, so Linters is a different kind of category of tool. Arguably trying to do a different thing, still roughly trying to achieve a similar goal, discovering errors, but we've got things like eslint, that you may have used before, and sonar, and there's generally stuff along those lines, and the idea being is they're basically trying to analyse the structure of the code, but that's about as far as it goes, and if it tries to run it, it's very static, and it also doesn't, it's really locally, it's very focused, it looks at small pieces, this bit looks like a bad kind of thing, this thing may look like it might explode later if you were to run it, or has an anti-pattern, and so it tends to be the community that comes up with, this is kind of a bad-ish thing, and then you can choose to flip them on or off or whatever.
So in the case of eslint, this handle callback error, which is one of the defaults in eslint recommended, I think, but it's basically hey, we haven't handled the case of when error is frowned here, like this could be bad if the file doesn't exist, or whatever other error.
So, linting, accuracy, the proximity is good, as in again, it tends to be trying to focus, it's trying to basically, the stuff it points out tends to be close to the source of the bug. It's like, I'm guessing this thing is gonna cause a problem for you later, and it's pointing directly at the source, where it's actually a bug is then the thing you start toggling switches, like no, linter, go away, I know what I'm doing this time, but it's an attempt, and so coverage is not fantastic, it's neither, it's not terribly smart, and it's, it can't figure out a lot, and so therefore it's again a bit of a crapshoot in a way, but it does cover potentially a different set or overlapping set of stuff as tests to a very limited extent, so speed, because they're not actually doing all that much, they're pretty damn quick.
It's not super smart, doesn't have to do much, can run if you've got it set up for your IDE or whatever. So another category of stuff, one of my favourite ones, type-checkers, or we can almost think of them as like super-linters, in the sense of they, yeah they're like really high-powered linters. And it's basically linter to typechecker to whatever is kinda like a spectrum more than a hard line somewhere, okay I'm not gonna go into it. Alright cool so we've got TypeScript and Flow as the two headlining Google closure compiler is also in there somewhere, I have never used it, I'm sorry, I can't speak to it at all, but very simply, TypeScript has gained more and more mainstream adoptions since I gave my talk two years ago so I'm not gonna spend very much time at all explaining how TypeCheck and Flow works at all, but quick representative example, really simple function, we'll get to something more complicated in a tick, where we've got a function that takes two inputs, and we describe to the computer, we basically, it's a conversation with the computer, we say that these two inputs should be numbers, I assert that they're numbers, and that their return result is a number, and then we go and check the input and the function matches up with the types.
Basically it's again another double entry bookkeeping. They're like tests, but they're like embedded in the structure of the code, and arguably, a lot cheaper, but they cover less in some cases and more in others. You cover more state space, but then there's a whole bunch of stuff you do, it's basically, these are sort of overlapping venn diagrams.
I forgot to put a venn diagram in this talk, so you just have to imagine one, but they're basically, these three tools are kinda overlapping, and sort of occur in similar-ish ways.
So a more complicated example is, and please don't feel the need to scan this entire thing, but basically we've got a similar thing going on, we've got something which is fetchPostwithComments. And what we do is we take the id of the post and how many comments we wanna limit the fetching to, and the entire function, because it's an amazing thing, is gonna return a promise of either the post with comments or null, and literally neither, you're not gonna be able to pass, you're not gonna get an undefined or a false out of it, it's literally these two types or nothing at all, and TypeScript and Flow are gonna be checking that for you to make sure you've honoured the contract that you've set out.
It's basically, you tell the computer, I want this, these are the constraints I wanna apply to this thing, and the computer says, yes you did that, or no you screwed it up, which is incredibly useful as a tool, cause it allows us to say things and then have the computer automatically go and check a scadload of stuff for us.
So in this case, we've got a return contract, and in effect it's saying that yes, the computer says yes, you're returning the post with comments, or the null.
So the accuracy is pretty good.
You encode assumptions into code as you're writing it. It's all in-line, you're stating assumptions to the computer and getting it to check it, and it's very sort of local, kinda of close proximity level, you're defining the shape of particular functions and so if stuff doesn't align you're gonna find out pretty quickly that it's inside that box.
There are a bunch of stuff that it has trouble, because you're speaking to the computer in a particular language, TypeScript is almost like a language that you're describing the shape of functions and things to, there are things you can't easily say with that language, or things where it will allow through that you didn't really intend, but it's overall, it's pretty good. So one of my favourite things, as in, of the javascript tooling I have available to me, things like TypeScript and Flow are arguably my favourites.
We've got coverage, it's reasonably good, but sometimes you need to meet it halfway in the sense of, you need, it's pretty good but there are cases where you can aid the compiler in checking your code by designing types in a particular way, for example if you got something as string that needs an id, a string that means a name, then if you want to not get them confused, then you design a separate type such that the computer can tell them all apart, and tell you when you've mixed it up, it's kind of a two way conversation. So sometimes it doesn't cover stuff, and that includes, well, one example is in JavaScript is the not a number number.
Not a number is a value of a type number and so will be treated the same as zero, one, infinity, and minus seven. And so you can, this stuff happens where say we get a negative number into a place where we expected a positive one, you end up with you didn't do a math operation on it, that comes out with not a number, then you do a further operation on it, now ieee754floats were designed to sort of propagate errors like this, it's just unfortunate, cause it means that you end up with error values being the same type as the real values, and you can't distinguish them with the type checker, you can't use the type checker to check any of this stuff for you as opposed to having like false or a post with comments, you end up with just, it's just a bucket of numbers, and one of them's gonna be wrong, and you're gonna have to keep checking everywhere to make sure you've got, you've handled it correctly. And there's other stuff like where TypeScript itself, come back to this, will allow you to do stuff like this, and it's just totally fine, even though it's always gonna be wrong, so about the tool, further. Any using, back to error discovery, using any instantly, like the escape hatch, it's the trust the human, they've got it right, so immediately anything where you pass it any, suddenly just all bets are off, TypeScript is effectively not helping you.
So to speed, it's not lighting fast, but it's also not dog-slow, so we've used Flow at work, for example. We've got about 400-ish files, so I made a change to an icon one of the core files in the middle of the code base that a lot of stuff relies on, there's about 120 different references to it, I'd made some intentional type errors, and it got back to me with 80 errors, a nice long to-do list of stuff to fix, but mostly one line of changes in about three seconds, which okay, that's not send the unit, run the unit test for 15 minutes, and it's not instant feedback from a linter.
I'll take this kind of trade off.
As far as actually writing them goes, in terms of speed for that, it's pretty good once you learn how to use the types.
Sometimes you do end up fighting a tool, because again the language doesn't let you describe things that you'd want, and so therefore you end up with more constraints on something, where it's like, I know it's gonna break, I need to turn on the, I'm a human just trust me, I'm a human who knows what's going on kind of switch, inevitably whenever I do that I shoot myself in the foot, but we you know, moving right along.
But I personally find, after spending five years writing tests, and then five years writing different type languages, that I find changing types is less of a time sink than piles and piles of tests, such where I have fewer, I tend to have fewer tests and a lot of types, and I find that balance personally works for me in terms of being able to radically shift the direction of a programme without going oh god I have to update a whole bunch of copy pasted, cause inevitably tests aren't dry intentionally. A whole bunch of tests that basically cover a small amount of code, so me personally, I like this trade off, may not work for you, it's totally fine.
But yeah, so there's a bit of a mash up one going on here as kind of an extra on the end, and that's type-aware linters, so you've got for example tslint, and SonarTS, and so for example, things it can check are linting like stuff, but with some more type information, so for example you've got something that lets you check that you used a weight correctly, because if you, you can weight an object and it immediately it gets promised, not resolved, and then it's like, it gets automatically wrapped into a promise, which is probably not what you want most of the time, so you can tell it that no, actually you should really, it's a weight, only promises and nothing else.
TypeScript doesn't help you out, but type-aware stuff that uses TypeScript's engine, plus eslint-ing's kind of style rules means that you can get that sort of information and find the bug before it kicks your ass. So await promise from tslint will do that, SonarTS is also an alternative if you don't wanna use something from Palantir, but bringing it all together, we've got the function that we described earlier just in broad strokes, you've effectively got types to sort of cover the imports and outports and user different functions, and it's got your back in that regard, you've got regular linters that will check sort of little furfies like you didn't use the triple equals or other sort of stuff like that, type-aware linters that will let you check more stuff with type of information available to you as well, such as I awaited three instead of a promise that returns three, and yeah and then you use tests to make sure the entire thing actually works, that it's actually, it actually works when you run it, that you got the coverage test coverage of both branches, and it's kinda like the triple combo of these tools that kinda makes it work, I really like the combination of these, it's not types or tests, it's types and tests and kinda things that are kinda like types or linters and just mashed in a big pile, use it all, please tell me when I've screwed up, cause I do it all day long, every day. So wrapping up really quickly, cause I'm just 10 seconds over, we're after accuracy, we're after coverage, and discovery of errors and we're after speed. Okay, sure.
(laughing) And we can do that by using tests, which are sort of variable accuracy, coverage and speed depending on how you decide to use them, you've got linters which have pretty decent coverage, sorry pretty decent accuracy, locality, but it doesn't, the coverage isn't super great, and they're pretty damn quick. You've got type-checkers, which have pretty good accuracy, good but sometimes variable coverage, depending on what kinds of things you're trying to catch with it, and yeah, reasonably decent speed.
You've got typed linters, which are basically linters plus as a nice sorta combo, still low to medium coverage, but I would give them a serious look, anyway, that's my talk.
Thank you very much.