Function composition: What’s the big deal?
All right.
Hi there.
My name's James.
I work for a little company called Atlassian and in my spare time, I like to write about JavaScript, but today I'm here to talk to you about function composition.
And that might not sound like the most exciting subject in the world.
I mean function composition is not going to be the next viral Tiktok sensation.
But if you talk to functional programmers, well, then it's a different story.
The way they go on, you'd be forgiven for thinking that composition was some kind of divine truth etched on stone tablets, by the very hand of God.
Or perhaps some sort of AI powered tool that writes code for you.
But it's not that.
Which raises the question.
What is it, then?
What's this wonderful thing called "composition"?
And what's so special about it?
Well, to answer that we are gonna have to do something that's a little bit scary.
So brace yourselves, we're going to dive into the dark arts and do some mathematics.
So here's the equation for composition.
And this one little equation explains everything there is to know.
It reads h of X equals f of g of x.
And that one tiny equation captures all there is to know about function composition.
If you get this, everything else is just implications.
So I'll break it down piece by piece.
On the right-hand side of that equation, we have two functions, f and g. We're talking about mathematical functions right now, but we can kind of, sort of think of JavaScript functions as equivalent.
Now when we compose two functions together, we get a new function and we're calling this one h, and if we give h some value x, then we get back a new value.
Now, to work out what that value is, we first pass our value x into g, because we work from the inside out. Then we take the value returned from that and pass it into f.
And that gives us our final result: h of x.
And I get it.
All of this is kind of abstract.
But the key point here is that we can create a new function h by combining two other functions: f and g.
And once we have that, we can treat it like any other function.
It's no different from f or g. It doesn't need any special treatment.
It's a plain function, just like any other function.
We don't need any special tools to do this.
It's kind of built into how functions work.
And mathematicians, they have a special notation for describing composition.
They use this thing called the dot operator, which looks a little bit like a bullet.
So we'd write the composition of f and g as f dot g.
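The equation on the slide isn't part of this transcript. Written out, the notation described here would look roughly like the following, assuming the usual textbook convention of typesetting the operator as a small ring (the "dot" or "bullet" mentioned above):

```latex
h = f \circ g, \qquad h(x) = (f \circ g)(x) = f(g(x))
```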
And that's the mathematical explanation.
And by now you're probably thinking, well, well done, James, you've done a great job of demonstrating that composition is both boring and kind of obvious.
I mean, most of you will probably be intuitively familiar with how functions work.
None of this is particularly new or mind blowing.
And what does any of this have to do with JavaScript?
Well, like mathematics, composition is built into JavaScript functions too.
So for example, if we have two functions, f and g and they both return values, then we can compose them together using a similar syntax.
So here we have const h equals a function that takes some value x and returns the composition of f and g.
Now this here, what I've put on the screen, is not valid JavaScript, though I wish it was.
JavaScript won't let us define a bullet operator for composition, but what we can do is make a function that does roughly the same thing as what that bullet would do.
You see, one of the neat things about JavaScript is that it lets us pass functions around as values.
So we can create a function that does composition for us.
We'll call it c2, short for "compose two functions".
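The slide with c2 isn't reproduced in this transcript. A minimal sketch of what such a function could look like, assuming it takes the two functions as arguments and returns a new one; `double` and `increment` are hypothetical helpers added only for illustration:

```javascript
// Compose two functions: the returned function calls funcB first,
// then passes that result to funcA (inside out, like f(g(x))).
const c2 = (funcA, funcB) => x => funcA(funcB(x));

// Hypothetical example functions, not from the talk.
const double = n => n * 2;
const increment = n => n + 1;

const doubleAfterIncrement = c2(double, increment);
console.log(doubleAfterIncrement(5)); // → 12
```

Swapping the arguments changes the result: `c2(increment, double)(5)` doubles first and then adds one.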
And at first this might seem like a pointless kind of function and you'd be right.
It is a little bit pointless, but bear with me.
Let's see if we can do something with this composition function.
So.
What we're gonna do is do something real.
Let's imagine for a moment that we are writing some kind of comment system, like maybe say hypothetically, just at random, you work for a company that makes some kind of project management software, you know, and maybe people could say create tasks to track what they're doing, and they can leave comments on these tasks, just picking something, just totally out of the blue.
So with our comment system, what we wanna do is we wanna allow people to include images and links in their comments, but we're also concerned about security.
So we don't wanna allow people to just write any old HTML.
So to make this happen, what we're gonna do is we're gonna support a cut down version of markdown.
So in our cut down version of markdown, we allow people to write an image that looks like this.
So we've got an exclamation point followed by some square brackets, then the alt text, and then some round brackets to describe the path to the image.
And a link looks fairly similar, but there's no exclamation point.
So we've got the square brackets followed by the link text and then round brackets for the URL.
And we'll write a couple of functions that convert this kind of syntax into HTML.
So.
We've got two functions here, one for images, one for links.
And each one looks through the whole comment, finds the syntax we've specified and replaces each occurrence with the relevant HTML.
Now I know regular expressions look a bit scary and these ones are particularly indecipherable because of the fancy font I've chosen, but don't worry too much about the implementation details.
See, regular expressions aren't the point. They're just for demonstration. The only thing you really need to understand is that they both take in a string and give us a string back with some adjustments made.
That's, that's all.
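The two functions on the slide aren't captured here. As a rough sketch of the idea, assuming the function names `imagify` and `linkify` used later in the talk; the regular expressions and the exact HTML output are assumptions, and this is not a hardened sanitizer:

```javascript
// Sketch: convert ![alt text](path) into an <img> tag.
// The character classes exclude quotes and angle brackets so that
// captured text can't break out of the generated attributes.
const imagify = str => str.replace(
  /!\[([^\]"<]*)\]\(([^)<"]*)\)/g,
  '<img src="$2" alt="$1" />'
);

// Sketch: convert [link text](url) into an <a> tag.
const linkify = str => str.replace(
  /\[([^\]"<]*)\]\(([^)<"]*)\)/g,
  '<a href="$2">$1</a>'
);

console.log(imagify('![A cat](cat.png)'));
// → <img src="cat.png" alt="A cat" />
console.log(linkify('[Docs](https://example.com/docs)'));
// → <a href="https://example.com/docs">Docs</a>
```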
So getting back into our c2 function, we can combine these two functions like this.
So we pass linkify and imagify into c2 and it smooshes them together so that we get one new function, which we've called linkifyAndImagify.
Now, if you were to write that same thing out by hand, it would look like this. And truth be told, our c2 function isn't actually saving us many characters here.
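Neither version is shown in the transcript. A sketch of the comparison, assuming the `c2`, `imagify` and `linkify` definitions below (the regular expressions are assumptions) and using `linkifyAndImagify` as an illustrative name for the combined function:

```javascript
const c2 = (funcA, funcB) => x => funcA(funcB(x));

// Assumed implementations; only the string-in, string-out shape matters.
const imagify = str => str.replace(/!\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<img src="$2" alt="$1" />');
const linkify = str => str.replace(/\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<a href="$2">$1</a>');

// Composed with c2: imagify runs first, then linkify.
const linkifyAndImagify = c2(linkify, imagify);

// The same thing written out by hand.
const linkifyAndImagifyByHand = str => linkify(imagify(str));

const sample = '![logo](logo.png) and [home](/home)';
console.log(linkifyAndImagify(sample));
// → <img src="logo.png" alt="logo" /> and <a href="/home">home</a>
```

The order matters here: if linkify ran first, it would also match the bracketed part of the image syntax.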
And it gets worse if we try to add in more functions.
Like, suppose we wanted to add in support for emphasizing with underscores.
We could write another function like this one, it's nothing fancy, just another regular expression replacement.
It looks for an underscore, followed by a bunch of stuff that's not an underscore followed by an actual underscore.
And then it wraps whatever's between the underscores in em tags.
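A sketch of an emphasize function matching that description; the exact regular expression is an assumption:

```javascript
// Wrap whatever sits between a pair of underscores in <em> tags.
const emphasize = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

console.log(emphasize('This is _really_ important'));
// → This is <em>really</em> important
```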
Now we could take that emphasize function and we could add it in with another c2.
But what we have to do is create a new function using c2 on the inner part.
And that gives us a new function back, which we pass straight into another c2, where we combine it with linkify.
And that gives us our single processComment function.
Lots of smooshing going on.
Now, if we compare that with writing the composition out by hand, the hand-composed version is still longer, but not by much.
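A sketch of the nested c2 call next to the hand-written equivalent. The helper implementations are assumptions, and `processCommentByHand` is a name made up for the comparison:

```javascript
const c2 = (funcA, funcB) => x => funcA(funcB(x));

// Assumed implementations of the three transformations.
const emphasize = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');
const imagify = str => str.replace(/!\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<img src="$2" alt="$1" />');
const linkify = str => str.replace(/\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<a href="$2">$1</a>');

// The inner c2 combines imagify and emphasize; the outer one adds linkify.
const processComment = c2(linkify, c2(imagify, emphasize));

// The same composition written out by hand.
const processCommentByHand = str => linkify(imagify(emphasize(str)));

console.log(processComment('_hi_ [a](/b)'));
// → <em>hi</em> <a href="/b">a</a>
```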
And going back to c2 for a moment, you can see how, if we were to keep adding functions, we're gonna have a lot of c2s and a lot of brackets all over the place.
So, what would be nice is if there was some way we could just chain a bunch of functions together without all those annoying brackets and commas everywhere.
Kind of like that imaginary bullet syntax I showed earlier. If we wanted to compose more than two functions, we'd just add them to the end with another bullet.
And then it would work just a little bit like doing addition or multiplication.
But alas JavaScript doesn't let us.
There is however, a TC39 proposal that might let us do something similar in future.
It's called the operator overloading proposal.
And it's very interesting.
I suggest you check it out.
In the meantime, however, we don't have operator overloading, so we're gonna have to make do with what we have. And what we can do is multivariate composition.
We can create a function that does composition for us.
Now "multivariate" is just a fancy word that means involving multiple quantities.
So a multivariate function is a function that takes a varying number of parameters.
And in JavaScript, we can create a multivariate function using rest parameter syntax.
So have a look at this function definition here.
So we're creating a function called compose and instead of a list of parameters, we've written three dots followed by funcs.
Now, if you're not familiar with the way rest parameters work, all that's going on is those three dots tell the interpreter, just shove however many arguments we get into an array and call that array 'funcs'.
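To see rest parameters on their own, here is a tiny sketch. `collect` is a hypothetical function that simply hands back the array the rest parameter produced:

```javascript
// The three dots gather every argument into a real array named funcs.
const collect = (...funcs) => funcs;

const gathered = collect(Math.abs, Math.round, Math.sqrt);
console.log(Array.isArray(gathered)); // → true
console.log(gathered.length);         // → 3
console.log(collect().length);        // → 0
```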
So JavaScript gives us the ability to create a function that takes different numbers of parameters and that's useful.
But before we go any further, let's stop and have a think about what we're doing.
We wanna take a list of functions and smush them together so that we get one new function and that new function is gonna take a single value as its input.
And it's gonna pass it to the first function in our list.
Oh, actually, the last. Then it's gonna take the result of that, pass it to the next function, take the result of that, pass it to the next function, and so on.
In other words, we're gonna loop through our list of functions and we're gonna carry a little bit of state with us as we go.
Which sounds a lot like a 'reduce' operation to me.
So if we put that all together, we can get something like this.
We've got this new function called 'compose' and it's using our fancy rest parameter syntax, and it returns a new function, which I've creatively called 'newFunctionToReturn'.
And that function takes a single parameter, x0, and we call reduceRight on funcs to loop through each of those functions, passing it x0 as the initial value.
Now you might be wondering why we're using reduceRight rather than reduce. Well, that's because composition works from the inside out.
We wanna call those innermost functions first and we'll come back to that in a moment.
But for now, the main thing to notice is that we've created a function that returns another function.
And with a bit of tidying, we can get rid of that funny variable name and we reduce it down to this.
It's doing the same thing, just without that newFunctionToReturn variable.
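The slides for both versions are missing from the transcript. A sketch that follows the description above; `composeVerbose` is a made-up name so both versions can sit side by side, and the three arithmetic helpers are hypothetical:

```javascript
// First draft: the inner function gets an explicit name.
const composeVerbose = (...funcs) => {
  const newFunctionToReturn = x0 => funcs.reduceRight((x, f) => f(x), x0);
  return newFunctionToReturn;
};

// Tidied version: same behaviour, no intermediate variable.
const compose = (...funcs) => x0 => funcs.reduceRight((x, f) => f(x), x0);

// Hypothetical helpers for illustration.
const double = n => n * 2;
const increment = n => n + 1;
const square = n => n * n;

// Runs right to left: square, then increment, then double.
console.log(compose(double, increment, square)(3)); // → 20
```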
So let's try out this shiny new function, but first let's increase the difficulty level by adding a new requirement to our comment system.
So what we wanna do is allow level three headings, and we mark a level three heading by putting three hashes at the start of a line.
So here's a function that'll do that for us.
And once again, it's not super important how it works.
It's a regular expression that looks for three hashes at the start of a line followed by some whitespace.
If it finds that, it grabs the rest of the line and makes that a heading.
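A sketch of a heading function along those lines, using the name `headalize` from later in the talk; the regular expression is an assumption:

```javascript
// Three hashes at the start of a line, then whitespace, then the
// rest of the line becomes a level three heading.
const headalize = str => str.replace(/^###\s+([^\n<"]*)/mg, '<h3>$1</h3>');

console.log(headalize('### Release notes'));
// → <h3>Release notes</h3>
```

The `m` flag makes `^` match at the start of every line, so headings in the middle of a multi-line comment are picked up too.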
So we've got four functions to run in our composition.
And we use our shiny new 'compose' function like this.
And to me, this looks rather neat.
We've got a new function 'processComment' that's composed of four small single responsibility functions joined together.
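A sketch of how the four functions might be combined with compose. The helper implementations are assumptions; the order of arguments follows the talk, with headalize written last:

```javascript
const compose = (...funcs) => x0 => funcs.reduceRight((x, f) => f(x), x0);

// Assumed implementations of the four transformations.
const headalize = str => str.replace(/^###\s+([^\n<"]*)/mg, '<h3>$1</h3>');
const emphasize = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');
const imagify = str => str.replace(/!\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<img src="$2" alt="$1" />');
const linkify = str => str.replace(/\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<a href="$2">$1</a>');

// Written top to bottom, executed bottom to top.
const processComment = compose(
  linkify,
  imagify,
  emphasize,
  headalize
);

console.log(processComment('### Hello _world_'));
// → <h3>Hello <em>world</em></h3>
```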
But there is a small difficulty with this function.
And that is, we end up writing our functions in the reverse order of how they execute.
That is we've written 'headalize' last, but it's actually the first function that runs.
And if we map it out, the data flows through this processComment function like this.
So we pass the first value into headalize, then the result of that into emphasize and then into imagify and so on.
So the data's flowing from right to left or bottom to top as the case may be.
And as I've been saying, it works that way, because if we wrote that composition out by hand, it would be the innermost function headalize that we want to call first.
And that's why we're using reduceRight.
We wanna preserve that order.
But if we're gonna write out our composed functions in a vertical list, like this one, there's no reason why we can't create a composition function that composes in the opposite direction.
That way we could have the data flowing from left to right or top to bottom as the case may be.
And it would flow more naturally.
And so we'll call that function "flow".
So to create flow, all we do is we replace that reduceRight with reduce, and it looks like this.
As you can see, the only difference is that reduce method call.
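A sketch of flow next to compose, following the description above. The arithmetic helpers are hypothetical and only there to make the ordering visible:

```javascript
// compose runs its functions right to left.
const compose = (...funcs) => x0 => funcs.reduceRight((x, f) => f(x), x0);

// flow is identical except for reduce, so it runs left to right.
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

// Hypothetical helpers for illustration.
const double = n => n * 2;
const increment = n => n + 1;
const square = n => n * n;

console.log(flow(square, increment, double)(3));    // → 20
console.log(compose(double, increment, square)(3)); // → 20
```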
Now, to show it off in action, what we're gonna do is add yet another requirement to our comment processing system.
We wanna allow for text between back ticks to be formatted as code.
Now once again, we've got a function that does that for us, again with regular expressions (don't try this at home, kids). But all our regular expression is doing is looking for a back tick, followed by a bunch of characters that aren't back ticks, followed by another back tick.
Then it grabs whatever is between those two back ticks and wraps them in code tags.
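A sketch of such a function. The name `codify` and the regular expression are assumptions, since the slide isn't reproduced here:

```javascript
// Wrap whatever sits between a pair of back ticks in <code> tags.
const codify = str => str.replace(/`([^`<"]*)`/g, '<code>$1</code>');

console.log(codify('run `npm test` first'));
// → run <code>npm test</code> first
```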
It's a lot like the emphasize function from earlier.
Now, if we try that out with flow, our new process comment function looks like this.
This time, the data flows from top to bottom or left to right, depending on how you write it.
And we start with headalize and we move to emphasize, imagify.
So things happen still in the same order.
It's just the difference is the way we write them.
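A sketch of processComment rebuilt with flow. The helper implementations and the name `codify` are assumptions; the order follows the talk, starting with headalize:

```javascript
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

// Assumed implementations of the five transformations.
const headalize = str => str.replace(/^###\s+([^\n<"]*)/mg, '<h3>$1</h3>');
const emphasize = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');
const imagify = str => str.replace(/!\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<img src="$2" alt="$1" />');
const linkify = str => str.replace(/\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<a href="$2">$1</a>');
const codify = str => str.replace(/`([^`<"]*)`/g, '<code>$1</code>');

// Written top to bottom, executed top to bottom.
const processComment = flow(
  headalize,
  emphasize,
  imagify,
  linkify,
  codify
);

console.log(processComment('### Use `flow` _now_'));
// → <h3>Use <code>flow</code> <em>now</em></h3>
```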
But if we were to write this out by hand, this is starting to look a lot more clean and simple than if we were doing all that composition with all those brackets.
So indeed I think flow is rather neat.
You can see how concise it is.
We've composed a function from other functions, and there's not a fat arrow in sight.
We're treating functions as values, which is something a lot of people find hard to get used to, but it opens up a lot of possibilities.
And we're gonna talk about that more in a moment, but for now you can see that some people might find this rather pleasant to use.
And because it's pleasant to use we might find ourselves using it to build functions all over the place.
But if we only use some of these functions once we might get a little bit lazy and we start to invoke those functions immediately.
So for example, we might end up writing something like this.
If we just want a single processed comment, we build the function and then we call it straight away with some value.
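A sketch of that immediately invoked style, with the same assumed helpers as before. The function built by flow is called on the spot and never given a name:

```javascript
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

// Assumed implementations of the five transformations.
const headalize = str => str.replace(/^###\s+([^\n<"]*)/mg, '<h3>$1</h3>');
const emphasize = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');
const imagify = str => str.replace(/!\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<img src="$2" alt="$1" />');
const linkify = str => str.replace(/\[([^\]"<]*)\]\(([^)<"]*)\)/g, '<a href="$2">$1</a>');
const codify = str => str.replace(/`([^`<"]*)`/g, '<code>$1</code>');

// Hypothetical input string.
const commentStr = '### Hello _there_';

// Build the function, then call it straight away.
const processedComment = flow(
  headalize,
  emphasize,
  imagify,
  linkify,
  codify
)(commentStr);

console.log(processedComment);
// → <h3>Hello <em>there</em></h3>
```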
And there's nothing terribly wrong with this, but it does look a little awkward.
The largest problem though, is that seeing immediately invoked functions makes some JavaScript developers a little bit nervous.
You see, we've already taken away those comforting fat arrow functions, and now we're taking it a step further and having functions that return functions, which we call immediately.
It all just gets a bit too much.
So maybe there's something we could do to help these folks out.
We do that with yet another composition function and it's called 'pipe'.
Now pipe works a little bit like flow, but we treat the rest parameters just slightly differently.
So here's how pipe looks in code.
I'm gonna flip back to flow for a moment so you can see the difference. Here's flow, and here's pipe again, and back to flow.
Notice how flow always returns a function.
You see, that's why we've got two fat arrows there, but in pipe there's just one fat arrow and pipe takes x0 as its first argument.
It's like we've shifted it, but because of that, we don't have to wait for the returned function to be called with an initial value.
Instead, pipe gets started straight away.
It passes x0 through the list of functions.
And this means we don't have a function we can reuse, but we don't always need one of those.
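A sketch of the two definitions side by side, following the description above (two fat arrows for flow, one for pipe). The arithmetic helpers are hypothetical:

```javascript
// flow returns a function that waits for its initial value.
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

// pipe takes the initial value first and starts straight away.
const pipe = (x0, ...funcs) => funcs.reduce((x, f) => f(x), x0);

// Hypothetical helpers for illustration.
const double = n => n * 2;
const increment = n => n + 1;
const square = n => n * n;

console.log(flow(square, increment, double)(3)); // → 20
console.log(pipe(3, square, increment, double)); // → 20
```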
So now to illustrate how pipe works, let's go onto the next step.
So we've got a pretty good function for processing an individual comment, but what if we want to process lots of comments?
Like suppose we have a big list of comment strings sitting inside an array.
Let's put together some code with pipe that will help process them.
But first, just to set things up, we're gonna introduce a few utility functions.
And the first lot of utility functions are for processing arrays.
They're fairly simple.
All they do is call array methods.
But notice that they all have two fat arrows.
That means that when we call them with the first parameter, we get another function back and that's important.
But they're all reasonably straightforward.
The map utility calls the array's map method, the filter utility calls the array's filter method, and so on.
The only one that's really any different is take and that's where we call array slice with a fixed starting point of zero.
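A sketch of those array utilities, assuming each takes its configuration first and the array second, as the "two fat arrows" remark suggests:

```javascript
// Each utility takes its configuration first and returns a function
// that waits for the array.
const map = f => arr => arr.map(f);
const filter = p => arr => arr.filter(p);
const take = n => arr => arr.slice(0, n);
const join = s => arr => arr.join(s);

console.log(take(2)(['a', 'b', 'c']));  // → [ 'a', 'b' ]
console.log(join('-')(['a', 'b', 'c'])); // → a-b-c
```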
Now while we're at it, let's introduce a few utility functions for strings too.
We have itemize, we're putting a list item wrapper around something.
We've got orderedListify for making an ordered list and chaoticListify for making an unordered list.
And finally, we've got a couple of functions to check that we are not mentioning any super secret company information in a public comment.
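A sketch of the string utilities. The transcript names `itemize`, `orderedListify`, `chaoticListify` and, later on, `noNazi`; the helper `mentionsNazi`, its regular expression and the exact HTML are assumptions:

```javascript
const itemize = str => `<li>${str}</li>`;
const orderedListify = str => `<ol>${str}</ol>`;
const chaoticListify = str => `<ul>${str}</ul>`;

// Assumed check functions: one detects the word, the other negates it
// so it can be handed directly to filter.
const mentionsNazi = str => /\bnazi\b/i.test(str);
const noNazi = str => !mentionsNazi(str);

console.log(chaoticListify(itemize('first'))); // → <ul><li>first</li></ul>
console.log(noNazi('a perfectly nice comment')); // → true
```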
Now imagine we have an array of comment strings.
We wanna filter out any comments that fail that check, grab the first 10, then we wanna run our processComment function from earlier on each of those 10.
Then we wanna format each comment as a list item and finally join everything together as a single string.
Doing that with pipe looks like this.
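A sketch of that pipeline. To keep it short, `processComment` here is a one-step stand-in for the five-step version from earlier, and `commentStrs` is made-up sample data; the utility definitions are the assumed ones described above:

```javascript
const pipe = (x0, ...funcs) => funcs.reduce((x, f) => f(x), x0);

const map = f => arr => arr.map(f);
const filter = p => arr => arr.filter(p);
const take = n => arr => arr.slice(0, n);
const join = s => arr => arr.join(s);

const itemize = str => `<li>${str}</li>`;
const noNazi = str => !/\bnazi\b/i.test(str);

// Shortened stand-in for the talk's five-step processComment.
const processComment = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

// Hypothetical sample data.
const commentStrs = ['Looks _good_ to me', 'Ship it'];

const comments = pipe(commentStrs,
  filter(noNazi),
  take(10),
  map(processComment),
  map(itemize),
  join('\n')
);

console.log(comments);
// → <li>Looks <em>good</em> to me</li>
//   <li>Ship it</li>
```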
Now, if we squint a little bit, this sort of, kind of looks like chained method calls like these here.
And in fact, the way I formatted pipe, they look eerily similar.
Right?
Have a look again.
Here's the pipe version.
And here's the array method version.
Not much difference.
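For comparison, a sketch of the chained array method version, under the same assumptions (a one-step stand-in for `processComment` and made-up sample data):

```javascript
const itemize = str => `<li>${str}</li>`;
const noNazi = str => !/\bnazi\b/i.test(str);

// Shortened stand-in for the talk's five-step processComment.
const processComment = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

// Hypothetical sample data.
const commentStrs = ['Looks _good_ to me', 'Ship it'];

// Each step is a built-in array method, so no utility functions needed.
const comments = commentStrs
  .filter(noNazi)
  .slice(0, 10)
  .map(processComment)
  .map(itemize)
  .join('\n');

console.log(comments);
// → <li>Looks <em>good</em> to me</li>
//   <li>Ship it</li>
```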
And at this point, some of you clever people are probably wanting to scream at me by now, because I said earlier, wouldn't it be really nice if we could have some kind of operator that would let us do composition.
And there is in fact, a TC39 proposal that looks kind of similar to our pipe function and it's even called the 'pipeline' operator.
So we could rewrite our code from above using the pipeline operator like so.
There's a lot I could say about this, but for now, I'm just gonna point out that the pipeline syntax requires that extra hash symbol, which is a little ugly, but not the worst thing in the world.
And sadly it hasn't made it into browsers yet.
So I'm gonna focus on what we can do right now, which is use our pipe function.
And we were comparing it with the array method version.
And after comparing those two, you might be thinking to yourself, well, why bother with pipe?
Why not just use those array methods?
And that's a fair question.
Cause after all, with the array method chaining we don't need to add all these utility functions.
There's no extra overhead of trying to figure out what the pipe function actually does.
The array methods are familiar and they're built in.
But there's something pipe can do that method chaining can't.
Pipe can keep going even when there are no methods to call.
For example, we can add our chaoticListify function from earlier to our pipeline.
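A sketch of the extended pipeline. After `join` the value is a string, which has no list-wrapping method to chain, but pipe simply hands it to the next function. As before, `processComment` is a one-step stand-in and `commentStrs` is made-up data:

```javascript
const pipe = (x0, ...funcs) => funcs.reduce((x, f) => f(x), x0);

const map = f => arr => arr.map(f);
const filter = p => arr => arr.filter(p);
const take = n => arr => arr.slice(0, n);
const join = s => arr => arr.join(s);

const itemize = str => `<li>${str}</li>`;
const chaoticListify = str => `<ul>${str}</ul>`;
const noNazi = str => !/\bnazi\b/i.test(str);

// Shortened stand-in for the talk's five-step processComment.
const processComment = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

// Hypothetical sample data.
const commentStrs = ['Looks _good_ to me', 'Ship it'];

const comments = pipe(commentStrs,
  filter(noNazi),
  take(10),
  map(processComment),
  map(itemize),
  join('\n'),
  chaoticListify // a plain function, not an array or string method
);

console.log(comments);
// → <ul><li>Looks <em>good</em> to me</li>
//   <li>Ship it</li></ul>
```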
If we wanted to, we could keep on adding functions to our pipeline.
It's possible to build up entire applications this way.
Now, compose, flow, and pipe.
They're neat.
They make for some concise code, but someone still might be thinking, "Well, that's nice, but so what? So you can write stuff in a pipeline. Big deal.
What difference does that make to the code I'm writing day to day?" And that's a reasonable question.
After all, we don't need pipe, we can achieve things in other ways.
So for example, we can write equivalent code using variable assignments.
Here's one that does the same comment processing, does the same job, no trouble without any of that pipe business.
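A sketch of the variable assignment version, written as six statements to match the semicolon count discussed next. The intermediate variable names are invented, `processComment` is a one-step stand-in, and `commentStrs` is made-up data:

```javascript
const itemize = str => `<li>${str}</li>`;
const chaoticListify = str => `<ul>${str}</ul>`;
const noNazi = str => !/\bnazi\b/i.test(str);

// Shortened stand-in for the talk's five-step processComment.
const processComment = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

// Hypothetical sample data.
const commentStrs = ['Looks _good_ to me', 'Ship it'];

// Six statements, each storing its result in a named variable.
const withoutNazis = commentStrs.filter(noNazi);
const topTen = withoutNazis.slice(0, 10);
const processedComments = topTen.map(processComment);
const itemizedComments = processedComments.map(itemize);
const joinedList = itemizedComments.join('\n');
const comments = chaoticListify(joinedList);

console.log(comments);
// → <ul><li>Looks <em>good</em> to me</li>
//   <li>Ship it</li></ul>
```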
And for most people, this version is going to be familiar and easy to read.
So by that measure, the pipeline version is objectively worse.
So why would we bother with pipe?
Now to answer that, I want us to compare this version here with pipe, and I want you to notice, first of all, the number of semicolons.
Second of all, we didn't need any array utility functions here.
Now, if we look at those semicolons, we can see that there's six of them in this version.
And the pipe version has just one.
So what does that mean?
Well, it means that the variable assignment version is made up of six individual statements, but the pipe version is a little bit different.
It has one semicolon because there's only one statement.
It's a variable assignment and this is a key difference.
Now it might seem like I'm splitting hairs, after all the compose version still has a bunch of commas, keeping things separate, but there's a subtle, yet important difference here.
In the variable assignment version, we created six statements.
In the pipe version, we composed the entire thing as an expression, which we happen to assign to a variable.
And again, you might ask, "Well, so what?
Who cares?" And in one sense there's no difference.
The two pieces of code still do the same thing.
They produce the same results.
The performance is roughly the same, but something that does change is the meaning of what we're doing.
Now to make this clearer.
What I'm gonna do is I'm gonna convert this code back to producing a function rather than a value.
So if we switch back to using flow, we now have a function called 'processComments'.
And if we wanna update the variable assignment version, it's a function as well.
We just wrapped it in a fat arrow and some curly braces.
And it's now a function.
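A sketch of both function versions. `processCommentsStepwise` is a made-up name so the two can coexist in one file; `processComment` is a one-step stand-in and the sample input is invented:

```javascript
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

const map = f => arr => arr.map(f);
const filter = p => arr => arr.filter(p);
const take = n => arr => arr.slice(0, n);
const join = s => arr => arr.join(s);

const itemize = str => `<li>${str}</li>`;
const chaoticListify = str => `<ul>${str}</ul>`;
const noNazi = str => !/\bnazi\b/i.test(str);
const processComment = str => str.replace(/_([^_]*)_/g, '<em>$1</em>');

// Flow version: defined as a composition of functions.
const processComments = flow(
  filter(noNazi),
  take(10),
  map(processComment),
  map(itemize),
  join('\n'),
  chaoticListify
);

// Variable assignment version: a fat arrow around a series of steps.
const processCommentsStepwise = commentStrs => {
  const withoutNazis = commentStrs.filter(noNazi);
  const topTen = withoutNazis.slice(0, 10);
  const processedComments = topTen.map(processComment);
  const itemizedComments = processedComments.map(itemize);
  const joinedList = itemizedComments.join('\n');
  return chaoticListify(joinedList);
};

const sampleComments = ['Looks _good_ to me', 'Ship it'];
console.log(processComments(sampleComments) === processCommentsStepwise(sampleComments)); // → true
```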
But what changes now that we've done this? The difference between the two is what that equals sign means.
So let's have a look.
In the variable assignment version, the equals sign says, processComments means run this set of steps in order and at each step of the way store the result in a named variable.
In the flow version though, the equals sign says that processComments is the composition of this list of functions.
So we're defining 'processComments' as a relationship between functions, not a series of steps.
And that difference there, that's, that's a big deal.
Now it doesn't change the instructions we send to the CPU, much.
These two pieces of code still do essentially the same thing.
But this idea, this writing code as a set of relationships between expressions, that is a big deal, because it's not about what changes in the code, it's about what changes in us.
Writing code this way changes the way we think about the code.
Composition encourages us to think about code as relationships between expressions, and this in turn encourages us to focus on our desired outcome, rather than thinking about each detailed step.
And as a result, our code becomes more declarative.
But based on what we've seen so far, that might not be so obvious.
So we've written the same piece of code two ways.
You might be thinking well, potato, potato, but I can prove that the flow version is more declarative because we can make it more efficient without changing a single character of that function.
What we're gonna do instead is we're gonna change some of those array helper functions.
So first we'll redefine map to use generators.
And if you're not too familiar with generators, don't worry too much about how it works.
Just notice that it's still a simple single line function and it's just yielding instead of returning a value.
And we'll redefine filter very much the same way.
Again, we're using generators, it's a little longer, but not much longer.
And we'll give 'take' the generator treatment.
Now this one does get a little bit more complex, but all it's doing is keeping a count, and it stops yielding values once it reaches the specified limit.
And finally we redefine 'join', which again is a single line function.
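A sketch of generator-based versions of the four utilities. The slides aren't in the transcript, so the bodies are assumptions; in particular, this `take` returns as soon as it has yielded its limit, so it never pulls an extra item from upstream:

```javascript
// Yield each transformed value as it is requested.
const map = f => function* (iterable) {
  for (const x of iterable) yield f(x);
};

// Yield only the values that pass the predicate.
const filter = p => function* (iterable) {
  for (const x of iterable) {
    if (p(x)) yield x;
  }
};

// Keep a count and stop once the limit has been reached.
const take = n => function* (iterable) {
  if (n <= 0) return;
  let count = 0;
  for (const x of iterable) {
    yield x;
    count += 1;
    if (count >= n) return;
  }
};

// join is the only step that needs everything at once.
const join = s => iterable => [...iterable].join(s);

const isOdd = n => n % 2 === 1;
const twice = n => n * 2;
console.log(join(',')(take(2)(map(twice)(filter(isOdd)([1, 2, 3, 4, 5])))));
// → 2,6
```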
Now we don't change anything in our flow based processComments function, but we've changed the way this function works.
So suppose we had 1000 comments to process.
In our original version we would've had to run that `noNazi` function on every single comment.
But with this generator version, if there's no comments that mention Nazis, it will only run on the first 10 comments.
So what's going on is that by using generators, we can now run each comment through all the functions, without creating any interstitial arrays.
And we stop once we've processed 10 of them.
Which means with generators, we're now using less memory, spending less time allocating and deallocating, which in theory will make the whole thing more efficient.
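To make the difference in work visible, here is a sketch that counts how often the filter predicate runs under each set of helpers. Everything here is an assumption for illustration: the `eager` and `lazy` objects, the `countPredicateCalls` function, and the 1000 generated comments. It counts calls only and is not a benchmark:

```javascript
const flow = (...funcs) => x0 => funcs.reduce((x, f) => f(x), x0);

// Array-based helpers: every step processes the whole array.
const eager = {
  map: f => arr => arr.map(f),
  filter: p => arr => arr.filter(p),
  take: n => arr => arr.slice(0, n),
  join: s => arr => arr.join(s),
};

// Generator-based helpers: values are pulled through one at a time.
const lazy = {
  map: f => function* (iterable) { for (const x of iterable) yield f(x); },
  filter: p => function* (iterable) { for (const x of iterable) if (p(x)) yield x; },
  take: n => function* (iterable) {
    if (n <= 0) return;
    let count = 0;
    for (const x of iterable) {
      yield x;
      count += 1;
      if (count >= n) return;
    }
  },
  join: s => iterable => [...iterable].join(s),
};

// Build the same pipeline from whichever helper set is supplied and
// report how many times the predicate was called.
const countPredicateCalls = ({ map, filter, take, join }) => {
  let calls = 0;
  const noNazi = str => { calls += 1; return !/\bnazi\b/i.test(str); };
  const itemize = str => `<li>${str}</li>`;
  const processComments = flow(filter(noNazi), take(10), map(itemize), join('\n'));
  const commentStrs = Array.from({ length: 1000 }, (_, i) => `comment ${i}`);
  processComments(commentStrs);
  return calls;
};

console.log(countPredicateCalls(eager)); // → 1000
console.log(countPredicateCalls(lazy));  // → 10
```

The pipeline definition inside `countPredicateCalls` is identical in both runs; only the helper implementations differ.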
But the point isn't the performance gain though.
So, no, I haven't benchmarked it to see how fast or slow it is, 'cause I don't care.
The point here is that we can do this and it might be significantly faster.
Now.
Sure.
I'll admit there's no reason we can't write a version of this function that uses generators with variable assignments, like the one we did before, and get the same results.
That's true.
The point is we are less likely to do so because writing code as a series of statements doesn't really encourage that way of thinking.
Remember that I asked you to take note of where we used utility functions in the variable assignment version versus the composition version?
We can switch to using generators without changing the code in the pipeline version because using composition encourages us to use a bunch of utility functions.
And those utility functions allowed us to switch out the implementation without changing the API.
We defined our pipeline as a set of relationships between functions and to do that, we needed those reusable utility functions.
In domain-driven design terms, those functions created a natural anti-corruption layer.
This let us change the implementation details without altering the high level intent.
We've separated what we want to do from how the computer does it.
And this, this is why function composition is kind of a big deal.
At its core, function composition isn't all that complicated.
Combining functions with composition is straightforward and fairly easy to understand.
We can take that core idea and we can extend it out, such that we can compose whole lists of functions all at once.
That's how we get 'compose', 'flow' and 'pipe'.
And we can use these functions to create concise, elegant code.
The real beauty of composition isn't how it changes the code.
It's about how it changes us.
It changes how we think about the code, allowing us to think about code as a set of relationships between expressions, rather than a set of steps that must be executed in a given order.
And that opens up a whole new way of organizing and optimizing our code.
With that, I'll stop rabbiting on and let you get on with the rest of the conference.