Record & Tuple: immutable data structures in JS

Robin Ricard at Global Scope '21

Transcript

Hello everyone.

My name is Robin Ricard and I work at Bloomberg.

At Bloomberg we use a lot of JavaScript to power our core product, and I work in a team that's responsible for maintaining our JavaScript infrastructure for that product across engineering.

Part of my job is also to be involved in standards, to advance what we imagine being the future of JavaScript.

So that is why I am a TC39 delegate for Bloomberg.

And I will get into what TC39 is soon.

But in short, the only thing that you need to know now is that I can work on JavaScript proposals, including the one that I'm going to present to you today: "Record and Tuple".

And I am one of the co-authors and co-champions of that proposal, with my colleague Rick Button, who also happens to work at Bloomberg.

So let's get straight away into what Record and Tuple in JavaScript are and what they mean for you.

So this is a record. It contains two keys, associated with two values.

It just looks like an object literal, except that we put this hash character in front.

That's it: doing that makes an object literal become a record.

Now here's a tuple. It looks like an array literal, but with a hash character up front as well.

This one is a sequence of two values, but you can put way more than two values in them, right?

We're not limited to two values just because they're called tuples.

No, you can put n values in there.

And here is a tuple of records.

That makes sense, because you could put objects in arrays.

Right.

But, well, I wouldn't push the similarities between objects and arrays on one side and records and tuples on the other much further, because it's a bit misleading.

But yes, you can indeed put records in tuples, and also records in records and tuples in tuples.

And finally, tuples in records.

All of those combinations are actually possible.

And here's the first catch.

Unlike with an object, you can't change your records.

You can't mutate any value in the record and you will get an exception if you do so.

The same goes with tuples and records inside of tuples or any level of nesting that you're going to deal with here.

If you try to change anything in a Record and Tuple structure, you're going to see an exception.

Records and tuples can't be changed once they have been created.
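
Engines do not ship the `#{}` syntax, so the exception described here can only be approximated in plain JavaScript. The following is a minimal sketch, assuming a frozen object as a stand-in for a record; `tryRename` is a hypothetical helper, and note that writes to frozen objects only throw in strict mode.

```javascript
// Stand-in only: a frozen plain object, NOT a real record.
const rt = Object.freeze({ name: "Record & Tuple", stage: 2 });

// Hypothetical helper: attempts a write and reports what happened.
function tryRename(proposal) {
  "use strict"; // writes to frozen objects throw only in strict mode
  try {
    proposal.name = "Record & Toople";
    return "mutated";
  } catch (err) {
    return err instanceof TypeError ? "TypeError" : "unexpected error";
  }
}

const outcome = tryRename(rt);
console.log(outcome); // "TypeError"
console.log(rt.name); // "Record & Tuple"
```

Unlike a real record, this stand-in is only shallowly immutable and still has an identity.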

So don't we have Object.freeze to actually ensure that?

So two things about Object.freeze. Over the lifetime of an object, there will always be a moment where the object is not frozen: when you're just creating it.

It is not frozen yet, and objects by nature will never come as frozen just after creation.

So that can be a problem specifically as we're going to see later on.

But the second thing is that objects are not deeply frozen either.

And so you can freely change whatever is stored inside of an object, if that thing has not been frozen as well.

So you either need to recursively freeze things, or you're not going to have the guarantees that you want.
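
To make the shallow-freeze problem concrete, here is a small sketch in plain JavaScript. `deepFreeze` is a hypothetical helper written for this illustration, not a built-in, and it assumes plain data (objects and arrays of primitives).

```javascript
const config = { db: { driver: "postgres", host: "pg0" } };
Object.freeze(config);

// Object.freeze is shallow: the nested object is still mutable.
config.db.host = "somewhere-else";
console.log(config.db.host); // "somewhere-else"

// Hypothetical helper: freezes an object and every object reachable from it.
// The WeakSet guards against cycles; functions and primitives are left alone.
function deepFreeze(value, seen = new WeakSet()) {
  if (value === null || typeof value !== "object" || seen.has(value)) {
    return value;
  }
  seen.add(value);
  Object.freeze(value);
  for (const key of Reflect.ownKeys(value)) {
    deepFreeze(value[key], seen);
  }
  return value;
}

const safeConfig = deepFreeze({ db: { driver: "postgres", host: "pg0" } });
console.log(Object.isFrozen(safeConfig.db)); // true
```

Even with such a helper, there is still a window between creation and freezing, which is the first problem mentioned above.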

And so in that example, talking about guarantees, I have no guarantee whatsoever that initDrivers is not going to mutate the contents of config.db, for example. So config.db.host might have been changed by initDrivers here.

However, unless I go look inside of initDrivers or run the code, I will have no idea.

So that brings me to something I shamefully have to admit I did in the past to prevent that: defensive copying.

In this case, I'm creating a deep copy of the config by converting to a JSON string and parsing it again.

It is unnecessarily slow and we could definitely avoid doing that work.
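
As a runnable sketch of that defensive copy: the `initDrivers` below is a hypothetical, synchronous stand-in for code we do not control, written so that it deliberately mutates its argument.

```javascript
const config = { db: { driver: "postgres", host: "pg0" } };

// Hypothetical stand-in for a function we do not control.
function initDrivers(cfg) {
  cfg.db.host = "pg1"; // mutates whatever it was given
}

// Defensive deep copy: serialize to JSON text, then parse it back.
const initConfig = JSON.parse(JSON.stringify(config));
initDrivers(initConfig);

console.log(config.db.host);     // "pg0"
console.log(initConfig.db.host); // "pg1"
```

Besides the cost of serializing and parsing, a JSON round trip silently drops anything JSON cannot represent, such as functions and `undefined` values.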

Right.

And that's what we can do with records and tuples.

Right.

We don't have to copy anything.

We don't have to freeze anything.

I know that if initDrivers tries to mutate the config, an exception will be raised.

That is it.

There is no freezing after the fact.

There is no deep copy, again. The feature just gives you guarantees out of the box.

And now there is a second catch, right?

Because in order for this deep immutability thing to even make sense, well, you can't put anything mutable in records and tuples. If you do put objects in, you will get an exception, and this can be seen as something inconvenient.

Right.

But it's actually here to protect you against yourself in a sense.

More than just immutability, think of records and tuples as primitives, such as strings or numbers, except they are compound primitives:

the association of multiple primitive values with each other.

The thing with primitives is that primitives don't have an identity, right?

They are only represented by their own values.

And objects do have an identity.

Anything with an identity can't be stored in something that doesn't have an identity.

And that is why, even if we were able to completely freeze everything, even after the fact, you would still have an object that has an identity, and that is problematic for the way records and tuples work.

But we do understand that sometimes you need things that have an identity. Specifically here, if I wanted to put a function inside of my configuration to have some kind of callback, well, I wouldn't be able to, because functions do have an identity: they are objects in JavaScript.

So how do we store them in records?

We did think about this.

And we created an escape hatch for that.

Box is an explicit way to convert an identity to an actual value.

So now a Box will explicitly take the identity of the function and wrap it in a value.

And so you can put it inside of a record or a tuple.

The important thing with box is that you really explicitly need to box things and unbox them.

As you can see in both instances here, you need to unbox to look up the objects, and this is clearly a feature, because everything that refers to an identity is now safely boxed inside the Box, and you just have to look for unboxing to spot potential problems in that code.

So if there is a problem because something has been mutated at some point in the lifetime of the program, and you can't find where that happened, you can statically try to look up for unboxing in your code base and see if that would have been the reason why things have been changed.

So that keeps your code clear on whether you're going to have an identity or a mutation being involved at some point.
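
The explicit box and unbox step can be imitated in today's JavaScript to show the ergonomics. This is only a sketch under loud assumptions: `BoxSketch` is a hypothetical class, not the proposal's `Box`, and the frozen objects are stand-ins for records. It does not reproduce the value semantics of real boxes.

```javascript
// Hypothetical stand-in, NOT the proposal's Box: a frozen wrapper whose
// content is only reachable through an explicit unbox() call.
class BoxSketch {
  #content;
  constructor(content) {
    this.#content = content;
    Object.freeze(this);
  }
  unbox() {
    return this.#content;
  }
}

let connected = false;
const config = Object.freeze({
  db: Object.freeze({
    driver: "postgres",
    host: "pg0",
    onConnect: new BoxSketch(() => {
      connected = true;
    }),
  }),
});

// The identity-bearing function is only reached through unbox().
config.db.onConnect.unbox()();
console.log(connected); // true
```

Searching a code base for `unbox(` then lists every place where something with an identity can leak out of an otherwise immutable structure.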

So why do we even do all of this? This whole model that makes records and tuples identity-less primitives brings an extremely powerful property to them, I do think.

And that is their equality semantics.

With objects and arrays, even if the values they store are identical, they will have different identities because they have been created distinctly.

So equality checks on them will fail.

So with record and tuple, if the internal values are the same, then the overall value is the same, no matter how and when the value was created.

In order to stress how cool this is, I'm going to use a real problem that I have almost every day.

Part of my job is to create build pipelines at Bloomberg, and in any build pipeline you have to deal with paths at some point or another.

And as an additional constraint, well, some of our developers have Windows machines.

So we need to be able to handle both Unix and Windows paths, and they differ in the separator they're using.

Right.

So forward and backwards slashes.

A good way to abstract that away is to just use an array to represent path parts.
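
A minimal sketch of that abstraction: the path is kept as an array of parts, and the separator is only chosen when the path is rendered. `toUnixPath` and `toWindowsPath` are hypothetical helper names.

```javascript
const parts = ["src", "index.ts"];

// Hypothetical helpers: the separator is only chosen when rendering.
const toUnixPath = (pathParts) => pathParts.join("/");
const toWindowsPath = (pathParts) => pathParts.join("\\");

console.log(toUnixPath(parts));    // "src/index.ts"
console.log(toWindowsPath(parts)); // "src\index.ts"
```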

But by doing so I lose that very important property of strings.

I can compare two strings by value.

I can't do that with an array.

With an array, I can compare identities but not values.

So we'd have to iterate over each value in the array to know if they are storing the same things.
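
In plain JavaScript that iteration looks roughly like the sketch below. `sameParts` is a hypothetical helper and only handles flat arrays of primitives.

```javascript
const srcPath = ["src", "index.ts"];
const samePartsAgain = ["src", "index.ts"];

// Hypothetical helper: compares two arrays of primitives element by element.
function sameParts(a, b) {
  return a.length === b.length && a.every((part, i) => part === b[i]);
}

console.log(srcPath === samePartsAgain);         // false: different identities
console.log(sameParts(srcPath, samePartsAgain)); // true: same contents
```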

That is why this equality is failing here.

However, if I replace arrays with tuples, then things behave like strings would have behaved, minus the separator issue, right?

There is no need to iterate in there.

If the internal values are the same, then the two tuples are the same.

However, none of this is particularly impressive: if I just made a utility function to compare arrays, well, that would be it.

This would just be sugar around a utility function call.

Well, no, because, you know, equality in JavaScript is not just the triple-equals operator.

Equality matters in many places in the language.

For example, when using ES Maps and Sets.

Right?

In this example, I'm trying to look up an original TypeScript source path for a given produced JavaScript path, right?

As we're seeing here, arrays are bound to their identity, so they're not reliable for doing those lookups.

If I produce an array on the spot in the get call here to do the lookup, it's going to have a different identity than the one that I used to set the original key.

So the lookup will not find anything and will return undefined.

This is really not practical if I have to track the identities of every one of my paths across my code base.

However, if I were to use tuples to do this, well, that would solve the situation.

Since Maps are sensitive to the broader equality semantics, well, Maps will also compare tuples by value instead of identity.

So if I'm going to do that lookup, I can now do it at any point, in any place.

And I can forge a new tuple any time I want to do that lookup, and it will find the right entry.

So in that case, yes, I forged the dist util.js path parts just before the lookup, and I still get the source looked up properly, as I expected.
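
Without tuples, a common workaround today is to derive a primitive key from the path parts, since strings are compared by value. The sketch below assumes JSON serialization as the key function; `keyOf` is a hypothetical helper, and this is a stand-in for, not an implementation of, tuple keys.

```javascript
// With array keys, Map compares identities, so a fresh array finds nothing.
const byIdentity = new Map();
byIdentity.set(["dist", "util.js"], ["src", "util.ts"]);
console.log(byIdentity.get(["dist", "util.js"])); // undefined

// Hypothetical workaround: derive a string key from the path parts.
const keyOf = (pathParts) => JSON.stringify(pathParts);

const byValue = new Map();
byValue.set(keyOf(["dist", "index.js"]), ["src", "index.ts"]);
byValue.set(keyOf(["dist", "util.js"]), ["src", "util.ts"]);

// A key forged on the spot now finds the ["src", "util.ts"] entry.
const found = byValue.get(keyOf(["dist", "util.js"]));
console.log(found.join("/")); // "src/util.ts"
```

The cost is that every call site has to remember to go through `keyOf`, which is exactly the bookkeeping that value-compared tuples would remove.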

So now let's go back to a previous caveat: you can't mutate records and tuples.

Your only way to enact a change is to somehow update it by copy.

So how do I get the absolute path here?

Well, I can just concatenate with spread.

So yes, the spread operation that exists on arrays and objects will work.

It will, by the way, work with both records and tuples, right.

Here, you're seeing examples of tuples, but that would similarly work with records.

And here in this example, I'm concatenating the root and then the relative path afterwards. And that is about it.
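
The same update-by-copy pattern already works on frozen arrays and objects, which is used here as a stand-in since the `#[]` syntax is not available in engines. A minimal sketch:

```javascript
const root = Object.freeze(["C:", "dev", "rt-project"]);
const rel = Object.freeze(["src", "index.ts"]);

// Spread builds a new array; the frozen originals are untouched.
const abs = [...root, ...rel];
console.log(abs.join("\\")); // "C:\dev\rt-project\src\index.ts"
console.log(root.length);    // 3

// The same pattern with object spread, updating one key by copy.
const proposal = Object.freeze({ name: "Record & Tuple", stage: 1 });
const advanced = { ...proposal, stage: 2 };
console.log(proposal.stage, advanced.stage); // 1 2
```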

What about the mutating functions on arrays, right?

Array.prototype has push, reverse, sort, all of those things.

And they are quite practical.

One might say, well, tuples have the same functions, but in the past tense.

So they are meant to represent the state of the tuple after the operation has happened.

Right.

pushed returns the tuple with those new items pushed, and reversed returns the tuple as if the items were reversed.

The original root or the original abs are still the same after the operation, but you now have essentially a copy of the tuple with the operation that has been done on it.
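
To illustrate the "past tense" idea on ordinary arrays, here is a sketch with free functions. `pushed` and `reversed` below are hypothetical helpers that borrow the names used in the talk; they are not the proposal's methods.

```javascript
// Hypothetical helpers mirroring the talk's `pushed` and `reversed` names.
// Each returns a new array and leaves its input untouched.
const pushed = (items, ...extra) => [...items, ...extra];
const reversed = (items) => [...items].reverse();

const root = Object.freeze(["C:", "dev", "rt-project"]);
const abs = pushed(root, "src", "index.ts");
const rev = reversed(abs);

console.log(root.length); // 3
console.log(abs.length);  // 5
console.log(rev[0]);      // "index.ts"
console.log(abs[0]);      // "C:"
```

Copying before calling the mutating `reverse` is what keeps `abs` unchanged.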

Please just note that those specific methods, and specifically their naming, are likely to change in the future.

We might even drop pushed, because you could actually do what we are doing here with pushed using just the spread concatenation, as we just saw earlier.

So this API is very much subject to change.

Now, those methods could be really practical on arrays actually, specifically reversed, or sorted, which is another one that we're introducing here.

And as a matter of fact, yes, that would be super practical.

So that's why we're actually putting forward another proposal, called Change Array by Copy, and that recently went to stage 1.

What does stage one even mean?

Like, I've been talking about TC39 stages and whatnot, and it would actually be super practical to know what all of that jargon means.

So now I'm going to explain it.

So let's get into what TC39 is and how proposals advance from being just an idea to an actual language feature.

First things first.

TC39 stands for Technical Committee 39.

What a surprise.

It is the technical committee that actually decides what is standard JavaScript or not. But actually, it's not JavaScript.

It's ECMAScript, because JavaScript is a trademark that we can't really use, but ECMAScript is a standardized language.

What is ECMA?

So ECMA is an international standardization body that represents a bunch of different companies working in the industry.

Those companies are member companies that will have different interests in the different standardizations being done by ECMA.

So talking about TC 39 in particular, a lot of organizations that are involved come from different backgrounds.

For example, browser implementers such as Google, Mozilla, Apple, and Microsoft.

Or even smaller implementers, such as Moddable. Moddable works on XS, which is a very small JavaScript engine that is spec compliant and that runs on embedded hardware.

And that's quite amazing, if you ask me, that this is even possible, but they managed to do it, and yes, they are one of the companies that are participating in TC39.

And then you have companies just like Bloomberg, right,

that have an interest in the web and in the technology behind it, and that participate as well in the standardization process.

And so that comes down to why I am talking to you about this.

So each of these member organizations will send employees called delegates to TC 39 to represent the organization's interests.

Right.

I happen to be a Bloomberg delegate, as I said at the beginning, which lets me champion proposals.

But you don't have to be a delegate to participate in all of this, right?

First of all, you can author a proposal if you're not a delegate. Being a delegate just means that you can champion things.

Then, you don't even have to be a delegate to participate and even champion things.

If you are an expert representing some very important part of the JavaScript community, well, you are going to be invited to participate in that process.

So for example, the Babel maintainers do happen to be invited experts, because Babel is used super widely in the JavaScript world.

So we actually need them and their point of view, as if they were an actual implementer.

Right.

So now, how do all of these people decide what is standard JavaScript?

Or again, actually, what is standard ECMAScript?

ECMAScript as a standard is something that gets published every year.

Right now we're talking about ES2021, and every year there is a new one, essentially.

And so it's kind of a continuous improvement, but a yearly granularity of the standard would still make the job of standardizing quite difficult.

So in order to be even more continuous in how we change the language, TC39 has what we call the stage process.

So at every meeting, which is normally every two months, but now that we are doing it remotely it's every month, we introduce proposals for stage advancement.

So at any point in time, every candidate feature has a stage associated with it.

So in the case of Record and Tuple, Record and Tuple is stage 2.

Stage zero is usually when you have an idea and you want to see if the committee is interested in pursuing it.

At stage one, the committee validated that the idea is worth pursuing in one way or another, and some work towards proposing some specification should be done.

At stage two, the first version of the proposal exists and the overall design is mostly figured out.

And there is some spec text already, however it is still subject to change a lot. Which brings us to stage 3, where the committee is going to consider that the proposal is now spec complete,

unless some implementation details require some changes here and there.

And finally, at stage 4, all major browsers are shipping it, either in a pre-release version, so tech previews, things like this, or even in major versions.

And so the feature at this point is now queued to actually be added in the following year's spec cut, essentially.

So where are we with Records and Tuples.

As I said, we're at stage 2, and we reached stage 2 last summer.

And that means that we have a spec text draft.

That also means that we tried playing with it.

And that happens to be in Babel, with a parser and a syntax transform.

Both of them are highly experimental.

And I'm going to emphasize experimental here: don't watch this talk and just go away and download it from npm.

Right.

That's probably not a good idea.

And we also have a polyfill that implements the runtime features behind the transforms.

And finally, we have Nicolò, one of the Babel maintainers, who also helped us do the Babel things I just listed, and who also worked on a toy implementation of Tuple in SpiderMonkey, which is Firefox's engine.

So huge thanks to Nicolò, by the way, for helping us with Babel and this toy implementation, because this is really useful information to gather at this point of the process.

And so we've been in stage 2 since July 2020.

We're finalizing the spec text.

No browser has completely taken on implementing it yet.

We don't have test262 tests, but those are not required until we get to stage 3.

And we still need to decide some semantics that will cover a few edge cases in the language, notably around boxes.

And we are still actually discussing the syntax and it could change if we see some interest in doing so from the community.

Right.

And so, yes, I just talked about Babel polyfills.

So that means that we are open to experimentation. Experimentation and feedback will help us finish the proposal towards stage 3.

And when I say experimentation, again, I mean that the syntax might change.

The tuple method names are also going to change, as I said earlier.

And we don't know exactly what the semantics around boxes are going to be.

So please do not use this in production.

However, as I just said, we need you to try it.

So how do you do that?

Well, the best part of this toy, I think, is the playground that has been written by my colleague Rick Button again.

Usual partner in crime.

And that is it.

That is the playground.

You have an editor here and you can see the outputs on the other side.

So here, I just put the examples that I wrote in my slides earlier, and you can see that yes, you can use the hash syntax here to create records and also tuples.

And then we can actually check that the equality semantics I talked about are matching.

And just so that you see that I'm not lying to you, I'm going to break that equality here.

So if a becomes 2, now you see that the record equality check is now false.

So let's put this back.

Additionally, if I add keys, I will again break the equality.

One thing to note, by the way, is that the order in which you write your keys doesn't matter in records. That's just a note to keep in mind here: the values are still intrinsically the same.

And so on. You can see that I'm doing source map lookups here, so util.js will look up util.ts, as I showed earlier. Concatenation works, and reversing also works, with reversed.

We can also use sorted, just so you can see. Obviously it doesn't make sense to sort a path, but we can just do it.
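
Going back to the note that key order doesn't matter in records: plain objects have no such value comparison, but the idea can be sketched by comparing an order-independent key. `recordKey` is a hypothetical helper that assumes flat objects holding JSON-friendly primitive values; it is not how the proposal or its polyfill compares records.

```javascript
// Hypothetical helper: builds a comparison key that ignores key order.
// Only suitable for flat objects holding JSON-friendly primitive values.
function recordKey(obj) {
  const entries = Object.keys(obj)
    .sort()
    .map((key) => [key, obj[key]]);
  return JSON.stringify(entries);
}

const first = { a: 1, b: 2 };
const second = { b: 2, a: 1 };

console.log(recordKey(first) === recordKey(second));         // true
console.log(recordKey(first) === recordKey({ a: 2, b: 2 })); // false
```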

All right.

So here are the main resources that you might want to look up later and that's it for me.

My name again is Robin Ricard and I work on JavaScript infrastructure at Bloomberg.

Thank you for tuning in today.

I hope you had fun seeing this, and that you are as excited as I am about this proposal.

If you want to talk about the proposal you can contact me on Twitter and also note that Bloomberg is recruiting and we're doing tons of cool stuff in JavaScript.

So those links will tell you more.

And thank you again.

I hope you will also enjoy the other talks, it promises to be cool.

Record & Tuple

Immutable data structures in JS?

Web Directions Global Scope 2021

July 23, 2021

Robin Ricard

JavaScript Infrastructure Engineer

TC39 Delegate

A Tour of Record & Tuple

Record Syntax

const record = #{
	name: "Record & Tuple",
	stage: 2,
};

Tuple Syntax

const tuple = #["Record & Tuple", 2];

Nested structures

const proposals = #[
	#{ name: "Record & Tuple", stage: 2, },
	#{ name: "Change Array by Copy", stage: 1, },
];

Immutability

const rt = #{ name: "Record & Tuple", stage: 2, };

rt.name = "Record & Toople"; // ❌

Deep immutability

const proposals = #[
	#{ name: "Record & Tuple", stage: 2, },
	#{ name: "Change Array by Copy", stage: 1, },
];

proposals[0].name = "Record & Toople"; // ❌

Deep immutability / Object.freeze?

const config = {
	db: { driver: "postgres", host: "pg0", }, 
	// ...
};

Object.freeze(config);
await initDrivers(config);
assert(config.db.host === "pg0"); // ❓

Deep immutability / Defensive cloning

const config = {
	db: { driver: "postgres", host: "pg0", },
	// ...
};

const initConfig =
	JSON.parse(JSON.stringify(config)); // 🐌
await initDrivers(initConfig);
assert(config.db.host === "pg0"); // ✅

Deep immutability / No cloning, no changes!

const config = #{
	db: #{ driver: "postgres", host: "pg0", },
	// ...
};
await initDrivers(config);
assert(config.db.host === "pg0"); // ✅

Deep immutability / No objects in R&T!

const config = #{
	db: { driver: "postgres", host: "pg0", }, // ❌
	// ...
};

Deep immutability / Boxes: explicit interior identity

const config = #{
	db: #{
		driver: "postgres",
		host: "pg0",
		onConnect: Box(() => {
			// ...
		}),
	},
};
config.db.onConnect.unbox()();

⚠ Functions have identity!

Equality

[1, 2, 3] !== [1, 2, 3]
{ a: 1 } !== { a: 1 }
#{ a: 1 } === #{ a: 1 }
#[1, 2, 3] === #[1, 2, 3]

Equality / Identity

const srcPath = ["src", "index.ts"];
const distPath = ["dist", "index.js"];

assert(srcPath !== distPath);
assert(srcPath === ["src", "index.ts"]); // ❌

Equality / Identity-less-ness

const srcPath = #["src", "index.ts"];
const distPath = #["dist", "index.js"];

assert(srcPath !== distPath);
assert(srcPath === #["src", "index.ts"]);

Equality / Indexing by identity

const utilPath = ["dist", "util.js"];
const sourceMapping = new Map(); 
sourceMapping.set(["dist", "index.js"],["src", "index.ts"]);
sourceMapping.set(utilPath, ["src", "util.ts"]);

sourceMapping.get(utilPath);
// => ["src", "util.ts"]
sourceMapping.get(["dist", "util.js"]);
// => undefined

Equality / Indexing by value

const sourceMapping = new Map();
sourceMapping.set(#["dist", "index.js"],
		#["src", "index.ts"]);
sourceMapping.set(#["dist", "util.js"],
		#["src", "util.ts"]);
		
sourceMapping.get(#["dist", "util.js"]);
// => #["src", "util.ts"]

👍 Records can be looked up too!

Update by copy

const root = #["C:", "dev", "rt-project"];
const rel = #["src", "index.ts"];

const abs = // ?

Update by copy / Spread

const root = #["C:", "dev", "rt-project"];
const rel = #["src", "index.ts"];

const abs = #[...root, ...rel];
// => #["C:", "dev", "rt-project", "src", "index.ts"]

👍 Records can be spread too!

Update by copy / New Methods

const root = #["C:", "dev", "rt-project"];

const abs = root.pushed("src", "index.ts");
// => #["C:", "dev", "rt-project", "src", "index.ts"]

const rev = abs.reversed();
// => #["index.ts", "src", "rt-project", "dev", "C:"]

⚠ Subject to change

Update by copy / Proposal: Change Array by Copy

const root = #["C:", "dev", "rt-project"];

const abs = root.pushed("src", "index.ts");
// => #["C:", "dev", "rt-project", "src", "index.ts"]

const rev = abs.reversed();
// => #["index.ts", "src", "rt-project", "dev", "C:"]

⚠ Subject to change

const root = ["C:", "dev", "rt-project"];

const abs = root.pushed("src", "index.ts");
// => ["C:", "dev", "rt-project", "src", "index.ts"]

const rev = abs.reversed();
// => ["index.ts", "src", "rt-project", "dev", "C:"]

The TC39 Committee:

Advancement of Record & Tuple

TC39

Technical Committee deciding what is standard ECMAScript
Part of ECMA International, a standardization body
Members: Corporations, Small/Medium Enterprises and Non-Profit Organizations
Members send delegates to various TCs
Some individuals out of member orgs can be invited experts
Standard is published every year ES6 (ES2015) - … - ES2021
Continuous improvements through the Stage Process

TC39 / The Stage Process

Stage 0: Proposals are ideas
Stage 1: The committee is interested by the proposed idea
Stage 2: The committee intends to specify the proposal
Stage 3: The proposal has a spec and should land in the language with minor changes
Stage 4: The proposal is implemented in major browsers and will ship in the next yearly specification

Change Array by Copy

Record & Tuple

Record & Tuple Status

✅ Spec Text Draft
✅ Babel Syntax Parser
✅ Babel Syntax Transform
✅ Polyfill
🚧 Tuple toy implementation in SpiderMonkey (Firefox) by Nicolò Ribaudo (Babel Maintainer Team)

Record & Tuple Status / Stage 2 Proposal