Memory Management in V8

Long garbage collection pauses in JavaScript can lead to noticeable performance degradation and a poor user experience. Understanding the garbage collection mechanism is essential to building advanced applications with effective memory management.

This talk will cover the fundamentals of automatic memory management in the V8 JavaScript engine. We will also discuss some upcoming optimisations that are aimed directly at tackling garbage collection jank. Finally, we will explore some common memory problems, as well as the tools and techniques available for diagnosing them.

Let’s Use JavaScript

V8 is the JavaScript engine that powers Chrome in the browser and Node.js on the server.

Understanding memory management in JavaScript is about more than just avoiding memory leaks.

JavaScript’s portability makes it an attractive target for building an app that uses distributed machine learning.

Doing mathematics with JavaScript is hard at the best of times, and long-running mathematical computations involving large amounts of data tend to create jank, crash tabs and generally stop the application from doing anything useful.

To build benchmarks, it was decided to teach the app enough about sports to be able to take a news article and identify which sport it referred to.

In the process, profiling the heap and the CPU showed that learning time increased in proportion to the size of a document, which seemed to make sense.

It became clear that the heap size was also growing in proportion to document size, which still seemed logical. The expectation was that this growth would taper off as the application learned words, stored what it had learned, and no longer had to learn those words again.

Not only did that not happen, but the heap was growing to triple the size of the document being learned. Something else was happening.

Profiling the classifier itself made it clear that the classifier's memory footprint was growing with vocabulary size, but proportionally less so as the documents increased in size. So it wasn't the classifier that was creating the problem.

Heap comparison showed that learning would start at about 5 MB and finish at 258 MB, and that all the growth was happening in system properties, which can't really be accessed from JavaScript code.

Chrome’s developer tools timeline showed that garbage collection operations were running very frequently. In fact, 30% of all processing time was being spent on garbage collection.
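
Outside of Chrome's developer tools, a quick way to watch the same kind of heap growth is Node's real `process.memoryUsage()` API; a minimal sketch (the array is a stand-in for a document being learned):

```javascript
// Sketch (Node.js): process.memoryUsage().heapUsed reports the number
// of bytes currently in use on the V8 heap.
const before = process.memoryUsage().heapUsed;

// Simulate "learning" a document by retaining a large structure.
const doc = new Array(100000).fill('word');

const after = process.memoryUsage().heapUsed;
console.log(`heap grew by ~${Math.round((after - before) / 1024)} KB`);
```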

Memory Management

At this point an understanding of how memory management in JavaScript works became important.

All values are stored on the heap. A value can have one or more retaining paths, and a value is removed by cutting its retaining paths.
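
A minimal sketch of retaining paths (the variable names are illustrative):

```javascript
// A value stays alive as long as at least one retaining path reaches
// it from a root -- here, two module-level variables.
let cache = { data: new Array(1000).fill(0) };
let alias = cache;              // a second retaining path to the same object

cache = null;                   // cuts one path; the object is still live
const stillReachable = alias.data.length; // 1000: reachable through `alias`

alias = null;                   // cuts the last path; the object is now
                                // garbage and eligible for collection
```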

A record is live if it is reachable from the root, directly or through a chain of pointers. Garbage is defined as heap-allocated records that are not reachable by any pointer or chain of pointers.

Garbage collection reclaims the memory occupied by garbage and makes it available for allocating new records. The more objects there are, the longer it takes to collect the garbage.

Because the Weak Generational Hypothesis states that most objects die young, V8 splits values between a young and an old generation.

Young Generation

The young generation is small, so allocation is fast and collections are fast and frequent. The old generation is larger; allocations there are still fast, but collection can be slower. The two generations use different algorithms to identify and collect garbage.

The young generation uses a semi-space, stop-and-copy scavenger algorithm: objects are allocated until room runs out, at which point a collection process is triggered to free up space.

When that process is repeated, the retained live objects, having survived a garbage collection cycle, are candidates for promotion to the old generation. The cost of the garbage collection is proportional to the number of live objects retained.
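
The allocation pattern this is tuned for can be sketched as follows (the loop body is illustrative; `--trace-gc` is a real Node/V8 flag for watching the scavenges):

```javascript
// Sketch: most of these temporary objects die young and are cheaply
// reclaimed by the scavenger; only the entries kept in `survivors`
// live across collections and become candidates for promotion to the
// old generation.
const survivors = [];
for (let i = 0; i < 100000; i++) {
  const temp = { index: i };   // short-lived: dies in the young generation
  if (i % 10000 === 0) {
    survivors.push(temp);      // long-lived: survives scavenges
  }
}
console.log(survivors.length); // 10 retained objects out of 100000
```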

Old Generation

The old generation uses a mark-sweep-compact algorithm. The amount of memory that’s allocated for the old generation is governed by a set of heuristics around the rates of object allocation and how much memory is being used.

As with the young generation, space is allocated to allow for growth – except this time it’s a single space – as objects are added. When there isn’t enough space for an added object, garbage collection is triggered.

Old generation garbage collection involves marking the objects to be retained, then sweeping through the heap and making the memory from non-marked objects available for use.

After a while that process leads to fragmentation, so a third step compacts the space previously held by the non-marked objects, pushing the live objects together and leaving the available memory at the end of the heap.

Old generation uses all the committed heap that’s available. The marking phase takes time proportional to the number of nodes to be marked (the live objects), and the sweep and compact phases can take time proportional to the size of the heap.

Note that the mark-and-sweep and compact processes can be made to be more incremental, concurrent, and in some cases, even run in parallel.

Orinoco

The V8 team is working on a project called Orinoco to do exactly that: run some of the old generation garbage collection phases in parallel, producing less jank and higher throughput.

Orinoco introduces incremental marking, in which a marking step as short as 10 milliseconds marks a few values and then lets JavaScript execution continue, repeating until a full mark-sweep-compact garbage collection is triggered, which then takes less time thanks to the marking already done incrementally.

Sweeping is performed concurrently, and compaction is in parallel, leading to an overall reduction in the time your application freezes for garbage collection.

An additional improvement was being able to schedule garbage collection to take place at idle times.
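
Application code can borrow the same idea in browsers via the real `requestIdleCallback` API; a hedged sketch (the synchronous fallback below is a stand-in so the sketch also runs under Node, where the API doesn't exist):

```javascript
// Sketch: defer non-urgent work to idle periods -- the same idea behind
// V8 scheduling garbage collection at idle times.
const idle = (typeof requestIdleCallback === 'function')
  ? requestIdleCallback
  : (cb) => cb({ timeRemaining: () => 5, didTimeout: false });

const processed = [];
function drainWhenIdle(queue) {
  idle((deadline) => {
    // Work only while the browser says there is idle time left.
    while (deadline.timeRemaining() > 0 && queue.length > 0) {
      processed.push(queue.shift());  // the non-urgent "work"
    }
    if (queue.length > 0) drainWhenIdle(queue); // resume at the next idle period
  });
}

drainWhenIdle([1, 2, 3]);
```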

Remember that every time a new object is allocated, that’s one step closer to triggering a garbage collection, which will pause your application and introduce latency in most cases. The larger the heap, the longer collection is going to take.
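
A minimal sketch of what that means in practice (function names are illustrative): the two functions below compute the same result, but the first allocates one wrapper object per element, each a step closer to the next collection, while the second allocates nothing per iteration.

```javascript
function sumOfSquaresAllocating(values) {
  return values
    .map((v) => ({ square: v * v }))        // one object per element
    .reduce((acc, o) => acc + o.square, 0);
}

function sumOfSquaresInPlace(values) {
  let acc = 0;
  for (const v of values) acc += v * v;     // no intermediate objects
  return acc;
}

console.log(sumOfSquaresInPlace([1, 2, 3])); // 14
```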

A further optimisation was to read the document in chunks using Streams, rather than loading the whole 85 MB in one go.

That had a big impact: a 96% reduction in heap size, the share of time spent learning increasing from 69% to 94%, and garbage collection time decreasing from 30% to 1.87%.

Total time to complete improved by about 22%.

More optimisations were introduced, such as using web workers, but it’s clear that understanding memory management and focusing on improving it was what made the big difference.

Understanding memory management in JavaScript will boost your developer productivity as you build applications and bake performance into absolutely everything you do.