Flyweight Pattern and Interning in JavaScript
I started reading about the release of the new unique package in Go, which implements interning to canonicalize values, enable comparison by pointers, and achieve a smaller overall memory footprint.
After some reading, I clicked on the wikipedia page linked by the release blog. The first thing mentioned was the Flyweight pattern and how interning is used there. As you might know, the flyweight pattern is a structural pattern that relies on interning for storing and sharing the intrinsic state of an object.
Not long after, I found myself diving a little deeper into how the Flyweight pattern would look like in JavaScript.
For example, if we have a book object that is being kept in multiple stores, it might be defined in the following way:
type Book = {
name: string,
ISBN: string,
storeId: number,
amount: number,
}
We can restructure it so that we share the common intrinsic Book properties between different instances and allocate memory only for the extrinsic properties:
type Book = {
name: string,
ISBN: string,
}
type StoredBook = {
__proto__: Book,
storeId: number,
amount: number
}
By using the reserved __proto__
keyword here,
JavaScript is able to resolve the intrinsic state by walking down the prototype chain.
That is, if name
doesn’t exist on StoredBook
, it checks on __proto__
, which is Book
.
As a matter of fact, this is what this odd page explaining the Flyweight Pattern in vanilla JS is suggesting. Nevertheless, they end up providing an example that does the exact opposite of the intent behind interning and the flyweight pattern.
Let’s explore why their example is incorrect
const createBook = (title, author, isbn) => {
const existingBook = books.has(isbn);
if (existingBook) {
return books.get(isbn);
}
const book = new Book(title, author, isbn);
books.set(isbn, book);
return book;
};
// ...
const bookList = [];
const addBook = (title, author, isbn, availability, sales) => {
const book = {
...createBook(title, author, isbn),
sales,
availability,
isbn,
};
bookList.push(book);
return book;
};
The createBook
function creates a new book object if the ISBN hasn’t been stored before, and returns the existing object if it has. So far, this works as intended.
The issue arises in the steps after that. The Flyweight pattern is a design strategy aimed at saving memory by sharing the intrinsic state between logically similar objects. However, in this implementation, the book objects are not shared because the interned structure is actually destructured.
When we destructure primitive values to use as keys and values in a new object, we end up creating new memory allocations. In the example, we destructure two properties — isbn
and name
— whose values represent the intrinsic state, which means that they’re expected to keep their value the same between different shared instances.
On subsequent initializations of the same string (in our case the intrinsic state), V8 is not allocating the same amount of memory for the strings; instead, it returns a reference to their location in the string pool. Therefore, we’re not necessarily using up memory for the string allocations for our intrinsic state strings, as it’s already allocated once, but for actually keeping the references to said strings.
If strings weren’t interned by the engine, desctructuring would have been way worse because we would’ve had to allocate space for each string in turn possibly increasing the cost exponentially.
Even if strings are interned and allocated just once, the desctructuring approach is still bad.
Every new property and its value take up space. If you look at it logically, it makes much more sense to keep one property pointing to a shared object than to allocate space for two new properties pointing to shared strings, right?
Let’s abstract the complexity and think of each property as pointing to a reference, which in turn points to a different thing in memory space. Let’s say all references are 64 bits in size.
You can see how we incur a memory cost that increases N times, where N is the count of the properties.
Well, this is simplified quite a bit and not necessarily the case, as you would see later that we incur a cost of just ~300-something bytes for 10 Book
objects.
JavaScript engines are optimized to minimize such costs. You wouldn’t store a pointer to a space where an interned string is stored; you would probably use some other mechanism that already knows where the pool is and fetches the string by some handle, which takes up less space than a regular pointer. To be fair, I’m just trying to show you why the flyweight pattern saves memory in regular programming terms.
The example given in the article is always creating new objects by destructuring the “main” object, whose intrinsic properties should be interned. It cannot even be considered a bad example of interning or the flyweight pattern because it literally does neither of the two.
Putting it in perspective
const intern = {
name: "Harry Potter",
ISBN: 1287834834
}
for (let i = 0; i < 10; i++) {
let obj = {
bookstoreId: i,
...intern // destructuring instead of extending the chain
}
}
// for (let i = 0; i < 10; i++) {
// let obj = {
// bookstoreId: 1,
// __proto__: intern
// }
// }
console.log(process.memoryUsage().heapUsed );
The script is run on V8 and no garbage sweeps are executed in the meantime. The difference between the two approaches here is around 344 bytes in favor of the proper interning approach.
PoC comparison
class Book {
constructor(title, author, isbn) {
this.title = title;
this.author = author;
this.isbn = isbn;
}
}
const books = [];
const book = new Book("Book Title", "Author Name", "1234567890");
for (let i = 0; i < 10000; i++) {
// books.push({
// ...book,
// someArbitraryData: "Some arbitrary data",
// });
books.push({
__proto__: book,
someArbitraryData: "Some arbitrary data",
});
}
I wanted to run the above script without invoking any GC cycles. Unfortunately, we cannot disable garbage collection in V8 that easily. Some of the suggestions on the internet are outdated, and even if you set the max old space size, garbage collection is still triggered.
Results
Last GC sweep out of 4 for the improper interning approach:
[87242:0x160008000] 12 ms: Scavenge 6.1 (8.3) -> 5.7 (9.3) MB, 0.38 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure;
And the only one incurred GC sweep for the proper interning approach:
[85419:0x110008000] 13 ms: Scavenge 3.9 (4.0) -> 3.5 (5.0) MB, 0.38 / 0.00 ms (average mu = 1.000, current mu = 1.000) allocation failure;
You can see that the used memory difference is around 2 MB, and the heap allocated size is almost twice as much. Not only that, but we had 4 sweeps instead of 1 when using the improper approach.
Conclusion
I can’t imagine the average JavaScript CRUD developer ever having a proper use case for interning their objects. As you can see in the example above, in order to yield any proper benefits, we have to push 10,000 new Book objects to save what feels like not that much compared to how cheap memory is nowadays (disregarding performance costs around GC). You will also probably have to freeze the interned object to avoid unexpected changes.
When will you ever have that many different Books relying on the same intrinsic state? Business logic-wise, the example on the page is not the best one.
Nevertheless, if you have the use case, this can surely be a simple yet efficient way to avoid unnecessary GC sweeps and save some memory.