Your Node is Leaking Memory? setTimeout Could be the Reason (2024)

written on Wednesday, June 5, 2024

This is mostly an FYI for node developers. The issue being discussed in this post has caused us quite a bit of pain. It has to do with how node deals with timeouts. In short: you can very easily create memory leaks [1] with the setTimeout API in node. You're probably familiar with that API since it's one that browsers have provided for many, many years. The API is pretty straightforward: you schedule a function to be called later, and you get a token back that can be used to clear the timeout later. In short:

    const token = setTimeout(() => {}, 100);
    clearTimeout(token);

In the browser the token returned is just a number. If you do the same thing in node however, the token ends up being an actual Timeout object:

    > setTimeout(() => {})
    Timeout {
      _idleTimeout: 1,
      _idlePrev: [TimersList],
      _idleNext: [TimersList],
      _idleStart: 4312,
      _onTimeout: [Function (anonymous)],
      _timerArgs: undefined,
      _repeat: null,
      _destroyed: false,
      [Symbol(refed)]: true,
      [Symbol(kHasPrimitive)]: false,
      [Symbol(asyncId)]: 78,
      [Symbol(triggerId)]: 6
    }
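To check the difference without reading the REPL dump, here is a minimal sketch that only uses documented behavior. The variable names are mine, and the timer is cleared before it can fire:

```javascript
const token = setTimeout(() => {}, 1000);

// In node the return value is an object, not a number as in browsers.
const tokenType = typeof token;

// Timeout objects expose documented methods such as hasRef(), ref(),
// unref() and refresh(); a plain number would have none of these.
const hadRef = token.hasRef();

console.log(tokenType, hadRef);

// Clearing works the same way as with a numeric token.
clearTimeout(token);
```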

This “leaks” out some of the internals of how the timeout is implemented. For the last few years I think this has been just fine. Typically you used this object primarily as a token, similar to how you would do that with a number. It might look something like this:

    class MyThing {
      constructor() {
        this.timeout = setTimeout(() => { ... }, INTERVAL);
      }

      clearTimeout() {
        clearTimeout(this.timeout);
      }
    }

For the lifetime of MyThing, even after clearTimeout has been called or the timeout runs to completion, the object holds on to this timeout. On completion or cancellation, the timeout is marked as “destroyed” in node terms and removed from its internal tracking. What happens, however, is that this Timeout object survives until someone overwrites or deletes the this.timeout reference. That's because the actual Timeout object is held and not just some token. This further means that the garbage collector won't collect this object or anything that it references. This does not seem too bad, as the Timeout seems somewhat hefty, but not too hefty. The most problematic part is most likely the _onTimeout member on it, which might pull in a closure, but it's probably mostly okay in practice.

However, the timeout object can act as a container for more state, which is not quite as obvious. A new API called AsyncLocalStorage, which has been added over the last couple of years and is getting some traction, attaches additional state onto all timeouts that fire. Async local storage is implemented in a way that timeouts (and promises and similar constructs) carry forward hidden state until they run:

    const assert = require('node:assert');
    const { AsyncLocalStorage } = require('node:async_hooks');

    const als = new AsyncLocalStorage();

    let t;
    als.run([...Array(10000)], () => {
      t = setTimeout(() => {
        // the store is carried along and is still reachable when the timer fires
        const theArray = als.getStore();
        assert(theArray.length === 10000);
      }, 100);
    });
    console.log(t);

When you run this, you will notice that the Timeout holds a reference to this large array:

    Timeout {
      _idleTimeout: 100,
      _idlePrev: [TimersList],
      _idleNext: [TimersList],
      _idleStart: 10,
      _onTimeout: [Function (anonymous)],
      _timerArgs: undefined,
      _repeat: null,
      _destroyed: false,
      [Symbol(refed)]: true,
      [Symbol(kHasPrimitive)]: false,
      [Symbol(asyncId)]: 2,
      [Symbol(triggerId)]: 1,
      [Symbol(kResourceStore)]: [Array] // reference to that large array is held here
    }

That's because every single async local storage that is created registers itself with the timeout under a custom Symbol(kResourceStore), which even remains on there after a timeout has been cleared or has run to completion. This means that the more async local storage you use, the more “stuff” you hold on to if you don't clear out the timeouts.
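If you want to see how many stores are attached to a timeout you are holding, a rough sketch could look like the following. The helper name resourceStoreSymbols and the two store names are made up for illustration, and the whole thing pokes at node internals: it assumes the symbol is described as kResourceStore, as in the output above. Other node versions may track this state differently, in which case the count will simply be 0.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

// Hypothetical helper: list the symbol-keyed properties on a Timeout whose
// description matches the internal name seen in the inspected output.
function resourceStoreSymbols(timeout) {
  return Object.getOwnPropertySymbols(timeout)
    .filter((sym) => sym.description === 'kResourceStore');
}

const requestStore = new AsyncLocalStorage();
const traceStore = new AsyncLocalStorage();

let t;
requestStore.run({ big: new Array(10000).fill(0) }, () => {
  traceStore.run({ traceId: 'abc' }, () => {
    t = setTimeout(() => {}, 1000);
  });
});

// Clear the timer, then look at what is still hanging off the object.
clearTimeout(t);

const attachedAfterClear = resourceStoreSymbols(t).length;
console.log(`stores still attached after clearTimeout: ${attachedAfterClear}`);
```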

The fix seems obvious: rather than holding on to timeouts, hold on to the underlying ID. That's because you can convert a Timeout into a primitive (with for instance the unary + operator). The primitive is just a number like it would be in the browser which then can also be used for clearing. Since a number holds no reference, this should resolve the issue:

    class MyThing {
      constructor() {
        // the + operator forces the timeout to be converted into a number
        this.timeout = +setTimeout(() => { ... }, INTERVAL);
      }

      clearTimeout() {
        // clearTimeout and other functions can resolve numbers back into
        // the underlying internal timeout object
        clearTimeout(this.timeout);
      }
    }

Except it doesn't (today). In fact, doing this today will cause an unrecoverable memory leak because of a bug in node [2]. Once that is resolved, however, this should be a fine way to avoid the problem.
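The conversion itself is easy to try in isolation. In this small sketch the timer is cleared before it fires, so it stays away from the run-to-completion path that triggers the bug. The variable names are mine:

```javascript
const timeout = setTimeout(() => {
  console.log('not expected to run, the timer is cleared below');
}, 1000);

// Unary + converts the Timeout into its primitive form, a numeric ID.
const id = +timeout;
const idType = typeof id;

// clearTimeout accepts the numeric ID in place of the Timeout object.
clearTimeout(id);

console.log(idType);
```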

Workaround for the leak with a Monkey-Patch

Since the bug is only triggered when a timer manages to run to completion, you could in theory forcefully clear the timeout or interval on completion if node “allocated” a primitive ID for it like so:

    // look up node's internal symbol on a throwaway timer, then clear that
    // timer so it does not keep the process alive
    const probe = setTimeout(() => {}, 0);
    const kHasPrimitive = Reflect
      .ownKeys(probe)
      .find((x) => x.toString() === 'Symbol(kHasPrimitive)');
    clearTimeout(probe);

    function invokeSafe(t, callable) {
      try {
        return callable();
      } finally {
        if (t[kHasPrimitive]) {
          clearTimeout(t);
        }
      }
    }

    const originalSetTimeout = global.setTimeout;
    global.setTimeout = (callable, ...rest) => {
      // forward the extra arguments node passes to the callback
      const t = originalSetTimeout((...args) => invokeSafe(t, () => callable(...args)), ...rest);
      return t;
    };

    const originalSetInterval = global.setInterval;
    global.setInterval = (callable, ...rest) => {
      const t = originalSetInterval((...args) => invokeSafe(t, () => callable(...args)), ...rest);
      return t;
    };

This obviously makes a lot of assumptions about the internals of node, and it will slightly slow down every timer created via setTimeout and setInterval, but it might help you in the interim if you do run into that bug.

Until then the second best thing you can do for now is to just be very aggressive in deleting these tokens manually the moment you no longer need them:

    class MyThing {
      constructor() {
        this.timeout = setTimeout(() => {
          this.timeout = null;
          ...
        }, INTERVAL);
      }

      clearTimeout() {
        if (this.timeout) {
          clearTimeout(this.timeout);
          this.timeout = null;
        }
      }
    }

How problematic are timeouts? It's hard for me to say, but there are a lot of places where code holds on to timeouts and intervals in node for longer than is healthy. If you are trying to make things such as hot code reloading work, or you are working with long lasting or recurring timeouts, it might be very easy to run into this problem. Due to how widespread these timeouts are and the increased use of async local storage I can only assume that this will become a more common issue people run into. It's also a bit devious because you might not even know that you use async local storage as a user.

We're not the first to run into issues like this. For instance Next.js is trying to work around related issues by patching setTimeout and setInterval to periodically and forcefully clear out intervals to avoid memory leakage in the dev server. (Which unfortunately sometimes runs into the node bug mentioned above due to its own use of toPrimitive.)

How widespread is async local storage? It depends a bit on what you do. For instance we (and probably all players in the observability space including the OpenTelemetry project itself) use it to track tracing information with the local flow of execution. Modern JavaScript frameworks also sometimes are quite active users of async local storage. In the particular case we were debugging earlier today a total of 7 async local storages were attached to the timeouts we found in the heap dumps, some of which held on to massive react component trees.

Async local storage is great: I'm a huge proponent of it! If you have ever used Flask you will realize that Flask is built on a similar concept (thread locals, nowadays context vars) to give you access to the right request object. What however makes async local storage a bit scary is that it's very easy to hold on to memory accidentally. In node's case this is particularly easy with timeouts.

At the very least for timeouts in node there might be a simple improvement by no longer exposing the internal Timeout object. Node could in theory return a lightweight proxy object that breaks the cycle after the timeout has been executed or cleared. How this can be done in a backwards compatible way I don't know, however.
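Until something like that exists in node itself, the same idea can be approximated in userland. The following is only a sketch of what I mean, not a node API: scheduleOnce and the shape of the handle it returns are made up for illustration. The handle never exposes the Timeout and drops its reference as soon as the timer fires or is cancelled:

```javascript
// Hypothetical helper, not a node API: returns a small handle that releases
// the underlying Timeout once the timer has fired or been cancelled.
function scheduleOnce(callback, delayMs) {
  let timeout = setTimeout(() => {
    const fn = callback;
    // release the Timeout and the callback before running user code
    timeout = null;
    callback = null;
    fn();
  }, delayMs);

  return {
    // true while the timer is still scheduled
    get pending() {
      return timeout !== null;
    },
    cancel() {
      if (timeout !== null) {
        clearTimeout(timeout);
        // drop the Timeout together with anything attached to it
        timeout = null;
        callback = null;
      }
    },
  };
}

const handle = scheduleOnce(() => console.log('fired'), 1000);
const pendingBefore = handle.pending;
handle.cancel();
const pendingAfter = handle.pending;
console.log(pendingBefore, pendingAfter);
```

Holding on to such a handle for a long time only keeps a tiny object alive. It does not help with the primitive ID bug mentioned above, since no primitive is ever requested here.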

For improving async local storage longer term I think the ecosystem might have to embrace the idea of shedding contextual state. It's incredibly easy to leak async local storage today if you spawn "background" operations that last. For instance today a console.log will on first use allocate an internal TTY resource which accidentally holds on to the calling async local storage of completely unrelated stuff. Whenever a thing such as console.log wants to create a long lasting resource until the end of the program, helper APIs could be provided that automatically prevent all async local storage from propagating. Today there is only a way to prevent a specific local storage from propagating by disabling it, but that requires knowing which ones exist.
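For a single store that you do know about, shedding can be sketched with the exit() method on AsyncLocalStorage, which runs a callback with that one store unset. The store contents and variable names below are made up, and you should check the stability status of exit() in your node version before relying on it:

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const als = new AsyncLocalStorage();
let storeSeenOutside;
let storeSeenInside;

als.run({ requestId: 42 }, () => {
  storeSeenOutside = als.getStore();

  // exit() runs the callback outside of this one store's context, so a
  // long lasting resource created in here does not carry the store along.
  als.exit(() => {
    storeSeenInside = als.getStore();
    setTimeout(() => {
      console.log('background work sees store:', als.getStore());
    }, 10);
  });
});
```

This only covers the stores you call exit() on, which is exactly the limitation described above: any other async local storage that happens to be active still propagates.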

[1] Under normal circumstances these memory leaks would not be permanent leaks. They would resolve themselves when you finally drop a reference to that token. However due to a node bug it is currently possible for these leaks to be unrecoverable.
[2] How we found that bug might be worth a story for another day.

This entry was tagged javascript

