Node.js Ecosystem Deep Dive / Chapter 2



Six Phases, One Truth

The Node.js event loop is not a callback queue — it is a six-phase, kernel-synchronized multiplexer in which the poll phase, not the timer wheel, decides when your code actually runs.

Key Takeaways

  • Node is "single-threaded" only for JavaScript. Underneath, libuv runs an I/O thread pool and dispatches kernel events to a single-threaded orchestrator.
  • The event loop has six phases, not one: timers, pending callbacks, idle/prepare, poll, check, and close callbacks.
  • Timers do not fire on time. They fire when the poll phase lets them. A 100ms setTimeout running alongside a file read that completes at 95ms, with a callback that runs for 10ms, fires at 105ms, not 100ms.
  • process.nextTick is outside the event loop. The names process.nextTick and setImmediate should be swapped, but ecosystem lock-in prevents it.
  • EventEmitter is synchronous. If you emit before any listener attaches, the event is lost.
  • Inside an I/O callback, setImmediate always beats setTimeout(_, 0). Outside I/O, the order is non-deterministic.

---

Run this script:

const fs = require('node:fs');
const start = Date.now();
setTimeout(() => {
  console.log(`timer fired at ${Date.now() - start}ms`);
}, 100);
fs.readFile(__filename, () => {
  const cb = Date.now();
  while (Date.now() - cb < 10) { /* busy 10ms */ }
});

If your mental model says "the timer fires at 100ms," your mental model is wrong. Assume, as the official Node.js documentation does for this example, that the read takes 95ms to complete. (Reading a small file such as __filename usually finishes far sooner, in which case the 10ms callback is over long before the threshold and the timer fires at roughly 100ms.) Under that assumption the timer fires at 105ms. That extra 5ms is not jitter. It is not noise. It is the runtime behaving exactly as specified. The documentation works through the example line by line, and the passage that matters is: "When the event loop enters the poll phase, it has an empty queue (fs.readFile() has not completed), so it will wait for the number of ms remaining until the soonest timer's threshold is reached. While it is waiting 95 ms pass, fs.readFile() finishes reading the file and its callback which takes 10 ms to complete is added to the poll queue and executed. When the callback finishes, there are no more callbacks in the queue, so the event loop will see that the threshold of the soonest timer has been reached then wrap back to the timers phase to execute the timer's callback." 105ms. Not 100ms.

I started out believing the event loop was a FIFO queue of pending callbacks, drained in order, with timers pre-empting whatever was running every time their threshold elapsed. That is the model the word "loop" suggests, and it is wrong in a specific way that affects every timeout you have ever written. Reading the official Node.js documentation on the event loop pushed me toward the correct model, which is that the loop is a six-phase, kernel-synchronized multiplexer in which *the poll phase is the kingmaker*. Timers fire when the poll phase lets them. I/O happens in the poll phase. Everything else is setup, teardown, or bookkeeping.

Why the model has to exist at all

Before there is a loop, there is a reason for one. Rod Vagg's "Why Asynchronous?" essay opens with a table of I/O latencies that, if you have not internalized them, will re-frame how you read every line of JavaScript you have ever written. The table is in nanoseconds:

| Operation                          |        Time (ns) |
|------------------------------------|-----------------:|
| L1 cache reference                 |                1 |
| L2 cache reference                 |                4 |
| Main memory reference              |              100 |
| SSD random-read                    |           16,000 |
| Round-trip in same datacenter      |          500,000 |
| Physical disk seek                 |        4,000,000 |
| Round-trip from US to EU           |      150,000,000 |

Physical disk I/O is four million times slower than an L1 cache reference. A cross-Atlantic network round-trip is one hundred and fifty million times slower than L1. These ratios are the economic argument for everything Node does. If I/O were fast, you would not need an event loop. You would call readFileSync, wait a few microseconds, and move on. I/O is not fast. I/O is the bottleneck of every program you have ever written that talks to a database, an HTTP service, or a disk.
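The ratios quoted above follow directly from the table. As a small worked sketch, the snippet below copies the table's figures into an object and divides each by the L1 figure; `latencyNs` and `slowdownVsL1` are names invented here for illustration.

```javascript
// Latencies in nanoseconds, copied from the table above.
const latencyNs = {
  'L1 cache reference': 1,
  'L2 cache reference': 4,
  'Main memory reference': 100,
  'SSD random-read': 16000,
  'Round-trip in same datacenter': 500000,
  'Physical disk seek': 4000000,
  'Round-trip from US to EU': 150000000,
};

// Hypothetical helper: how many L1 cache references fit into one operation.
function slowdownVsL1(operation) {
  return latencyNs[operation] / latencyNs['L1 cache reference'];
}

for (const operation of Object.keys(latencyNs)) {
  const ms = latencyNs[operation] / 1e6; // nanoseconds to milliseconds
  console.log(`${operation}: ${slowdownVsL1(operation)}x L1 (${ms} ms)`);
}
```

Seen in milliseconds, the last two rows are 4ms and 150ms: time in which a CPU that is not blocked could have done millions of cache-speed operations.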

The thesis Vagg draws from the table is the thesis of Node: "Node.js is fast because programmers are forced to write fast programs by not introducing blocking I/O to the program flow." Other platforms let you write blocking code "if you go out of your way." Node makes the non-blocking path the path. The fs.*Sync() methods exist as the named escape hatch, and their existence-as-exception is itself the message: in most platforms, blocking is the default and non-blocking is opt-in; in Node, blocking is opt-in and non-blocking is default.

The model, in one diagram

The runtime's job is to keep the JavaScript thread busy on CPU work while I/O operations finish out-of-band. The mechanism is libuv, the C library that handles "the queueing and processing of asynchronous events." libuv's event loop is a six-stop subway line:

flowchart TD
    Start([loop start]) --> Timers["timers<br/>setTimeout / setInterval"]
    Timers -->|nextTick drained| Pending["pending callbacks<br/>deferred TCP errors, etc."]
    Pending --> Idle["idle, prepare<br/>(internal use)"]
    Idle --> Poll["poll<br/>retrieve new I/O events;<br/>execute I/O callbacks;<br/>block here when appropriate"]
    Poll -->|nextTick drained| Check["check<br/>setImmediate() callbacks"]
    Check --> NextTick["nextTick drained<br/>between every phase"]
    NextTick --> Close["close callbacks<br/>socket.on('close', ...)"]
    Close --> Timers
    Poll -.setImmediate pending.-> Check
    Poll -.poll queue empty, no immediate.-> BlockWait["block until new I/O<br/>or timer threshold"]
    BlockWait --> Poll
    Poll -.timer threshold reached.-> Timers

The six phases are not arbitrary. Each one services a distinct source of work:

  • timers runs the setTimeout / setInterval callbacks whose thresholds have elapsed. libuv keeps pending timers in a min-heap ordered by expiry time, so the soonest one is always at hand.
  • pending callbacks is for system operations that some Unix variants want to defer to the next loop iteration (e.g. ECONNREFUSED).
  • idle, prepare is internal libuv work; application code never lands here.
  • poll is the heart of I/O. It retrieves new I/O events from the kernel (epoll on Linux, kqueue on macOS, IOCP on Windows), executes their callbacks, and — when there is nothing to do — blocks waiting for the next event, up to the threshold of the soonest pending timer.
  • check runs setImmediate callbacks — code that wants to run "right after the poll phase, before the loop continues."
  • close callbacks runs close handlers — e.g. socket.on('close', ...).

Each phase has its own FIFO queue of callbacks. The loop enters a phase, performs the operations specific to it, then runs callbacks from that queue until either the queue is empty or the system-dependent callback limit is hit. The loop then moves to the next phase. After close callbacks, it wraps back to timers. And that wrap is where the 105ms example's 5ms came from.

The poll phase is the kingmaker

If you take one thing from this chapter, take this: the poll phase controls when timers actually fire.

The official documentation says it plainly, and it is worth quoting in full: "Technically, the poll phase controls when timers are executed." A timer's threshold is the *earliest* it can fire, not the time it will fire. The actual firing time is whenever the poll phase notices that a timer's threshold has passed and decides to yield control back to the timers phase.

The poll phase's logic, in pseudocode:

on entering poll phase:
  if timers are pending:
    wait_ms = time until the soonest timer's threshold
  else:
    wait_ms = ∞
  if poll queue is non-empty:
    run its callbacks synchronously until the queue is empty or the system limit is hit
  else if setImmediate is scheduled:
    end the poll phase and proceed to the check phase
  else:
    block on the kernel for up to wait_ms
    if I/O became ready: execute its callbacks
  once the poll queue is empty:
    if any timer threshold has been reached: wrap back around to the timers phase

The 105ms example makes this concrete. The timer is scheduled for t=100. The poll phase enters at some point before t=100 with an empty queue. It computes wait_ms = (100 - now). It blocks on the kernel. At t=95, the file read finishes and the kernel marks the file descriptor ready. At t=95, libuv wakes up. At t=95, the I/O callback is added to the poll queue. The poll phase drains the queue — the callback runs for 10ms, finishing at t=105. The poll queue is empty again. The loop checks the timers: the 100ms threshold has been reached. The loop wraps to the timers phase. The timer callback fires at t=105.
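The arithmetic of that walk-through can be captured in a few lines. This is a toy model of the single-timer, single-read scenario only, not libuv's algorithm; `timerFireTime` and its parameter names are hypothetical, and all times are milliseconds since the script started.

```javascript
// Toy model of the walk-through above, not libuv's real algorithm.
// Hypothetical parameters:
//   timerMs    - the timer's threshold
//   ioReadyMs  - when the kernel reports the read as complete
//   callbackMs - how long the read's callback runs
function timerFireTime({ timerMs, ioReadyMs, callbackMs }) {
  // I/O arrives at or after the threshold: the poll phase stops waiting
  // at the threshold, so the timer fires on time.
  if (ioReadyMs >= timerMs) return timerMs;
  // I/O arrives first: its callback runs to completion before the loop
  // looks at timers again.
  const callbackDoneMs = ioReadyMs + callbackMs;
  return Math.max(timerMs, callbackDoneMs);
}

console.log(timerFireTime({ timerMs: 100, ioReadyMs: 95, callbackMs: 10 }));  // 105
console.log(timerFireTime({ timerMs: 100, ioReadyMs: 95, callbackMs: 2 }));   // 100
console.log(timerFireTime({ timerMs: 100, ioReadyMs: 120, callbackMs: 10 })); // 100
```

The timer is late only in the first case, where the callback is still running when the threshold passes.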

The 5ms is the cost of the file read callback *finishing*. If your callback had taken 50ms instead of 10ms, the timer would have fired at t=145. If you had set the timer to 1ms instead of 100ms, it would have fired at about t=1, long before the read completed, because the poll phase blocks only until the soonest timer's threshold. What a timer cannot do is interrupt a callback that is already running. Timers never preempt running JavaScript. This is by design: if they could, every callback in your process could be interrupted at an arbitrary point by an arbitrary timer, and the entire async-I/O model would collapse into re-entrant chaos.

process.nextTick is outside the loop

This is the part of the runtime that I find the most counterintuitive and the most important to internalize. process.nextTick is not part of the event loop. It does not have a phase. Its queue is processed *after the current operation completes*, regardless of which phase the loop is in. The docs put it bluntly: "any time you call process.nextTick() in a given phase, all callbacks passed to process.nextTick() will be resolved before the event loop continues."

The semantic consequence is that recursive process.nextTick calls can starve the event loop. If your callback schedules another nextTick, which schedules another, the loop never returns to any phase. The official documentation calls this out by name: "this can create some bad situations because it allows you to 'starve' your I/O by making recursive process.nextTick() calls, which prevents the event loop from reaching the poll phase."
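Starvation is easy to show with a bounded version of the recursion. In the sketch below, a setImmediate is queued first, then a chain of nextTick callbacks is started; the helper name `ticksBeforeImmediate` and the depth of 100,000 are assumptions chosen for the demonstration. An unbounded chain would never let the immediate run at all.

```javascript
// Hypothetical helper: chains `depth` process.nextTick callbacks and reports
// how many had run by the time a previously queued setImmediate fires.
function ticksBeforeImmediate(depth) {
  return new Promise((resolve) => {
    let ticks = 0;
    setImmediate(() => resolve(ticks)); // queued before any nextTick

    function tick() {
      ticks += 1;
      // Bounded so the demo terminates; without the bound the loop would
      // never reach the check phase.
      if (ticks < depth) process.nextTick(tick);
    }
    process.nextTick(tick);
  });
}

ticksBeforeImmediate(100000).then((ticks) => {
  console.log(`${ticks} nextTick callbacks ran before setImmediate`);
});
// prints "100000 nextTick callbacks ran before setImmediate"
```

Every one of the chained callbacks runs before the loop is allowed to move on, even though the immediate was scheduled first.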

The names are wrong. The official docs admit this: "In essence, the names should be swapped. process.nextTick() fires more immediately than setImmediate(), but this is an artifact of the past which is unlikely to change. Making this switch would break a large percentage of the packages on npm." This is a rare case of the runtime admitting a naming choice was wrong, then keeping it anyway because the ecosystem has calcified around the mistake. The official recommendation, straight from the docs: "use setImmediate() in all cases because it's easier to reason about."

The reason process.nextTick exists at all is that the API needed an escape hatch for "run this after the call stack has unwound but before the event loop continues." The canonical use is inside an EventEmitter constructor. If you emit synchronously in a constructor, no listener can have attached yet, so the event is lost. If you emit via process.nextTick, listeners have a chance to attach first. The same trick is used to delay error emission so callers can attach error handlers after construction.
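A minimal sketch of the constructor case follows. Both classes and the 'ready' event name are hypothetical, invented here to contrast a synchronous emit with one deferred through process.nextTick.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical class: emits synchronously inside the constructor, before
// the caller holds a reference it could attach a listener to.
class EagerEmitter extends EventEmitter {
  constructor() {
    super();
    this.emit('ready');
  }
}

// Hypothetical class: defers the emit until the current call stack unwinds.
class DeferredEmitter extends EventEmitter {
  constructor() {
    super();
    process.nextTick(() => this.emit('ready'));
  }
}

// Attaches a listener to each emitter after construction and reports
// which listeners were actually called.
function whoHeardReady() {
  return new Promise((resolve) => {
    const heard = [];
    new EagerEmitter().on('ready', () => heard.push('eager'));
    new DeferredEmitter().on('ready', () => heard.push('deferred'));
    setImmediate(() => resolve(heard));
  });
}

whoHeardReady().then((heard) => console.log(heard)); // prints [ 'deferred' ]
```

The eager emitter's event is gone by the time `.on()` is called; the deferred one is delivered because the nextTick queue runs only after the synchronous code that attaches the listener.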

EventEmitter is synchronous

This is the second piece of common folk-wisdom that the source material upends. EventEmitter is a synchronous API. The emit function runs all of its listeners synchronously, in registration order, before returning. It only *appears* asynchronous because it is often used to signal the completion of an asynchronous operation.

Trevor Norris, writing on the NodeSource blog, gives the canonical demonstration:

me.doStuff();
// Output:
// before
// emit fired
// after

The emit fired line runs *inside* the emit call. It runs before console.log('after'). If you have ever expected the code after emit to run before the listeners, or watched a slow listener hold up the producer that emitted the event, you have built a system where EventEmitter's synchronicity is biting you.
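The listing above shows only the final call and its output. One way to reproduce that output is sketched below; the class name `MyEmitter` and the event name 'fire' are assumptions made here, not necessarily what the original demonstration used.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical reconstruction: a class whose method logs, emits, and logs again.
class MyEmitter extends EventEmitter {
  doStuff() {
    console.log('before');
    this.emit('fire'); // listeners run here, synchronously
    console.log('after');
  }
}

const me = new MyEmitter();
me.on('fire', () => console.log('emit fired'));
me.doStuff();
// Output:
// before
// emit fired
// after
```

Nothing in this sketch touches the event loop: the listener is an ordinary function call made from inside `emit`.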

The fix is the same as the constructor case: if an event may be emitted before listeners have attached, defer the emit with setImmediate or process.nextTick. The pattern of "emit immediately if listeners exist, otherwise queue for next tick" is so common that newer Node APIs do it for you. Older code in your codebase might not.

fs.*Sync() is the smoking gun

I want to close on the one Node API that proves the model by counterexample. fs.readFileSync, fs.writeFileSync, and the rest of the *Sync family *block the JavaScript thread*. On a single call they are often no slower than their async counterparts, and can return sooner, because the system call runs directly on the calling thread instead of being handed to libuv's thread pool. The cost is everything else: no timer, I/O callback, or incoming request is serviced until the call returns. They exist because sometimes you genuinely need synchronous I/O (at startup, in CLI scripts, in test setup), and the cost of blocking is paid by the developer who chose the *Sync variant knowingly.

The fact that the synchronous variants are *named* — that the API surface has readFile *and* readFileSync — is the smoking gun for the entire design philosophy. In a runtime that defaulted to blocking, you would have readFile and the non-blocking variant would be readFileAsync. In Node, the blocking variant is the one with the explicit name. The default is async. The default is what you should be using. The *Sync family is the marked exception, and the marking is the lesson.

What the runtime and the package manager share

You started this chapter with a 5ms discrepancy that turned out to be the runtime behaving as specified. You started the previous chapter with a transitive publish that turned out to be the package manager behaving as documented. Both stories are about non-determinism. The registry moves; the lockfile pins it. The kernel moves; the poll phase synchronizes to it. Node is the only mainstream runtime I know of where both the dependency layer and the execution layer treat non-determinism as the design constraint, not as an inconvenience to be papered over by threads or by luck.

The lockfile and the event loop are not parallel stories. They are the same story, told twice. The package manager locks the install. The event loop locks the run. Both are about making a moving target into a predictable surface, and both succeed only because the people who built them were honest about which guarantees are real and which are aspirational. The non-determinism that remains — a timer firing 5ms late, a transitive publish slipping through a stale lockfile — is the runtime telling you the truth. That truth is what lets you reason about the system.
