learning path

Node.js Ecosystem Deep Dive

A focused exploration of the Node.js ecosystem covering package management and the event loop architecture. Two chapters examine how npm works and how Node's asynchronous I/O model enables high concurrency.

2 chapters · 2 audio lessons · 2 videos · 3 free previews · Fresh topic

Start here

1. E00_The_Lockfile_Was_the_Whole_Point

The Lockfile Was the Whole Point

package-lock.json exists because npm install is not a pure function — and treating it as one is the most expensive mistake you can make in JavaScript supply-chain management.

Key Takeaways

package.json is a *range*; package-lock.json is a *point*. Treat them as different artifacts, not redundant ones.
A transitive publish can change your node_modules tree without your package.json changing at all. The lockfile is the only thing standing between you and silent resolution drift.
npm install is for development; npm ci is for builds. They are not synonyms. The CI you wrote six months ago is probably wrong.
The npm/pnpm/yarn/bun split is a *design-philosophy* split, not a feature shoot-out. Pick by what your project actually does, not by who's fastest this week.
The lockfile was the whole point. Everything else in the package-management layer — workspaces, registries, semantic-version constraints — exists to make lockfile reproducibility meaningful.

---

Imagine you ship a service today. Your package.json declares a dependency on B with the range <0.1.0. B declares a dependency on C with the range <0.1.0. Today, B@0.0.1 and C@0.0.1 are the only versions of those packages on the registry. You run npm install, your tree resolves, you ship. Three weeks later you ship a hotfix. Your package.json is *byte-identical* to the one from three weeks ago. Your node_modules is not. B published 0.0.2 the day after your first release, and your second install pulled it. C is still 0.0.1. The contents of the tree changed. Your code now runs against a version of B you never chose, and the day C publishes 0.0.2 the same thing happens one level down, in a package you never named.

That is the bug. That is *the* bug.

flowchart TD
    A["package.json<br/>A: 0.1.0<br/>deps: { B: <0.1.0 }"]
    B["package.json<br/>B: 0.0.1<br/>deps: { C: <0.1.0 }"]
    B2["package.json<br/>B: 0.0.2<br/>deps: { C: <0.1.0 }"]
    C["package.json<br/>C: 0.0.1"]
    A --> B
    A -.fresh install after B@0.0.2 published.-> B2
    B --> C
    B2 --> C

This is the example the npm documentation itself uses to introduce the lockfile. I cite it because it cuts through a fog that the JavaScript community has tolerated for a decade. The fog says: "package.json is the source of truth for my dependencies." The fog is wrong. package.json is the source of truth for the *ranges* you will accept. It is not the source of truth for the *versions* you will get. There is no version in package.json that cannot be silently renegotiated by a fresh npm install, and there is no test you can write that catches the renegotiation before it hits production.
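The drift can be reproduced in a few lines. What follows is a toy, not npm's resolver: the `resolve` and `install` helpers, the in-memory registry objects, and the restriction to `<X.Y.Z` ranges are all assumptions made for illustration, and the tree is kept flat (only B) to stay short.

```javascript
// Toy model of range resolution. Hypothetical helpers, not npm internals.

// True when version a is strictly lower than version b ("X.Y.Z" strings).
function lessThan(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] < pb[i];
  }
  return false;
}

// Pick the highest published version below the ceiling of a "<X.Y.Z" range.
function resolve(range, published) {
  const ceiling = range.slice(1); // drop the leading "<"
  let best;
  for (const version of published) {
    if (!lessThan(version, ceiling)) continue;
    if (best === undefined || lessThan(best, version)) best = version;
  }
  return best;
}

// "Install" without a lockfile: every range is resolved against the registry.
function install(manifest, registry) {
  const tree = {};
  for (const [name, range] of Object.entries(manifest)) {
    tree[name] = resolve(range, registry[name]);
  }
  return tree;
}

const manifest = { B: '<0.1.0' };                 // identical for both installs
const registryDay1 = { B: ['0.0.1'] };            // what the registry held at first release
const registryWeek3 = { B: ['0.0.1', '0.0.2'] };  // after B published 0.0.2

const first = install(manifest, registryDay1);
const second = install(manifest, registryWeek3);

console.log(first);  // { B: '0.0.1' }
console.log(second); // { B: '0.0.2' }
```

The manifest is the same object in both calls. Only the registry moved, and that was enough to change the result.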

The pure-function ideal that npm keeps failing

The npm docs put the point as plainly as I have ever seen a piece of infrastructure documentation put anything: "In an ideal world, npm would work like a pure function: the same package.json should produce the exact same node_modules tree, any time." That sentence is followed by an enumeration of the four ways reality refuses the ideal — npm version drift, new published versions of direct deps, new published versions of transitive deps, and registry mutation. The lockfile is the escape hatch from non-determinism. Without it, npm install is a coin flip every time.

This is the lesson I missed for most of my career. I treated the lockfile as a generated artifact, the way you treat .DS_Store or tsconfig.tsbuildinfo — something to .gitignore because it "shouldn't be committed." I am not alone. There is an entire generation of JavaScript engineers whose mental model of package management has been: "package.json is committed, lockfiles are not, CI re-resolves." That model is wrong, and the cost of being wrong is paid in incidents.

You can see the community beginning to internalize this: the npm documentation now recommends committing the lockfile to source control. The popular curated catalog of Node packages, sindresorhus/awesome-nodejs, lists four package managers (npm, pnpm, yarn, bun), but the supporting tooling around them (np, npm-name, npm-home, npm-hub, npm-check-updates, patch-package, semver, npm semver calculator, David, awesome-npm) all assumes the lockfile is the canonical record. The whole meta-ecosystem exists to manage the gap between the range and the point. The gap is the lockfile's reason to exist.

The lockfile's actual rules

The lockfile wins when its resolved versions satisfy the package.json ranges. When they don't, package.json wins and the lockfile is rewritten. The rules look bureaucratic on a first read. They are not bureaucratic. They are the precise definition of which artifact is authoritative in which conflict.

if lockfile satisfies package.json ranges:
    use lockfile
else:
    resolve fresh, write new lockfile

The point is asymmetry. The lockfile is a *cache of a previous resolution*. The cache is invalidated only when the manifest moves outside the range it captured. This is the same idea as a content-addressed cache in any other system — the cache is invalidated by changes to inputs that move the output outside the cached key. npm's contribution is to make the cache content a separate file (package-lock.json) that you commit, instead of an implicit filesystem state you don't.
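As a rough sketch of that rule in runnable form: the `install` function below is hypothetical and deliberately simplified. It understands only `<X.Y.Z` ranges, treats the lockfile as a flat name-to-version map, and re-resolves everything when any entry falls outside its range, which is coarser than what npm actually does.

```javascript
// Compare two "X.Y.Z" strings numerically; negative when a < b.
function compare(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Assumption: only "<X.Y.Z" ranges exist in this toy world.
function satisfies(version, range) {
  return compare(version, range.slice(1)) < 0;
}

function install(manifest, lockfile, registry) {
  const locked = lockfile ?? {};

  // The lockfile wins only if every manifest range is met by a locked version.
  const lockIsValid = Object.entries(manifest).every(
    ([name, range]) => name in locked && satisfies(locked[name], range)
  );
  if (lockIsValid) {
    return { tree: { ...locked }, lockfileRewritten: false };
  }

  // Otherwise the manifest wins: resolve fresh and record a new lockfile.
  const tree = {};
  for (const [name, range] of Object.entries(manifest)) {
    const candidates = registry[name].filter((v) => satisfies(v, range));
    tree[name] = candidates.sort(compare)[candidates.length - 1];
  }
  return { tree, lockfileRewritten: true };
}

const registry = { B: ['0.0.1', '0.0.2'] };

// Lock still satisfies the range: 0.0.1 is kept even though 0.0.2 exists.
const kept = install({ B: '<0.1.0' }, { B: '0.0.1' }, registry);
console.log(kept);  // { tree: { B: '0.0.1' }, lockfileRewritten: false }

// Manifest range moved so the locked 0.0.2 no longer fits: resolve again.
const moved = install({ B: '<0.0.2' }, { B: '0.0.2' }, registry);
console.log(moved); // { tree: { B: '0.0.1' }, lockfileRewritten: true }
```

The first call shows the asymmetry: a newer version on the registry does not invalidate the cache, only a manifest change that excludes the locked version does.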

`npm install` versus `npm ci`

Here is the test for whether you understand the lockfile. If your CI runs npm install, you don't. If your CI runs npm ci, you probably do. The two commands look sim

10m / Article + audio + video

2. E01_Six_Phases_One_Truth

Six Phases, One Truth

The Node.js event loop is not a callback queue — it is a six-phase, kernel-synchronized multiplexer in which the poll phase, not the timer wheel, decides when your code actually runs.

Key Takeaways

Node is "single-threaded" only for JavaScript. Underneath, libuv runs an I/O thread pool and dispatches kernel events to a single-threaded orchestrator.
The event loop has six phases, not one: timers, pending callbacks, idle/prepare, poll, check, and close callbacks.
Timers do not fire on time. They fire when the poll phase lets them. A 100ms setTimeout running alongside a file read that completes at 95ms, and whose callback then runs for 10ms, fires at 105ms, not 100ms.
process.nextTick is outside the event loop. The names process.nextTick and setImmediate should be swapped, but ecosystem lock-in prevents it.
EventEmitter is synchronous. If you emit before any listener attaches, the event is lost.
Inside an I/O callback, setImmediate always beats setTimeout(_, 0). Outside I/O, the order is non-deterministic.
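The EventEmitter takeaway is the easiest of these to check by hand. A minimal sketch, where the `ready` event name and the payload strings are made up for the example:

```javascript
const { EventEmitter } = require('node:events');

const emitter = new EventEmitter();
const received = [];

// No listener is attached yet, so this emit reaches nobody and is not queued.
const hadListeners = emitter.emit('ready', 'too early');

// Attach the listener, then emit again. emit() calls listeners synchronously,
// so `received` is already updated by the time emit() returns.
emitter.on('ready', (message) => received.push(message));
emitter.emit('ready', 'on time');

console.log(hadListeners); // false
console.log(received);     // [ 'on time' ]
```

`emit()` returns `false` when no listener was registered, which is the only trace the lost event leaves behind.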

---

Run this script:

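const fs = require('node:fs');
const start = Date.now();

// Schedule a timer with a 100ms threshold.
setTimeout(() => {
  console.log(`timer fired at ${Date.now() - start}ms`);
}, 100);

// Start an asynchronous read; the walkthrough assumes it takes 95ms to complete.
fs.readFile(__filename, () => {
  const cb = Date.now();
  // The I/O callback then holds the thread for a further 10ms.
  while (Date.now() - cb < 10) { /* busy 10ms */ }
});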

If your mental model says "the timer fires at 100ms," your mental model is wrong. Assume the read takes 95ms to complete (on a real machine a file this small usually returns much sooner, so treat the figure as the walkthrough's premise). The timer then fires at 105ms. That extra 5ms is not jitter. It is not noise. It is the runtime behaving exactly as specified. The official Node.js documentation works through this example line by line, and the line that matters is: "When the event loop enters the poll phase, it has an empty queue (fs.readFile() has not completed), so it will wait for the number of ms remaining until the soonest timer's threshold is reached. While it is waiting 95 ms pass, fs.readFile() finishes reading the file and its callback which takes 10 ms to complete is added to the poll queue and executed. When the callback finishes, there are no more callbacks in the queue, so the event loop will see that the threshold of the soonest timer has been reached then wrap back to the timers phase to execute the timer's callback." 105ms. Not 100ms.

I started out believing the event loop was a FIFO queue of pending callbacks, drained in order, with timers pre-empting whatever was running every time their threshold elapsed. That is the model the word "loop" suggests, and it is wrong in a specific way that affects every timeout you have ever written. Reading the official Node.js documentation on the event loop pushed me toward the correct model, which is that the loop is a six-phase, kernel-synchronized multiplexer in which *the poll phase is the kingmaker*. Timers fire when the poll phase lets them. I/O happens in the poll phase. Everything else is setup, teardown, or bookkeeping.
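The "timers do not pre-empt" half of that correction can be observed without any I/O at all. In this small sketch the 10ms and 50ms figures are arbitrary choices: a timer asks for 10ms, but synchronous code holds the thread for about 50ms, so the loop cannot reach the timers phase until that code returns.

```javascript
const start = Date.now();

// Ask for a callback after 10ms and record when it actually runs.
const fired = new Promise((resolve) => {
  setTimeout(() => resolve(Date.now() - start), 10);
});

// Synchronous work that occupies the JavaScript thread for roughly 50ms.
// The timer's threshold elapses during this loop, but nothing interrupts it.
while (Date.now() - start < 50) { /* busy */ }

fired.then((elapsed) => {
  // elapsed is at least 50 here; the exact value varies from run to run.
  console.log(`10ms timer fired after ${elapsed}ms`);
});
```

The threshold is a lower bound on when the callback may run, not a schedule.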

Why the model has to exist at all

Before there is a loop, there is a reason for one. Rod Vagg's "Why Asynchronous?" essay opens with a table of I/O latencies that, if you have not internalized them, will re-frame how you read every line of JavaScript you have ever written. The table is in nanoseconds:

| Operation                          |        Time (ns) |
|------------------------------------|-----------------:|
| L1 cache reference                 |                1 |
| L2 cache reference                 |                4 |
| Main memory reference              |              100 |
| SSD random-read                    |           16,000 |
| Round-trip in same datacenter      |          500,000 |
| Physical disk seek                 |        4,000,000 |
| Round-trip from US to EU           |      150,000,000 |

Physical disk I/O is four million times slower than an L1 cache reference. A cross-Atlantic network round-trip is one hundred and fifty million times slower than L1. These ratios are the economic argument for everything Node does. If I/O were fast, you would not need an event loop. You would call readFileSync, wait a few microseconds, and move on. I/O is not fast. I/O is the bottleneck of every program you have ever written that talks to a database, an HTTP service, or a disk.
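The ratios quoted above fall straight out of the table. A short sketch that recomputes them, with the figures copied from the table and the variable names chosen for the example:

```javascript
// Latencies in nanoseconds, copied from the table.
const latencyNs = {
  'L1 cache reference': 1,
  'L2 cache reference': 4,
  'Main memory reference': 100,
  'SSD random-read': 16000,
  'Round-trip in same datacenter': 500000,
  'Physical disk seek': 4000000,
  'Round-trip from US to EU': 150000000,
};

// Express every operation as a multiple of an L1 cache reference.
const baseline = latencyNs['L1 cache reference'];
const ratios = Object.fromEntries(
  Object.entries(latencyNs).map(([operation, ns]) => [operation, ns / baseline])
);

for (const [operation, ratio] of Object.entries(ratios)) {
  console.log(`${operation.padEnd(32)} ${ratio.toLocaleString('en-US')}x L1`);
}
```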

The thesis Vagg draws from the table is the thesis of Node: "Node.js is fast because programmers are forced to write fast programs by not introducing blocking I/O to the program flow." Other platforms let you write blocking code "if you go out of your way." Node makes the non-blocking path the path. The fs.*Sync() methods exist as the named escape hatch, and their existence-as-exception is itself the message: in most platforms, blocking is the default and non-blocking is opt-in; in Node, blocking is opt-in and non-blocking is default.

The model, in one diagram

The runtime's job is to keep the JavaScript thread busy on CPU work while I/O operations finish out-of-band. The mechanism is libuv, the C library that handles "the queueing and processing of asynchronous events." libuv's event loop is a six-stop subway line:

flowchart TD
    Start([loop start]) --> Timers["timers<br/>setTimeout / setInterval"]
    Timers -->|nextTick drained| Pending["pending callbacks<br/>deferred TCP errors, etc."]
    Pending --> Idle["idle, prepare<br/>(internal use)"]
    Idle --> Poll["poll<br/>retrieve new I/O events;<br/>execute I/O callbacks;<br/>block here when appropriate"]
    Poll -->|nextTick drained| Check["check<br/>setImmediate() callbacks"]
    Check --> NextTick["nextTick drained<br/>between every phase"]
    NextTick --> Close["close callbacks<br/>socket.on('close', ...)"]
    Close --> Timers
    Poll -.setImmediate pending.-> Check
    Poll -.poll queue empty, no immediate.-> BlockWait["block until new I/O<br/>or timer threshold"]
    BlockWait --> Poll
    Poll -.timer threshold reached.-> Timers

The six phases are not arbitrary. They correspond to distinct kernel event sources:

timers is the timer wheel — the bucket of setTimeout / setInterval callbacks whose thresholds have elapsed.
pending callbacks is for system operations that some Unix variants want to defer to the next loop iterati

11m / Article + audio + video
