Five Million Pages Deep, the Database Stops
The expensive part of pagination is not the page the user actually fetches. It is the pages they almost fetched, the rows that moved under the cursor while they were scrolling, and the keys that every paginated query has to touch at once. Those are the three edges that turn a UI choice into a database incident, and they are the edges nobody writes tests for.
Key Takeaways
- Deep offset is a linear scan with extra steps. Page 50,000 of a 100M-row table is not a query; it is a reading assignment the database accepted on your behalf.
- Cursor pagination does not eliminate the cost — it concentrates it. Every cursor walks the same hot prefix of the index, and the prefix becomes a write-amplification target.
- Write-during-scroll consistency is the contract nobody reads. Rows that move between page requests create duplicates and gaps the user notices and the tests do not.
- The "total count" field is the most expensive number your API exposes, and most APIs expose it for free in their documentation and at full cost in their storage layer.
---
Six seconds. That is the number that opens the incident report. Page 50,000 of a paginated endpoint on a 100-million-row table, p99 latency 6,012 ms, three other queries blocked behind it, the replication lag climbing. The team that owns the endpoint has never seen a user click that deep. The crawler has. The crawler is the only thing that has ever asked for page 50,000, and the crawler is asking for it 4,000 times a minute.
The fix is not "add a covering index." The index is already there. The fix is to read the contract that the endpoint is honoring, name the cost that contract imposes on the storage layer, and decide whether the caller is paying that cost or whether the storage layer is paying it on the caller's behalf.
This chapter is about the second part. The contract was signed in the API definition. The cost is paid in the storage engine. The user is never the one who discovers the gap.
---
Edge one: the deep page is not a page
The most quoted metric in pagination performance is "the cost is linear in the offset." The number is correct and the framing is incomplete. The cost is linear in the offset, *and* the cost is incurred whether or not the user fetches the page. The EXPLAIN plan from chapter one is not the plan of a page; it is the plan of a *reading assignment*. The database reads 9,998,000 rows, throws away 9,998,000 minus 20 of them, and returns 20. The work that the database is willing to do is governed by the offset, not by the limit.
This matters because most engineering teams measure the cost of the page the user actually fetched, not the cost of the page the user almost fetched. Page 1 returns in 4 ms. Page 100 returns in 80 ms. Page 10,000 returns in 800 ms. The user is on page 10. The cost the team is paying for page 10 is the cost of page 1, which is fine. The cost the team *might* be paying for page 10,000 is the cost the database is willing to spend on page 10,000 — and the database does not know that the user is not on page 10,000. The database is *ready* to spend that cost. The cost is in the plan, not in the request.
A depth charge is a weapon that sits on the seafloor and does nothing until something passes over it. Then it explodes. The deep-page EXPLAIN plan is a depth charge. It does nothing until a user — or a crawler, or a bored engineer with psql — passes over it. Then the storage layer fires. The whole point of the design is that the cost is paid on demand, but the demand is not the user's demand; it is the contract's demand, and the contract does not know the difference.
The mitigation is well known: cursor pagination, keyset predicates, "do not let the offset grow unbounded." What is less well known is that the mitigation only works if the sort order is indexed, and only if the cursor is a tuple of indexed columns, and only if the caller never asks "how many pages are there?" — because the moment the caller asks, the total count becomes the new depth charge. The total count is the deepest page of all, and it is one page the user almost always fetches.
Edge two: the cursor that creates a hot key
You migrate the infinite feed to cursor pagination. The p99 latency on page 1 drops from 200 ms to 8 ms. The team celebrates. Three days later, ops pages you: the database is hot on one index range, the write throughput is halved, replication is falling behind, and the only thing that changed is the pagination scheme.
What happened is straightforward and entirely predictable. The cursor is a tuple of (created_at, id). Every paginated request on the feed walks the same hot prefix of the index — the most recent rows. The reads are clustered on the same index range. Worse, the cursor asks for "rows that sort strictly before the last one I gave you." The last one I gave you is *the most recent row the user has seen*, which is *the row at the leading edge of the index*, which is *the row that every write in the system is also touching*, because every new insert lands at the leading edge.
Read amplification concentrates at the leading edge. Write amplification concentrates at the leading edge. The leading edge of the index is now the most contested 16 KB page in the database. The cursor did not introduce the contention — the *choice of sort order* did. The cursor made the contention visible by routing every read to the same prefix. The team traded "every user pays linear cost on demand" for "every user pays constant cost, but every user pays it on the same rows."
This is the trade that cursor pagination makes, and it is the trade that nobody documents. The cursor is a trail-marker, but the trail everyone is following is the same trail, and the trail-marker has to live at the head of the trail. The head of the trail is where the writes are. The cursor is, by construction, a hot key. The hot key is, by construction, a write-amplification target. The write amplification is, by construction, a storage engine problem, and the storage engine does not care that you thought you were solving a frontend problem.
The mitigation is harder than the offset mitigation. You can shard the index, you can decouple the read path from the write path with a snapshot, you can introduce a fan-out, you can change the sort order so that the leading edge is not where the writes are. None of these are "pagination library" fixes. They are storage-engine fixes, applied to a problem that arrived in the codebase labeled as a frontend bug.
Edge three: the row that moved
The third edge is the one that the tests never cover, because the tests run in transactions and the rows do not move while the transaction is open. In production, the rows move.
sequenceDiagram
participant U as User
participant A as API
participant D as Database
U->>A: GET /feed?cursor=A
A->>D: SELECT ... WHERE (ts,id) < A LIMIT 20
D-->>A: rows [A, A1, A2, ..., A19]
A-->>U: page 1
Note over D: meanwhile: row R is inserted<br/>between A1 and A2
U->>A: GET /feed?cursor=A19
A->>D: SELECT ... WHERE (ts,id) < A19 LIMIT 20
D-->>A: rows [A20..A38, R inserted at correct position]
A-->>U: page 2 (includes R)
U->>A: GET /feed?cursor=A38
A->>D: SELECT ... WHERE (ts,id) < A38 LIMIT 20
D-->>A: rows [A39..A57]
A-->>U: page 3 (no R — R is already past)
The timeline above is the cursor contract *honored*. Row R is inserted between pages 1 and 2. Page 2 includes R at the correct position. Page 3 does not duplicate R because R is now strictly before the cursor on page 3. The contract is stable. The user has not seen a duplicate. The user has not seen a gap. The contract is doing what it promised.
Now the same timeline with the offset contract:
sequenceDiagram
participant U as User
participant A as API
participant D as Database
U->>A: GET /feed?page=1&size=20
A->>D: SELECT ... ORDER BY ts DESC LIMIT 20 OFFSET 0
D-->>A: rows [A, A1, A2, ..., A19]
A-->>U: page 1 (rows 1-20)
Note over D: meanwhile: row R is inserted at top,<br/>row A0 is deleted
U->>A: GET /feed?page=2&size=20
A->>D: SELECT ... ORDER BY ts DESC LIMIT 20 OFFSET 20
D-->>A: rows [A1, A2, ..., A20] (R is now row 1, A0 is gone)
A-->>U: page 2 (row A1 appears again, R is missing from page 2)
Note over U: user sees: A1, A2 dup. R never seen.<br/>A0 forgotten.
A1 appears on page 1 and again on page 2. R is missing from page 2 because R is now on page 1, and the user already saw page 1. A0 is gone because A0 was on page 1, and the user already saw page 1, and the user's mental model of page 1 still includes A0. The contract is honored. The contract said "the 21st through 40th rows." The 21st through 40th rows are exactly what the contract promised. The user is confused. The tests pass. The user is right to be confused.
This is the consistency edge. The offset contract does not promise stability. The user assumed stability. The user is reading the *caller's* contract, which is "give me the next 20 I have not seen." The API is honoring the *storage* contract, which is "give me the 21st through 40th." The two contracts are different, and the difference is the bug.
The mitigation is "switch to cursor." Switching to cursor changes the storage contract to match the caller's contract, and the user's confusion goes away. The cost is the hot key from edge two. The trade is real.
---
What the "free" total count actually costs
The "total count" field at the bottom of a paginated UI is the most expensive number in the system, and most APIs expose it for free. To return a total count, the storage layer has to compute a number that may not be indexable, may not be cacheable, may not be stable across requests, and may not be cheap to keep fresh. A COUNT(*) against a filtered table is a sequential scan, or it is an index-only scan, or it is a maintained counter, and all three of those are different shapes of expense.
A maintained counter is the only honest answer at scale, and a maintained counter is a different system from the table it counts. It has its own consistency model, its own lag, its own failure mode. The "you have 1,247 unread items" indicator is a number that came from somewhere, and the somewhere is a system that is allowed to be wrong for a few seconds, and the contract that exposes the number is allowed to be wrong for a few seconds, and the documentation should say so, and it almost never does.
This is the last edge, and it is the edge that exposes the contract the most visibly. The total count is the only number in the paginated UI that the user reads as a fact. The fact is, in most systems, an estimate. The estimate is, in most systems, presented as a fact. The user is making decisions based on a number the API does not promise.
---
Bringing the three edges together
The three edges are not separate bugs. They are three views of the same gap between the contract the API exposes and the cost model the storage layer can deliver. The deep page is the cost model refusing to honor the contract at scale. The hot key is the storage layer paying a constant cost that turns out to be non-uniform. The row that moved is the contract refusing to honor the caller's mental model.
The librarian from the introduction had two ways of answering the question. Both librarians were honest. The first librarian's honesty was the source of every offset bug I have ever debugged. The second librarian's honesty was the source of every cursor hot key I have ever been paged about. The contract was correct in both cases. The contract was the source of the cost in both cases.
The next chapter takes the two librarians off the shelf and turns the question back to you: which promise can your storage layer keep, which promise does your caller assume, and what do you do in the space between?