Chapter 3 of 4

# The Economics of Speed

> Performance is not a metric. It is a line item on the income statement, a multiplier on the cost of customer acquisition, and a structural tax on the engineering organization that compounds for as long as the page stays slow.

## Key Takeaways

- The cost of slowness is paid three times: in lost conversion, in inflated infrastructure spend, and in burned engineering hours.
- A one-hundred-millisecond regression on a high-traffic e-commerce page costs, on the historical evidence, more than the median engineer's annual salary.
- Performance is the only engineering decision whose cost is borne by a different team than the one that shipped the regression.
- Most "performance work" is misclassified. It is not engineering hygiene. It is product work, and it should be funded and prioritized as such.

---

A one-percent drop in revenue per one hundred milliseconds of latency: that is the number. It is the number Greg Linden presented from Amazon's A/B test data in 2006, and it has been quietly doing its work ever since. At Amazon's 2008 revenue of roughly nineteen billion dollars, one percent was a nine-figure annual number. The number has been broadly corroborated elsewhere: Google's own experiments, reported in 2006, found that a five-hundred-millisecond increase in result-page latency dropped search traffic by approximately twenty percent. Yahoo found that a four-hundred-millisecond slowdown cost them five to nine percent of full-page traffic. The number, in short, is not a peculiarity of one company. It is a structural property of human attention interacting with a network.
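To make the rule of thumb concrete, here is a minimal sketch of the arithmetic. It assumes the "1% per 100ms" figure applies linearly, which is a simplification, and the function name, the default rate, and the revenue figure in the usage example are all hypothetical.

```python
def annual_revenue_at_risk(
    annual_revenue: float,
    regression_ms: float,
    loss_per_100ms: float = 0.01,
) -> float:
    """Estimate yearly revenue lost to a latency regression.

    Applies the linear "1% per 100ms" rule of thumb. The default rate is an
    assumption taken from the headline figure; real sites should substitute
    a rate measured from their own experiments.
    """
    if annual_revenue < 0 or regression_ms < 0:
        raise ValueError("annual_revenue and regression_ms must be non-negative")
    return annual_revenue * loss_per_100ms * (regression_ms / 100.0)


if __name__ == "__main__":
    # Hypothetical store: $50M in annual revenue, one 100ms regression.
    loss = annual_revenue_at_risk(50_000_000, 100)
    print(f"Estimated annual revenue at risk: ${loss:,.0f}")
```

The point of the sketch is only that the input is a revenue figure, not an engineering one, which is why the result rarely reaches the people who ship the regression.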

Most engineers I have worked with know that number. What they often do not know is what to do with it, because the number lives in a different organizational room than the engineering decisions that produce it. The number is the product manager's number. The decisions are the engineer's decisions. The two rooms do not talk to each other, and that is where the performance paradox lives.

This chapter is the part of the analysis I keep coming back to, because the economics are what convert performance from a "nice to have" into a "must have." The mechanism matters. The mechanism is the work of the previous chapter. The economics are what make the mechanism load-bearing.

### Three invoices, one bug

I think of the cost of slowness as three separate invoices, all of which get paid on the same regression. The first is the most familiar: the user-experience invoice, paid in conversion rate, bounce rate, and customer lifetime value. The second is the infrastructure invoice, paid in CDN egress, compute time, and bandwidth bills. The third is the engineering invoice, paid in the form of the next team that has to debug a page that is mysteriously slow, the support ticket that says "the site feels sluggish today," and the design compromise that the team accepts because the page is already so heavy that nothing else will fit.

The three invoices compound. A 100ms regression does not produce a 100ms effect; it produces a 100ms effect on the user, a 100ms effect on every server in the serving path, and an N×100ms effect on the engineering team that has to deal with the symptoms for the next eighteen months. The arithmetic is the reason performance work has a higher return on investment than almost any other category of engineering work, and it is also the reason performance work is systematically underfunded: the return is distributed across three budgets, and no single budget owns it.

Imagine you are a director of engineering, and your quarterly planning meeting is next week. The product organization has a slate of seven features they want to ship, and your platform organization has a slate of three performance initiatives. The product features are individually well-scoped. The performance initiatives are individually well-scoped. The meeting will go the way these meetings always go: the product features will get prioritized, the performance initiatives will get deprioritized to "next quarter," and the page will get slower by another 200ms over the course of the year. The compounding is the thing. The compounding is what makes this pattern a structural tax rather than a tactical miss.

I started writing this chapter believing the right way to argue for performance was the conversion-rate argument. I have come to think that argument is necessary but insufficient. The conversion-rate argument is correct, but it is the argument the finance team has already heard. The argument that lands is the total-cost-of-ownership argument, which is what the rest of this chapter is built around.

### The conversion-rate invoice, examined

The conversion-rate effect of latency is well-documented and well-replicated. A page that loads in one second has a conversion rate materially higher than a page that loads in five seconds. The size of the effect varies by industry, by device, and by the type of conversion being measured, but the sign of the effect is consistent across the many public studies and case studies that have measured it.

What is less well understood is the *non-linearity* of the effect. The relationship between latency and conversion is a curve, not a straight line, and in the range where most pages live the loss accelerates: a one-second improvement on a five-second page produces a larger conversion lift than a one-second improvement on a two-second page. This is a consequence of human attention, which degrades non-linearly with waiting time. The first second of waiting costs you very little. The third second costs you a lot. The fifth second costs you the user.

This is the reason the famous "100ms = 1%" number is not a small number. The relationship between latency and conversion is a curve, not a line, and the curve is steepest in the range where most shipping pages live. A page that loads in 4.2 seconds — the page from the previous chapter — is in the steepest part of the curve. The marginal 100ms at 4.2 seconds is worth more than the marginal 100ms at 1.5 seconds, by an order of magnitude.
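The shape described above, where early seconds cost little and later seconds cost a lot, can be sketched with an S-shaped curve. The logistic form and every parameter below (`base_rate`, `midpoint_s`, `steepness`) are assumptions chosen for illustration and are not fitted to any data. The sketch shows only why the same 100ms can carry very different prices at different points on such a curve.

```python
import math


def conversion_rate(
    latency_s: float,
    base_rate: float = 0.04,   # hypothetical conversion rate at near-zero latency
    midpoint_s: float = 4.5,   # hypothetical latency at which half the users are lost
    steepness: float = 1.5,    # hypothetical; controls how sharply attention collapses
) -> float:
    """Illustrative logistic model of conversion as a function of load time."""
    return base_rate / (1.0 + math.exp(steepness * (latency_s - midpoint_s)))


def marginal_loss(latency_s: float, delta_s: float = 0.1) -> float:
    """Conversion lost by adding `delta_s` seconds at a given starting latency."""
    return conversion_rate(latency_s) - conversion_rate(latency_s + delta_s)


if __name__ == "__main__":
    for start in (1.5, 4.2):
        print(f"extra 100ms at {start}s costs {marginal_loss(start):.5f} in conversion rate")
```

Under these made-up parameters the marginal loss at 4.2 seconds is much larger than at 1.5 seconds. Past the midpoint a logistic curve flattens again, because most of the users who were going to leave have already left. The real curve for any given site has to be measured, not assumed.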

```mermaid
graph LR
    A[Latency regression] --> B[User invoice: lost conversion]
    A --> C[Infrastructure invoice: more compute, more egress]
    A --> D[Engineering invoice: debug time, support load]
    B --> E[Quarterly revenue impact]
    C --> F[Cloud bill impact]
    D --> G[Engineering velocity drag]
    E --> H[Compounded annual cost]
    F --> H
    G --> H
```

### The infrastructure invoice, examined

The infrastructure cost of a slow page is more direct than the conversion cost, but it is less discussed. A page that ships 2MB of JavaScript is not just slow on the user side; it is expensive on the server side. The server has to send the bytes. The CDN has to cache the bytes. The client has to download the bytes. Every byte, on every request, from every user, paid in cash.

For a page that serves ten million requests per day, an extra 100KB of JavaScript per request is roughly a terabyte of egress per day (ten million requests times one hundred thousand bytes), before caching and compression are taken into account. At typical CDN pricing, that is a non-trivial monthly line item. For a page that serves one hundred million requests per day, the same 100KB is a line item that gets its own budget meetings.
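The egress arithmetic is simple enough to check in a few lines. This is a sketch under stated assumptions: it treats every request as transferring the full extra payload (no browser cache hits, no compression savings), so it is an upper bound, and the per-gigabyte price in the usage example is a hypothetical placeholder, not a quote from any provider.

```python
def daily_egress_bytes(requests_per_day: int, extra_bytes_per_request: int) -> int:
    """Extra bytes served per day if every request carries the extra payload."""
    return requests_per_day * extra_bytes_per_request


def monthly_egress_cost(
    requests_per_day: int,
    extra_bytes_per_request: int,
    price_per_gb: float,
    days: int = 30,
) -> float:
    """Upper-bound monthly cost of the extra payload, using decimal gigabytes."""
    gigabytes_per_day = daily_egress_bytes(requests_per_day, extra_bytes_per_request) / 1e9
    return gigabytes_per_day * days * price_per_gb


if __name__ == "__main__":
    # Figures from the text: ten million requests per day, 100KB extra each.
    per_day = daily_egress_bytes(10_000_000, 100_000)
    print(f"Extra egress per day: {per_day / 1e12:.1f} TB")
    # Hypothetical price of $0.05 per GB, for illustration only.
    print(f"Upper-bound monthly cost: ${monthly_egress_cost(10_000_000, 100_000, 0.05):,.0f}")
```

Swapping in a measured cache-hit ratio and compressed transfer size would give a number a finance team can check against the actual bill.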

The infrastructure cost is the easiest of the three invoices to measure, and the easiest to make visible to a finance team. The same is not true of the engineering invoice, which is the most expensive of the three and the hardest to quantify.

### The engineering invoice, examined

The engineering cost of a slow page is paid in two forms. The first is direct: every minute spent debugging a slow page is a minute not spent shipping a feature. The second is structural: a slow page makes every other engineering decision harder, because every feature has to fit into a page that is already over budget.

The structural cost is the one I want to call out, because it is the one that compounds. A team that has let their page grow to 2MB of JavaScript is not the same team that started with 200KB. The team that started with 200KB could ship a feature that added 50KB and stay under budget. The team at 2MB cannot ship a feature that adds 50KB without going further over budget, and so they ship the feature anyway, and the budget grows, and the next feature is harder to ship, and the team gets slower, and the cycle repeats. This is the same loop from chapter 0, drawn at the level of the engineering team.
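The difference between the two teams can be sketched as a budget check of the kind a build pipeline might run. The `BudgetCheck` type and the 300KB budget are hypothetical; the text gives the page sizes and the 50KB feature but does not state a budget figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetCheck:
    """Result of checking a proposed change against a JavaScript size budget."""

    current_kb: int
    added_kb: int
    budget_kb: int

    @property
    def projected_kb(self) -> int:
        return self.current_kb + self.added_kb

    @property
    def headroom_kb(self) -> int:
        # Negative headroom means the page is over budget after the change.
        return self.budget_kb - self.projected_kb

    @property
    def within_budget(self) -> bool:
        return self.headroom_kb >= 0


if __name__ == "__main__":
    BUDGET_KB = 300  # hypothetical budget; not a figure from the text
    for label, current_kb in (("team at 200KB", 200), ("team at 2MB", 2000)):
        check = BudgetCheck(current_kb=current_kb, added_kb=50, budget_kb=BUDGET_KB)
        status = "within budget" if check.within_budget else "over budget"
        print(f"{label}: {check.projected_kb}KB projected, headroom {check.headroom_kb}KB ({status})")
```

For the first team the check is a usable gate. For the second it fails on every change, which is the point at which teams tend to stop running it.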

The engineering cost of slowness is, in my experience, larger than the conversion cost and the infrastructure cost combined. It is also the cost that is hardest to make visible in a quarterly planning meeting, because it does not appear on any single line item. It appears in the form of features that took longer than they should have, in the form of bugs that were harder to debug than they should have been, in the form of a team that is tired in a way that has no single cause.

### The funding problem

The funding problem is the reason performance work is systematically underfunded, and it is worth naming directly. Performance work is a public good. Its benefits accrue to the entire organization — to product, to infrastructure, to engineering velocity — but no single budget line item claims it. The team that ships the performance improvement does not capture the conversion lift. The team that captures the conversion lift does not pay for the performance improvement. The team that pays for the performance improvement is the team that, in most organizations, has the least political capital to spend on it.

The way out of the funding problem is to treat performance as product work, not as platform work. The argument is straightforward: a 100ms improvement to the page is, on the evidence above, a measurable improvement to the product. The improvement is product-shaped, not platform-shaped. It should be prioritized in the product roadmap, funded from the product budget, and measured with the same rigor as any other product investment.

The teams that are consistently fast have, almost without exception, made this organizational change. They have moved performance work out of the platform organization and into the product organization. They have given it a product manager. They have given it OKRs. They have given it a seat at the planning table. The work has not gotten easier. The work has gotten funded.

### The compounding, in one sentence

Performance is the only engineering decision whose cost compounds across user, business, and infrastructure simultaneously, and the compounding is the reason the funding problem is the most important problem to solve. The compounding is also the reason the next chapter exists. The mechanism is in the previous chapter. The economics are in this one. What is left is the discipline — the meta-pattern of how a team thinks about performance in a way that survives the next reorg and the next product cycle and the next framework migration.

---

References:

Greg Linden — "Make Data Useful" (Amazon A/B test data, 2006/2008)
Marissa Mayer — "In search of... a faster web" (Google latency experiments, 2006)
BBC — "The need for speed: BBC News online" (Steve Souders, BBC case study, 2010)
Pinterest Engineering — "A faster Pinterest: rebuilding the home feed" (2016)
HTTP Archive — State of the Web reports (annual web-performance data)

---

Most engineers I have spoken to assume the next chapter is going to be a checklist. It is not. Checklists are what you do when you do not have a discipline. The discipline is what you have when the checklist no longer matters, and that is the harder thing to build.