Retry Patterns and Idempotency Keys
An idempotency key turns "retry on failure" from a hazard into a guarantee.
Key Takeaways
- Retries are mandatory; duplicates are the cost
- An idempotency key is a unique ID per logical operation
- The server stores the first response and replays it on duplicates
- Combine idempotency with backoff and a circuit breaker
The Bug
Your payment service calls the bank. The card is charged, but the response times out. The client retries. The card is charged again. The customer is double-billed.
The network is unreliable. Retries are mandatory. But retries duplicate work.
The Pattern
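The client attaches a unique idempotency key to each logical operation and sends the same key on every retry of that operation. The server uses the key as a deduplication token: it looks the key up; on a hit, it replays the stored response; on a miss, it executes the operation, stores the response with a TTL, and returns it. Stripe popularized this approach, and the Idempotency-Key HTTP header is now a widely adopted convention that is described in an IETF Internet-Draft.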
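The lookup, replay, and store steps can be sketched as a small in-memory store. This is a minimal illustration under stated assumptions, not a production design: the names `IdempotencyStore`, `execute`, and `charge_card` are hypothetical, the store lives in a single process, and it is not thread-safe. A real server would need an atomic check-and-insert in a shared database or cache.

```python
import time
from typing import Callable, Dict, Tuple


class IdempotencyStore:
    """Hypothetical in-memory map: idempotency key -> (response, expiry time)."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[object, float]] = {}

    def execute(self, key: str, operation: Callable[[], object]) -> object:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            # Hit: replay the stored response without running the operation.
            return entry[0]
        # Miss (or expired entry): run the operation and remember the result.
        response = operation()
        self._entries[key] = (response, now + self._ttl)
        return response


# Illustrative stand-in for the call to the bank.
charges = []


def charge_card() -> dict:
    charges.append(50)
    return {"status": "charged", "charge_number": len(charges)}


store = IdempotencyStore()
first = store.execute("order-1234", charge_card)
retry = store.execute("order-1234", charge_card)  # same key: replayed
```

After both calls, `charges` holds a single entry and `retry` is the same response as `first`, which is the behavior the retrying client relies on.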
Cooperate
Idempotency makes retries *safe*. Two more pieces make them *polite*:
- Exponential backoff with jitter prevents thundering herds.
- A circuit breaker stops the client from hammering a failing service.
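As a rough sketch of these two pieces, the block below computes a "full jitter" backoff delay and implements a very small circuit breaker. The function and class names, the default thresholds, and the injectable `clock` are all assumptions made for illustration; production breakers usually track more state than this.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    """Full jitter: a uniform random delay in [0, min(cap, base * 2**attempt)]."""
    source = rng or random
    return source.uniform(0.0, min(cap, base * (2 ** attempt)))


class CircuitBreaker:
    """Opens after consecutive failures; allows a trial call after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: permit a trial request only once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None  # close the breaker again

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or restart the cool-down
```

A caller would check `allow_request()` before each attempt, report the outcome with `record_success()` or `record_failure()`, and sleep for `backoff_delay(attempt)` between attempts. The random delay spreads out clients that failed at the same moment, so they do not all retry together.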
Retriable: 5xx, 408, 429, and network timeouts. Other 4xx responses are not retriable: the request itself is wrong, and resending it unchanged will fail the same way.
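The retry rule can be written as a small predicate and wired into a retry loop that reuses one key for every attempt. This is a sketch under assumptions: `send` is a hypothetical callable that takes the idempotency key and returns an HTTP status code, or `None` when no response arrived (a network timeout), and the attempt count and delay parameters are arbitrary.

```python
import random
import time
import uuid
from typing import Callable, Optional

RETRIABLE_4XX = {408, 429}


def is_retriable(status: Optional[int]) -> bool:
    """True for 5xx, 408, 429, and for None (no response: network timeout)."""
    if status is None:
        return True
    if 500 <= status <= 599:
        return True
    return status in RETRIABLE_4XX


def call_with_retries(send: Callable[[str], Optional[int]],
                      max_attempts: int = 4, base: float = 0.5,
                      cap: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    # One key per logical operation, generated once and reused on every attempt.
    key = str(uuid.uuid4())
    status: Optional[int] = None
    for attempt in range(max_attempts):
        status = send(key)
        if not is_retriable(status):
            return status  # success, or an error that retrying cannot fix
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter before the next attempt.
            sleep(random.uniform(0.0, min(cap, base * 2 ** attempt)))
    return status  # still failing after the last attempt
```

Generating the key outside the loop is the important design choice here: a fresh key per attempt would make every retry look like a new operation to the server and defeat deduplication.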
Boundaries
- TTL: keys must outlive the longest plausible retry window (~24h), but they do not need to be kept forever.
- Scope: per-account or per-resource, not global. The same key value sent by two different accounts must map to two separate records.
- Ambiguous outcomes: a request that timed out may still have succeeded. Treat it as "pending" until a status check confirms the outcome.
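One way to picture these three boundaries together is a store whose records are scoped by account, expire after a TTL, and start out as "pending" until the operation is confirmed. The class and method names below (`ScopedKeyStore`, `begin`, `complete`, `status`) are hypothetical, and the in-memory dictionary stands in for whatever shared storage a real service would use.

```python
import time
from typing import Callable, Dict, Optional, Tuple

PENDING = "pending"
DONE = "done"
UNKNOWN = "unknown"


class ScopedKeyStore:
    """Hypothetical store: records keyed by (account_id, key), with a TTL."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # (account_id, key) -> (state, response, expiry time)
        self._records: Dict[Tuple[str, str], Tuple[str, Optional[object], float]] = {}

    def _evict_expired(self) -> None:
        now = self._clock()
        for scoped in [k for k, r in self._records.items() if r[2] <= now]:
            del self._records[scoped]

    def begin(self, account_id: str, key: str) -> bool:
        """Reserve the key as pending. False means a live record already exists."""
        self._evict_expired()
        scoped = (account_id, key)
        if scoped in self._records:
            return False
        self._records[scoped] = (PENDING, None, self._clock() + self._ttl)
        return True

    def complete(self, account_id: str, key: str, response: object) -> None:
        """Mark the operation as confirmed and store its response."""
        self._records[(account_id, key)] = (DONE, response, self._clock() + self._ttl)

    def status(self, account_id: str, key: str) -> Tuple[str, Optional[object]]:
        """Status check for ambiguous outcomes: pending, done, or unknown."""
        self._evict_expired()
        record = self._records.get((account_id, key))
        if record is None:
            return (UNKNOWN, None)
        return (record[0], record[1])
```

A client whose request timed out would call `status` with the same account and key: "pending" means wait and check again, "done" returns the stored response, and "unknown" means no live record exists, either because the request never arrived or because the TTL has passed.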
Distributed operations need identity beyond their network packets. Idempotency keys give operations that identity.