Worker Stability Test

Real Claude CLI e2e — 1 chapter

Retries Are Easy; Idempotency Is the Whole Game

Distributed workers don't fail because they retry too much. They fail because their retries aren't safe to repeat.

Key Takeaways

  • Assume every task that touches external state will eventually run twice. Make the second run harmless.
  • Idempotency keys are the default. Conditional writes are the fallback. Fencing tokens are the last resort.
  • Don't paper over a duplicate-execution bug with more retries; fix the safety boundary.
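
To make the second and third options in the list above concrete, here is a minimal Python sketch of a conditional write and a fencing-token check. Both classes (`VersionedStore`, `FencedStore`) and their methods are hypothetical in-memory stand-ins, not a real database or lock-service API. They only illustrate the shape of each safety boundary.

```python
class VersionedStore:
    """Hypothetical in-memory store with conditional (compare-and-set) writes."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def write_if_version(self, key, expected_version, value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False  # an earlier attempt (or another worker) already wrote
        self._data[key] = (current_version + 1, value)
        return True


class FencedStore:
    """Hypothetical store that rejects writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token, value):
        # Tokens are assumed to be issued in increasing order by a lock service.
        if token < self.highest_token:
            return False  # writer lost ownership; refuse its late write
        self.highest_token = token
        self.value = value
        return True


# Conditional write: the retry reuses the version it read before the crash.
store = VersionedStore()
version, _ = store.read("order-7")
first_attempt = store.write_if_version("order-7", version, "shipped")
retry = store.write_if_version("order-7", version, "shipped")

# Fencing token: a paused old worker wakes up after a newer one has written.
fenced = FencedStore()
new_owner = fenced.write(34, "from new worker")
old_owner = fenced.write(33, "from paused old worker")
```

In this sketch the retry and the stale worker are both refused rather than applied a second time. A real store would have to perform the check and the write atomically, which an in-memory class only simulates.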

Imagine your worker sends a charge to the payment provider, then crashes 200 ms before committing the result. The supervisor restarts it, and the new attempt charges the customer again. Same task, same code, two charges: that is not a flaky network, it is a missing idempotency contract.

The fix is not "retry smarter." The fix is making the side effect repeatable.
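
The crash-and-restart scenario can be sketched with an idempotency key. The example below is an illustration under stated assumptions: `FakePaymentGateway`, `run_payment_task`, and the `payment:{task_id}` key format are hypothetical names, and the gateway is an in-memory stand-in for a provider that deduplicates on a caller-supplied key.

```python
from dataclasses import dataclass, field


@dataclass
class FakePaymentGateway:
    """Hypothetical stand-in for a provider that dedupes on an idempotency key."""

    charges: dict = field(default_factory=dict)  # idempotency_key -> receipt

    def charge(self, idempotency_key, customer, cents):
        # Already done? Return the stored receipt instead of charging again.
        if idempotency_key in self.charges:
            return self.charges[idempotency_key]
        receipt = {
            "customer": customer,
            "cents": cents,
            "charge_no": len(self.charges) + 1,
        }
        self.charges[idempotency_key] = receipt
        return receipt


def run_payment_task(gateway, task_id, customer, cents):
    # The key is derived from the task, not the attempt, so every retry reuses it.
    return gateway.charge(f"payment:{task_id}", customer, cents)


gateway = FakePaymentGateway()
# First attempt: the charge lands, then the worker crashes before committing.
first = run_payment_task(gateway, "task-42", "alice", 1999)
# Supervisor restart: same task, same key, so no second charge is created.
second = run_payment_task(gateway, "task-42", "alice", 1999)
```

Both attempts get the same receipt and the gateway holds a single charge. The important design choice is that the key identifies the task rather than the attempt. In a real system the lookup and the insert would also need to be atomic, for example through a unique constraint.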

flowchart LR
    A[Task arrives] --> B{Already done?}
    B -- Yes --> C[Return 
