Claude's Cache Has a Hidden 62.5-Minute Rule That Could Save or Cost You Thousands
Summary
Claude's prompt cache follows a hidden 62.5-minute break-even rule: if a cached prefix will be reused within that window, refreshing it before expiry is cheaper than letting it lapse and paying the full rewrite cost. The threshold comes from a fixed price ratio, so it is the same for every model, but the dollar amounts at stake can reach thousands for large prefixes on expensive models such as Opus 4.7. Context compaction carries its own hidden cost and can need more than 65 future turns to break even.
Key Points
- A model-independent '62.5-minute rule' governs Claude's prompt cache: if a cached prefix will be needed again within 62.5 minutes, refreshing it with a keep-alive read is cheaper than letting it expire and rewriting it, regardless of model tier or prefix size.
- The break-even point follows from the fixed ratio of cache write cost (1.25x base) to cache read/refresh cost (0.10x base): one rewrite costs as much as 1.25 / 0.10 = 12.5 refreshes, which, assuming one refresh per default 5-minute cache lifetime, comes to 62.5 minutes. Token count and model pricing cancel out of that ratio, though the actual dollar impact grows with larger prefixes and pricier models like Opus 4.7.
- Context compaction is not cost-free. Its break-even depends entirely on the compression ratio achieved: a 10:1 compression needs at least 8 future turns to recover its cost, while a poor ratio near 2:1 can need more than 65 turns, making verbose summaries a potential net loss.
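
The cache arithmetic in the first two key points can be checked with a short sketch. The 1.25x and 0.10x multipliers come from the text; the 5-minute cache lifetime, the function names, and the example prefix size and base price are assumptions made for illustration, not published figures. The model is also idealized: it allows fractional refreshes and ignores the cost of the eventual real request.

```python
# Multipliers from the text, relative to the base input-token price.
CACHE_WRITE_MULTIPLIER = 1.25
CACHE_READ_MULTIPLIER = 0.10

# Assumption: a 5-minute cache lifetime that each read resets.
CACHE_TTL_MINUTES = 5.0


def break_even_minutes(ttl_minutes: float = CACHE_TTL_MINUTES) -> float:
    """Idle time at which keep-alive reads cost as much as one rewrite."""
    refreshes_per_write = CACHE_WRITE_MULTIPLIER / CACHE_READ_MULTIPLIER
    return refreshes_per_write * ttl_minutes


def keep_alive_cost(idle_minutes: float, prefix_tokens: int,
                    base_price_per_mtok: float,
                    ttl_minutes: float = CACHE_TTL_MINUTES) -> float:
    """Dollar cost of keeping a prefix warm with one read per TTL window."""
    refreshes = idle_minutes / ttl_minutes  # idealized: fractional refreshes
    tokens_in_millions = prefix_tokens / 1_000_000
    return refreshes * CACHE_READ_MULTIPLIER * tokens_in_millions * base_price_per_mtok


def rewrite_cost(prefix_tokens: int, base_price_per_mtok: float) -> float:
    """Dollar cost of writing the prefix to the cache again after expiry."""
    tokens_in_millions = prefix_tokens / 1_000_000
    return CACHE_WRITE_MULTIPLIER * tokens_in_millions * base_price_per_mtok


if __name__ == "__main__":
    # Hypothetical inputs: a 100k-token prefix at $5 per million base tokens.
    tokens, price = 100_000, 5.0
    print(f"break-even idle time: {break_even_minutes():.1f} min")
    for idle in (30.0, 62.5, 120.0):
        keep = keep_alive_cost(idle, tokens, price)
        print(f"idle {idle:5.1f} min: keep-alive ${keep:.4f} "
              f"vs rewrite ${rewrite_cost(tokens, price):.4f}")
```

Changing `tokens` or `price` scales both costs by the same factor, so the crossover stays at the same idle time; only the dollar gap on either side of it changes.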