A detailed data-driven investigation reveals that Claude Code's prompt cache TTL silently regressed from 1 hour to 5 minutes around March 6–8, 2026. Analysis of 119,866 API calls across two machines shows a clean 33-day window of 1h-only caching in February, followed by a sharp reversion to 5-minute TTL in early March. The practical impact is severe: any session pause longer than 5 minutes forces a full cache re-creation at 12.5× the read rate, causing 20–32% cost inflation and pushing subscription users into quota limits they had never previously hit. The author requests Anthropic confirm whether this was intentional, clarify the intended TTL default, and consider restoring 1h TTL or making it user-configurable.

7m read timeFrom github.com
Post cover image
Table of contents
SummaryDataCost impactQuota impactHypothesisRequestMethodology

Sort: