Users on Claude Code's Pro Max 5x plan are experiencing quota exhaustion within 1.5 hours despite moderate usage. Investigation points to cache_read tokens potentially counting at full rate against rate limits rather than the expected 1/10 rate, negating prompt caching benefits for quota purposes. Community analysis using a
Table of contents
SummaryEnvironmentData Collection MethodMeasured Token ConsumptionThe ProblemContext Size ProgressionSpecific IssuesReproductionExpected BehaviorActual BehaviorSuggested ImprovementsSort: