[BUG] Pro Max 5x Quota Exhausted in 1.5 Hours Despite Moderate Usage · Issue #45756 · anthropics/claude-code

Users on Claude Code's Pro Max 5x plan are experiencing quota exhaustion within 1.5 hours despite moderate usage. Investigation points to cache_read tokens potentially counting at full rate against rate limits rather than the expected 1/10 rate, negating prompt caching benefits for quota purposes. Community analysis using a telemetry interceptor (claude-code-cache-fix) collected data across multiple quota windows and found that cache_read=0.0x (not counting toward quota) best fits the data, contradicting the full-rate hypothesis. Additional compounding factors include a silent regression reducing cache TTL from 1 hour to 5 minutes (causing frequent expensive cache_create events), context window inflation, background sessions consuming shared quota, and auto-compact spikes. Workarounds suggested include keeping sessions short, using /compact aggressively, and front-loading context in CLAUDE.md.

#anthropic

#claude-code

Apr 12•7m read time•From github.com

Table of contents

Summary Environment Data Collection Method Measured Token Consumption The Problem Context Size Progression Specific Issues Reproduction Expected Behavior Actual Behavior Suggested Improvements

Comment

Bookmark

Copy

Sort: