A practical tutorial on building a tiered model routing system for AI agents that minimizes API costs. The 'cost curve' pattern routes tasks through three tiers: deterministic Python checks (free), Claude Haiku for ambiguous cases (~$0.0001/call), and Claude Sonnet only for semantic judgment (~$0.006/call). Applied to an SEO audit agent, this reduced per-URL cost from $0.006 to effectively $0 for most pages. The tutorial includes full Python implementation of each tier, a router function with backward-compatible tiered/non-tiered modes, graceful fallback on API errors, and unit tests using mocks. The pattern generalizes to customer support, code review, and content moderation agents.

14m read timeFrom freecodecamp.org
Post cover image
Table of contents
What You'll BuildPrerequisitesTable of ContentsThe Problem with Calling Claude on EverythingThe Cost Curve ExplainedProject SetupTier 1: Deterministic PythonTier 2: Claude Haiku for Ambiguous CasesTier 3: Claude Sonnet for Semantic JudgmentThe Router: audit_url()Graceful FallbackTesting the Cost CurveApplying This Pattern to Your AgentWrapping Up

Sort: