Shipping an AI Agent that Lies to Production: Lessons Learned
A detailed account of building and deploying an AI Mentor feature for a coding education platform. The team built an event-driven system using Go, Pub/Sub, and Server-Sent Events to help students debug their code. Key challenges included handling LLM hallucinations in production, writing meaningful tests (evals), managing costs and rate limits, and building reliable agentic systems. The project revealed that AI development is 80% traditional software engineering and 20% AI-specific work, with most of the complexity lying in orchestration rather than in the LLM calls themselves.
Table of contents
- Why AI Mentor?
- Milestones
- The “Help Me!” Button
- Calling LLMs
- Prompts & Context Engineering
- Free-form chat
- Solving Complex Projects
- Fixing the Solution
- QA: Tests & Evals
- Failing on production
- RAG and Sources
- Predictability
- Shifting the Mental Model
- Agentic systems or autonomous agents?
- Where is the complexity?
- Models
- Limits & Costs
- Observability & Tooling
- Moderator
- Encouraging students to ask for help
- The UI
- Outcomes