Addy Osmani argues that the real leverage in AI coding agents lies not in the model itself but in the 'harness' — the scaffolding of prompts, tools, hooks, sandboxes, context policies, and feedback loops wrapped around it. A decent model with a great harness beats a great model with a bad one.

The post introduces 'harness engineering' as a discipline: treating every agent mistake as a permanent signal that tightens the harness (the ratchet principle), designing components by working backwards from desired behaviour, and maintaining AGENTS.md as a concise, failure-traced rulebook. Key harness primitives covered include filesystem/Git for durable state, bash execution in sandboxes, context rot mitigation (compaction, tool-call offloading, skills with progressive disclosure), long-horizon execution patterns (Ralph Loops, planner/generator/evaluator splits, sprint contracts), and hooks as the enforcement layer.

The post also covers the Harness-as-a-Service (HaaS) trend, where SDKs from Anthropic, OpenAI, and others provide the loop and tooling out of the box, and notes that as models improve, harness complexity doesn't shrink — it moves to new failure modes at higher capability ceilings.
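To make the shape of this concrete, here is a minimal sketch of a harness loop in Python. The model itself is stubbed out; what matters is the scaffolding around tool execution — a pre-tool hook acting as the enforcement layer, and a 'ratchet' rule distilled from a hypothetical past failure. All names (`Harness`, `ToolCall`, `no_rm_outside_workspace`, the `/workspace` path) are illustrative assumptions, not APIs from the post or any SDK.

```python
# Illustrative harness sketch: tool dispatch plus a hook layer that can
# veto a tool call before it touches the shell. Names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Harness:
    tools: dict                                   # tool name -> callable
    hooks: list = field(default_factory=list)     # pre-tool checks
    transcript: list = field(default_factory=list)

    def run_tool(self, call: ToolCall) -> str:
        # Hooks are the enforcement layer: any hook can block a call
        # before it executes, and the refusal is logged as feedback.
        for hook in self.hooks:
            ok, reason = hook(call)
            if not ok:
                self.transcript.append(("blocked", call.name, reason))
                return f"BLOCKED: {reason}"
        result = self.tools[call.name](**call.args)
        self.transcript.append(("ok", call.name, result))
        return result


# A 'ratchet' rule: a past agent mistake, frozen into a permanent check
# (hypothetical example -- forbid rm outside the sandboxed workspace).
def no_rm_outside_workspace(call: ToolCall):
    if (call.name == "bash"
            and "rm " in call.args.get("cmd", "")
            and not call.args.get("cwd", "").startswith("/workspace")):
        return False, "rm outside /workspace is forbidden"
    return True, ""


harness = Harness(
    tools={"bash": lambda cmd, cwd="/workspace": f"ran {cmd!r} in {cwd}"},
    hooks=[no_rm_outside_workspace],
)

print(harness.run_tool(ToolCall("bash", {"cmd": "ls", "cwd": "/workspace"})))
# -> ran 'ls' in /workspace
print(harness.run_tool(ToolCall("bash", {"cmd": "rm -rf /tmp/x", "cwd": "/"})))
# -> BLOCKED: rm outside /workspace is forbidden
```

The point of the sketch is the ratchet: the rule lives in the harness, not in the prompt, so the same mistake cannot recur regardless of which model sits inside the loop.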

17 min read · From addyosmani.com
Table of contents

- What is a harness, really?
- The “skill issue” reframe
- The ratchet: every mistake becomes a rule
- Working backwards from behaviour
- Harnesses don’t shrink, they move
- Harness-as-a-Service
- Where this is going
