Long-running AI agents that operate over hours, days, or weeks represent a significant shift from single-session chat-based agents. Three core engineering challenges must be solved: finite context windows, lack of persistent state between sessions, and unreliable self-verification. The post surveys how Anthropic, Cursor, and Google have converged on similar architectural patterns — separating the model loop (brain) from execution sandboxes (hands) and durable session logs — while differing in surface area and productization. Practical patterns covered include checkpoint-and-resume, delegated human approval, memory-layered context, ambient processing, and fleet orchestration. The Ralph loop (a simple bash-based task iteration pattern) is presented as a minimal viable implementation. Key unsolved challenges include cost control, security, alignment drift over many context windows, and the human skill of writing precise enough specs for autonomous execution.
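The Ralph loop mentioned above can be sketched in a few lines of bash. This is a hedged illustration, not the canonical implementation: the `run_agent` function is a hypothetical stand-in for whatever agent CLI you use, and the task-file format is an assumption. The essential idea is that each iteration gives the agent a fresh context window and checkpoints progress, so work survives beyond any single session.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a real agent CLI invocation;
# replace with your actual tool.
run_agent() {
  echo "agent iteration on: $1"
}

# One task per line; completed tasks are removed from the file.
TASKS_FILE="$(mktemp)"
printf '%s\n' "implement parser" "write tests" "update docs" > "$TASKS_FILE"

# The loop: pop one task per iteration, run the agent with a fresh
# context, checkpoint on success, repeat until the list is empty.
while [ -s "$TASKS_FILE" ]; do
  task="$(head -n 1 "$TASKS_FILE")"
  if run_agent "$task"; then
    # Checkpoint: in a real setup this would also be a git commit.
    tail -n +2 "$TASKS_FILE" > "$TASKS_FILE.tmp" && mv "$TASKS_FILE.tmp" "$TASKS_FILE"
  else
    echo "task failed, will retry: $task" >&2
    sleep 1
  fi
done
echo "all tasks done"
rm -f "$TASKS_FILE"
```

Because state lives in the task file (and, in practice, in version control) rather than in the model's context, the loop can be killed and restarted at any point without losing progress.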
Table of contents
- What “long-running” actually means
- Why this matters
- The three walls every long-running agent hits
- The Ralph loop: one of the simpler practitioner versions of long-running agents
- Anthropic: harnesses, then the brain/hands/session split
- Cursor: planners, workers, judges
- Google: long-running agents on the Agent Platform
- Five patterns for long-running agents in production
- So how do you actually build one today?
- There are some real limitations right now
- Where this is going