An experiment in building cognitive architectures for LLM agents using text adventure games as evaluation tasks. The author implements multiple harnesses for Claude to play Anchorhead, starting with a simple chat history approach that works but becomes expensive due to token usage. A memory-based harness with limited context

10 min read · From borretti.me
Table of contents
The Trivial Harness
Memory
Aside: Small Worlds
Future Work
Code
