LangChain has added an autonomous context compression tool to the Deep Agents SDK (Python) and CLI. Rather than compacting at a fixed token threshold, the tool lets the agent itself decide when to compress its context window. Good moments to compact include task boundaries, the point just after results have been extracted from a large context, and the start of a complex multi-step process. The feature is implemented as middleware via `create_summarization_tool_middleware`; it is opt-in in the SDK and enabled by default in the CLI. In testing, agents were conservative about triggering compaction but chose appropriate moments when they did. The feature reflects a broader design philosophy of giving models more control over their own working memory instead of relying on hand-tuned harness rules.
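
To make the contrast between threshold-driven and agent-driven compaction concrete, here is a minimal toy sketch. It does not use the Deep Agents SDK and is not how `create_summarization_tool_middleware` is implemented; `Context`, `compact`, `threshold_policy`, `make_compaction_tool`, and `fake_summarize` are all hypothetical names invented for illustration, and the word-count "tokenizer" is a deliberate simplification.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """Toy stand-in for an agent's message history (hypothetical, not an SDK type)."""
    messages: List[str] = field(default_factory=list)

    def token_count(self) -> int:
        # Crude approximation: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)


def compact(ctx: Context, summarize: Callable[[List[str]], str], keep_last: int = 2) -> None:
    """Replace all but the last `keep_last` messages with a single summary."""
    if len(ctx.messages) <= keep_last:
        return  # nothing old enough to summarize
    old, recent = ctx.messages[:-keep_last], ctx.messages[-keep_last:]
    ctx.messages = [summarize(old)] + recent


def threshold_policy(ctx: Context, summarize: Callable[[List[str]], str], max_tokens: int) -> bool:
    """Harness-driven: compact only once a fixed token budget is exceeded."""
    if ctx.token_count() > max_tokens:
        compact(ctx, summarize)
        return True
    return False


def make_compaction_tool(ctx: Context, summarize: Callable[[List[str]], str]) -> Callable[[], str]:
    """Agent-driven: expose compaction as a tool the model may call when it chooses."""
    def compact_context() -> str:
        before = ctx.token_count()
        compact(ctx, summarize)
        return f"compacted {before} -> {ctx.token_count()} tokens"
    return compact_context


def fake_summarize(msgs: List[str]) -> str:
    # Placeholder for an LLM summarization call (assumption for this sketch).
    return f"[summary of {len(msgs)} messages]"


ctx = Context([
    "user: find the failing test",
    "tool: " + "log line " * 50,
    "assistant: the failure is in test_parse",
    "user: now refactor the parser",
])

# The fixed threshold never fires here, even though a task boundary has passed.
fired = threshold_policy(ctx, fake_summarize, max_tokens=1000)

# With a tool, the agent can compact at the task boundary regardless of size.
tool = make_compaction_tool(ctx, fake_summarize)
report = tool()
```

In this sketch the bulky tool output is folded into a summary before the new task begins, which is the kind of moment the summary describes; a real agent would make that call based on its own judgment rather than on a hard-coded sequence.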

5m read time · From blog.langchain.com
Table of contents
Motivation
When should we compact?
What happens when the tool is called?
How to use
Our experience with this feature
