Context engineering is the discipline of managing everything an LLM receives during inference: not just the prompt, but every token in the context window, including system instructions, conversation history, retrieved documents, tool outputs, and working state. Unlike prompt engineering, it is a systematic architectural practice, and reliable multi-step AI agents depend on it. It rests on four core operations: Write (store context externally), Select (retrieve only the context relevant to each step, via RAG), Compress (reduce token count while preserving signal), and Isolate (prevent context pollution across tasks). Infrastructure requirements include low-latency key-value access, vector search, hybrid retrieval (BM25 + semantic), real-time data ingestion, and semantic caching. Redis is presented as a unified platform covering all these primitives: vector search, in-memory short- and long-term agent memory, and semantic caching via Redis LangCache.
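To make the four operations concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `ContextStore`, `compress`, and `assemble_context` are hypothetical names, the store is an in-memory dictionary rather than Redis, word overlap stands in for real vector or BM25 retrieval, and a word count stands in for a token count. It is not the article's implementation, only one way the four ideas might fit together.

```python
from collections import defaultdict


class ContextStore:
    """Toy in-memory stand-in for an external context store (hypothetical)."""

    def __init__(self):
        # Isolate: each task gets its own namespace, so entries never mix.
        self._namespaces = defaultdict(list)

    def write(self, task_id, text):
        """Write: persist a piece of context outside the context window."""
        self._namespaces[task_id].append(text)

    def select(self, task_id, query, k=2):
        """Select: return up to k entries sharing the most words with the query.

        Word overlap is a crude placeholder for semantic or BM25 scoring.
        """
        query_words = set(query.lower().split())
        scored = []
        for text in self._namespaces[task_id]:
            overlap = len(query_words & set(text.lower().split()))
            if overlap > 0:
                scored.append((overlap, text))
        # Highest overlap first; the sort is stable, so ties keep write order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def compress(texts, max_words):
    """Compress: keep whole entries, in order, until the word budget is spent.

    Words are used here as a rough proxy for tokens.
    """
    kept, used = [], 0
    for text in texts:
        n = len(text.split())
        if used + n > max_words:
            break
        kept.append(text)
        used += n
    return kept


def assemble_context(store, task_id, query, k=2, max_words=20):
    """Build the context for one agent step from a single task's namespace."""
    return "\n".join(compress(store.select(task_id, query, k), max_words))


if __name__ == "__main__":
    store = ContextStore()
    store.write("billing", "Invoice 17 was refunded on Monday")
    store.write("billing", "The customer prefers email contact")
    store.write("support", "Invoice 17 login issue is resolved")
    print(assemble_context(store, "billing", "status of invoice 17"))
```

In the demo, only the refund note from the `billing` namespace is selected: the email-preference note shares no words with the query, and the `support` entry is never considered because it belongs to a different task. A production system would swap the overlap score for vector or hybrid search and the word budget for a real tokenizer.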
Table of contents
What is context engineering?
Why agents need context engineering to work
The four operations of context engineering
Infrastructure requirements for context assembly
Where Redis fits in the context engineering stack
Build your context engine on Redis