Standard RAG systems are stateless: they retrieve but never learn. This tutorial builds a knowledge reflection layer on top of a Cloudflare Workers RAG system that automatically synthesises insights after every document ingest. After ingesting a new document, the system finds semantically related existing documents, uses Kimi K2.5 to generate a three-sentence synthesis, and stores it as a retrievable artifact with a 1.5× ranking boost. Reflections are periodically consolidated into higher-level summaries. The result is a knowledge base that grows smarter with each addition, surfacing cross-document insights that no single chunk contains. The full stack uses Cloudflare Vectorize, D1, and Workers AI, deploys with a single command, and costs roughly $1–5/month at 10,000 queries/day.
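To make the 1.5× ranking boost concrete, here is a minimal sketch of how boosted re-ranking could work. The `SearchHit` shape, the `docType` field, and the `rankWithBoost` helper are illustrative assumptions, not the tutorial's actual schema or code; only the 1.5 factor comes from the description above.

```typescript
// Hypothetical shape of a search hit; field names are assumptions for illustration.
interface SearchHit {
  id: string;
  score: number; // raw similarity score returned by the vector index
  docType: "chunk" | "reflection";
}

// Boost factor stated in the introduction.
const REFLECTION_BOOST = 1.5;

// Multiply the score of every reflection by the boost,
// then sort all hits by adjusted score, highest first.
function rankWithBoost(hits: SearchHit[]): SearchHit[] {
  return hits
    .map((hit) => ({
      ...hit,
      score:
        hit.docType === "reflection" ? hit.score * REFLECTION_BOOST : hit.score,
    }))
    .sort((a, b) => b.score - a.score);
}
```

With a chunk scoring 0.8 and a reflection scoring 0.6, the reflection's adjusted score is about 0.9, so it ranks first even though its raw similarity is lower.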
Table of contents
What You Will Build
Prerequisites
How to Set Up the Base System
Why Standard RAG Has a Memory Problem
Step 1: Schema Update
Step 2: The Reflection Engine
Step 3: Consolidation
Step 4: Wire It Into Your Ingest Handler
Step 5: Boost Reflections in Search
Step 6: Filtering by doc_type
What Changes After You Build This
Deploying
What to Build Next