A practitioner shares lessons from integrating Retrieval-Augmented Generation (RAG) into legacy archiving software for government clients. Topics covered: choosing between Spring AI and LangChain4j (the speaker switched frameworks for better metadata support); selecting a lightweight, self-hostable vector database (Milvus, Elasticsearch); retrieval algorithms (exact k-nearest-neighbor search vs. approximate nearest neighbor, plus Maximum Marginal Relevance for result diversity); text quality and chunking strategies with tools such as Docling; data sovereignty via self-hosted LLMs (Llama); enforcing document-level access rights by storing document UUIDs as metadata and filtering on them before any LLM call; audit trails and source references to build trust; input and output guardrails; and cost observability.
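The access-rights idea in the summary (store each document's UUID as metadata, filter before the LLM call) can be illustrated with a minimal sketch. This is not the speaker's implementation: the `Chunk` type and the `filter_by_access` and `build_prompt` helpers are hypothetical names made up for illustration, and a real system would typically push the filter into the vector database query instead of filtering in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A retrieved text chunk (hypothetical type for this sketch)."""
    text: str
    document_uuid: str  # metadata stored next to the embedding


def filter_by_access(chunks, allowed_document_uuids):
    """Keep only chunks whose source document the user may read."""
    allowed = set(allowed_document_uuids)
    return [c for c in chunks if c.document_uuid in allowed]


def build_prompt(question, chunks):
    """Build the LLM prompt from already-filtered chunks.

    Each chunk is tagged with its document UUID so the answer can
    cite its sources (useful for references and audit trails).
    """
    context = "\n".join(f"[{c.document_uuid}] {c.text}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Assumed example data: two chunks from two different documents.
retrieved = [
    Chunk("Budget report for 2021.", "doc-a"),
    Chunk("Confidential personnel file.", "doc-b"),
]
visible = filter_by_access(retrieved, allowed_document_uuids=["doc-a"])
prompt = build_prompt("What was reported in 2021?", visible)
```

Because the filter runs before the prompt is assembled, text from a document the user cannot read never reaches the model, so it cannot leak into the answer.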
•14m watch time