Enterprise and government document systems hold terabytes of valuable unstructured information, yet most still rely on keyword and metadata search with little semantic context. Retrieval-Augmented Generation (RAG) promises a breakthrough, but tutorials rarely prepare you for regulated, large-scale environments.

In this talk, I’ll share lessons from building a RAG stack with Spring Boot, Elasticsearch, LangChain4j, Docker, and ActiveMQ, using both Azure OpenAI and Ollama. Expect concrete learnings on document chunking, enforcing access control, and keeping LLMs grounded in facts. Practical takeaways for anyone bringing RAG from demo to production.

Devoxx

A practitioner shares lessons from integrating Retrieval-Augmented Generation (RAG) into legacy archiving software for government clients. Key topics include: choosing between Spring AI and LangChain4j (the project switched to LangChain4j for better metadata support); selecting a lightweight, self-hostable vector database (Milvus, Elasticsearch); retrieval algorithms (exact k-nearest neighbors vs. approximate nearest neighbor search, and Maximal Marginal Relevance for diversity); text quality and chunking strategies using tools like Docling; data sovereignty via self-hosted LLMs (Llama); enforcing document-level access rights by storing document UUIDs as metadata and filtering before LLM calls; adding audit trails and references for trust; input/output guardrails; and cost observability.
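
As a rough illustration of the access-control idea (document UUIDs stored as chunk metadata, filtering before the LLM call), here is a minimal plain-Java sketch. It is not the speaker's implementation and uses no LangChain4j or Elasticsearch API. The `Chunk` record, the `filterByAccess` and `buildContext` helpers, and the set of readable document IDs are all hypothetical names introduced for this example. In a real system the readable IDs would come from the archive's own permission model.

```java
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

public class AccessFilterSketch {

    // Hypothetical chunk type: the text plus the UUID of the source document,
    // which is assumed to be stored as metadata next to the embedding.
    record Chunk(UUID documentId, String text) {}

    // Keep only chunks whose source document the current user may read.
    // This runs after retrieval and before anything is sent to the LLM.
    static List<Chunk> filterByAccess(List<Chunk> retrieved, Set<UUID> readableDocumentIds) {
        return retrieved.stream()
                .filter(chunk -> readableDocumentIds.contains(chunk.documentId()))
                .collect(Collectors.toList());
    }

    // Build the prompt context, prefixing each chunk with its document UUID
    // so an answer can cite its sources and the call can be audited later.
    static String buildContext(List<Chunk> allowedChunks) {
        return allowedChunks.stream()
                .map(chunk -> "[" + chunk.documentId() + "] " + chunk.text())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        UUID publicReport = UUID.randomUUID();
        UUID restrictedMemo = UUID.randomUUID();

        // Pretend these chunks came back from the vector search.
        List<Chunk> retrieved = List.of(
                new Chunk(publicReport, "Annual report, section 2."),
                new Chunk(restrictedMemo, "Internal memo, confidential."));

        // Assumption: the permission lookup says this user may read only the report.
        Set<UUID> readable = Set.of(publicReport);

        List<Chunk> allowed = filterByAccess(retrieved, readable);
        System.out.println(buildContext(allowed));
    }
}
```

Filtering after retrieval is the simplest form to show. Pushing the same UUID condition into the vector-store query as a metadata filter avoids retrieving chunks that will be discarded anyway, at the cost of depending on the store's filter syntax.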

RAG in the wild: real-world lessons from modernizing legacy systems by Susanne Pieterse