This conference talk transcript covers how to build a sovereign model-as-a-service (MaaS) platform for enterprise AI. Key themes include: abstracting LLM infrastructure behind an API gateway, using inference servers such as vLLM and distributed inference with llm-d, managing AI sovereignty and data residency concerns (including the implications of the US CLOUD Act), reducing developer cognitive load through platform engineering and scaffolding templates (Backstage), comparing RAG, fine-tuning, and agentic approaches, implementing a service mesh for security and advanced deployment patterns (traffic mirroring, canary releases), and applying guardrails to both input prompts and model outputs. Practical demos show a RAG chatbot bootstrapped from templates and input/output filtering with TrustyAI. The overarching principle is "keep your options open" to avoid vendor lock-in.

41m watch time
