Running LLMs on Kubernetes introduces a distinct threat model that Kubernetes itself cannot address. While the infrastructure handles scheduling and isolation, it has no awareness of what the workloads do with untrusted input. Four OWASP LLM Top 10 risks are particularly relevant for Kubernetes operators: prompt injection (LLM01), sensitive information disclosure (LLM02), supply chain risks (LLM03), and excessive agency (LLM06). Each maps to familiar infrastructure security patterns — input validation, output filtering, image provenance, and least-privilege access — but applied to probabilistic, language-based systems. The recommended approach is a dedicated policy layer (an LLM-aware API gateway) sitting in front of the model runtime, keeping inference and policy concerns separate. Tools like LiteLLM, Kong AI Gateway, Portkey, and kgateway are highlighted as options. A follow-up post will cover a reference implementation.
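As a rough illustration of the separation described above, the policy layer can be thought of as a thin wrapper around the model call. The sketch below is a minimal, hypothetical example and not how LiteLLM, Kong AI Gateway, Portkey, or kgateway work internally. The regex patterns, the `ALLOWED_TOOLS` allowlist, and every function name are assumptions made for illustration. Supply chain controls (LLM03) are left out because they belong to image and model provenance rather than to the request path.

```python
import re
from typing import Callable

# Hypothetical heuristics for illustration only; a real deployment would rely on
# a maintained ruleset or classifier rather than two regular expressions.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]

# Assumed secret shapes: an "sk-" style API key and an AWS-style access key ID.
SECRET_PATTERN = re.compile(r"\b(sk-[A-Za-z0-9]{16,}|AKIA[0-9A-Z]{16})\b")

# Least-privilege allowlist of tools the model may invoke (LLM06).
ALLOWED_TOOLS = {"search_docs"}


class PolicyViolation(Exception):
    """Raised when a request or tool call is rejected by the policy layer."""


def check_input(prompt: str) -> None:
    """Input validation (LLM01): reject prompts matching an injection heuristic."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            raise PolicyViolation("prompt matched an injection heuristic")


def redact_output(text: str) -> str:
    """Output filtering (LLM02): mask anything that looks like a credential."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


def check_tool(tool_name: str) -> None:
    """Least privilege (LLM06): only allowlisted tools may be called."""
    if tool_name not in ALLOWED_TOOLS:
        raise PolicyViolation(f"tool {tool_name!r} is not allowlisted")


def guarded_completion(prompt: str, model: Callable[[str], str]) -> str:
    """Apply policy before and after the model call; the model itself is unchanged."""
    check_input(prompt)
    return redact_output(model(prompt))
```

Here `model` stands in for whatever inference runtime sits behind the gateway. Passing it in as a plain callable keeps the policy checks independent of the runtime, which mirrors the idea of keeping inference and policy concerns separate.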
Table of contents
Understanding what you’re actually running
OWASP LLM Top 10: A framework for understanding risks
Four risks that Kubernetes operators need to understand
Where these controls belong
Choosing a policy layer