A hands-on red team experiment deploying OpenClaw on Red Hat OpenShift on IBM Cloud, using an abliterated Qwen3.5-35B model (zero refusals) to test infrastructure defenses across three hardening tiers. 91 adversarial prompts per tier were run using 15 custom garak probes across six attack categories. Key findings: SSH sandbox isolation eliminated credential exfiltration (50-67% → 0%), NetworkPolicy blocked Kubernetes API escalation (40% → 0%), and a prompt injection classifier (protectai/deberta-v3-base-prompt-injection-v2) stopped encoding-bypass and privilege-escalation prompts. However, persistence/memory poisoning attacks bypassed all three tiers, remaining an unsolved problem. The post also covers a subtle NetworkPolicy DNS egress misconfiguration (ClusterIP vs. pod label targeting) and seccomp RuntimeDefault vs. non-root tradeoffs when running sshd in a privileged container.

Table of contents
Tier 0: The baseline nobody should ship
Tier 1: The sandbox changes everything (almost)
Tier 2: Adding a prompt injection classifier
Seccomp RuntimeDefault vs. non-root: Same protection, different costs
What defense-in-depth actually looks like
Try it yourself