Critical remote code execution vulnerabilities were discovered across major AI inference frameworks including Meta's Llama Stack, Nvidia TensorRT-LLM, vLLM, and SGLang. The flaws originated from unsafe use of ZeroMQ and Python's pickle deserialization in Meta's code, then spread to other projects through copy-paste development practices. Attackers could exploit these vulnerabilities to execute arbitrary code on GPU clusters, exfiltrate data, or compromise AI infrastructure. All affected vendors have released patches, and organizations are advised to upgrade immediately and implement authentication measures for ZeroMQ communications.

3m read timeFrom infoworld.com
Post cover image

Sort: