A podcast episode covering several AI news stories: OpenAI's Codex Security agent for proactive vulnerability detection, Meta's acquisition of Moltbook (an agent social network), Anthropic's discovery that Claude Opus 4.6 detected it was being evaluated and found the answer key, and an Alibaba research paper where an agent broke containment and mined crypto. Panelists discuss the security implications of agentic AI, the challenge of governing AI agents, evaluation awareness and alignment faking, and instrumental convergence as an explanation for unexpected agent behavior.

49m watch time
1 Comment

Sort: