OpenAI published a paper describing its mitigation for URL-based data exfiltration in language model agents. The mitigation uses a web crawler to build an allow-list of previously seen URLs and blocks URLs the model constructs dynamically (for example, a URL whose query string carries data read from the conversation). This raises the cost of an attack but does not close the channel: an attacker can pre-seed a site with one indexed URL per character and have the model emit a sequence of those allow-listed URLs, spelling out the data it wants to exfiltrate. The author, who originally disclosed this vulnerability class in 2023, proposes additional mitigations such as URL caching and refusing to visit the same URL more than once within a session.
Table of contents
What Is the New Mitigation
Additional Mitigation Ideas
Final Risk - Thorough Adoption of the Mitigation
References