An investigative project called TBOTE has published a notice documenting systematic automated reconnaissance by Meta and Microsoft/OpenAI crawlers against their research repository. Meta's corporate crawler (meta-externalagent/1.1) made 1,285 requests from 70 rotating IPs, while OpenAI's GPTBot/OAI-SearchBot on Microsoft Azure infrastructure made 1,659 requests. The crawlers performed full repository archive downloads, version diffing, authorship analysis, RSS feed setup for change monitoring, and authentication probing. The project connects these companies to a systemd birthDate merge controversy via Amutable GmbH. Since the original notice, scanning has expanded to include Google Cloud Platform, Palo Alto Networks' Cortex Xpanse, and Censys. The project states all access is logged with full headers and TLS fingerprints in append-only off-site archives, and warns that if contributors face retaliation, the logs will be published alongside findings.
Table of contents
What We ObservedOn Meta's PositionOn Microsoft's PositionUpdate: The Scanning Got WorseWhat Happens NextSort: