Google has released Magika 1.0, an AI-powered file type detection system whose engine has been completely rewritten in Rust. The stable release doubles file type support to more than 200 formats, adding specialized types for data science, modern programming languages, and DevOps configurations. The new Rust engine, built on ONNX Runtime and Tokio, processes hundreds of files per second on a single core. To overcome training challenges, the team used SedPack to handle 3TB datasets and Gemini to generate synthetic samples of rare file types. Magika is available as a native CLI tool and as a library for Python, TypeScript, and Rust.