A conference talk by Prince Canuma (Arcee) on MLX, Apple's array framework for on-device AI on Apple Silicon. The speaker demonstrates running large vision and language models (including Gemma 4) entirely on MacBooks and iPhones without an internet connection, using MLX VLM for real-time image analysis, MLX Audio for text-to-speech and speech recognition, and a modular speech pipeline. Highlights include 1.5M downloads and 4,000+ ported models, TurboQuant enabling 1M context windows on-device, and community projects in robotics, video generation, and native Swift apps. The talk is motivated by a personal story about building accessibility tools for a blind family member in a low-connectivity region.
23m watch time