Google DeepMind has launched Gemma 4, a family of open models (Apache 2.0) designed for on-device agentic AI workflows. Key capabilities include multi-step planning, autonomous action, offline code generation, audio-visual processing, and support for 140+ languages — all without specialized fine-tuning. Developers can access Gemma 4 via Android's AICore Developer Preview or Google AI Edge. The new Agent Skills feature in Google AI Edge Gallery enables fully on-device autonomous workflows. LiteRT-LM, the underlying inference runtime, delivers strong performance: processing 4,000 tokens in under 3 seconds on mobile GPUs, and achieving 133 tokens/sec prefill on a Raspberry Pi 5. A new Python package and CLI tool (available on Linux, macOS, and Raspberry Pi) make it easy to experiment with Gemma 4 without writing code, including tool-calling support for agentic use cases.
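To put the two LiteRT-LM figures on the same scale, the short sketch below converts them into comparable numbers. It assumes a constant processing rate and reads "under 3 seconds" as an upper bound, so the mobile GPU result is a lower bound on throughput rather than a measured value. The function and variable names are illustrative and are not part of LiteRT-LM or any Gemma tooling.

```python
# Back-of-the-envelope comparison of the throughput figures quoted above.
# Assumption: processing rate is constant over the whole prompt.


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average rate implied by processing `tokens` in `seconds`."""
    return tokens / seconds


def seconds_for(tokens: int, rate: float) -> float:
    """Time needed to process `tokens` at `rate` tokens per second."""
    return tokens / rate


# "4,000 tokens in under 3 seconds" on mobile GPUs: 3 s is an upper bound
# on time, so this is a lower bound on the rate.
mobile_gpu_floor = tokens_per_second(4000, 3.0)

# "133 tokens/sec prefill" on a Raspberry Pi 5: time for the same 4,000 tokens.
pi5_seconds = seconds_for(4000, 133.0)

print(f"Mobile GPU: at least {mobile_gpu_floor:.0f} tokens/sec")
print(f"Raspberry Pi 5: about {pi5_seconds:.1f} s to prefill 4,000 tokens")
```

Under these assumptions the mobile GPU figure implies at least roughly 1,333 tokens/sec, while a Raspberry Pi 5 would need about 30 seconds to prefill the same 4,000-token prompt.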