Bring state-of-the-art agentic skills to the edge with Gemma 4

Google DeepMind has launched Gemma 4, a family of open models released under the Apache 2.0 license and designed for on-device agentic AI workflows. Key capabilities include multi-step planning, autonomous action, offline code generation, audio-visual processing, and support for 140+ languages, all without specialized fine-tuning. Developers can access Gemma 4 through Android's AICore Developer Preview or Google AI Edge. The new Agent Skills feature in Google AI Edge Gallery enables fully on-device autonomous workflows. LiteRT-LM, the underlying inference runtime, processes 4,000 tokens in under 3 seconds on mobile GPUs and reaches a prefill speed of 133 tokens/sec on a Raspberry Pi 5. A new Python package and CLI tool (available on Linux, macOS, and Raspberry Pi) make it possible to experiment with Gemma 4 without writing code, and include tool-calling support for agentic use cases.
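To put the two throughput figures quoted above on a common scale, here is a rough back-of-the-envelope sketch. It assumes the 4,000-token mobile GPU figure measures prompt processing comparable to prefill, which the summary does not state outright, and it treats "under 3 seconds" as exactly 3 seconds, so the mobile rate is a lower bound. The function names are illustrative only and are not part of LiteRT-LM.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average throughput for processing `tokens` in `seconds`."""
    return tokens / seconds


def seconds_to_process(tokens: int, rate: float) -> float:
    """Time needed to process `tokens` at `rate` tokens per second."""
    return tokens / rate


# Figures quoted in the summary.
PROMPT_TOKENS = 4000
MOBILE_GPU_SECONDS = 3.0   # "under 3 seconds", so 3.0 is a worst case
PI5_PREFILL_RATE = 133.0   # tokens/sec prefill on a Raspberry Pi 5

# Lower bound on mobile GPU throughput implied by the quoted figure.
mobile_gpu_rate = tokens_per_second(PROMPT_TOKENS, MOBILE_GPU_SECONDS)

# Time the same 4,000-token prompt would take at the Pi 5 prefill rate
# (assumption: the rate holds steady across the whole prompt).
pi5_seconds = seconds_to_process(PROMPT_TOKENS, PI5_PREFILL_RATE)

print(f"Mobile GPU: at least {mobile_gpu_rate:.0f} tokens/sec")
print(f"Raspberry Pi 5: about {pi5_seconds:.0f} s for {PROMPT_TOKENS} tokens")
```

Under these assumptions the mobile GPU figure implies at least roughly 1,300 tokens/sec, about ten times the Raspberry Pi 5 prefill rate, so the same prompt would take on the order of half a minute on the Pi.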

#edge-computing

#agentic-ai

#gemma

Apr 02 • 4m read time • From developers.googleblog.com

Table of contents

Leverage Gemma 4 across devices with LiteRT-LM

Run on any device
