Learn how to deploy and experiment with Gemma 4, the latest open model family from Google DeepMind. This guide covers text, image, and video input, Mixture-of-Experts architecture, and more. Get started with Red Hat AI Inference Server today.

Rhdev is a blog and resource hub dedicated to Ruby on Rails development, a popular web application framework written in Ruby. Developers can explore tutorials, best practices, and case studies for building web applications with Ruby on Rails. Additionally, Rhdev covers topics such as ActiveRecord ORM, RESTful APIs, and frontend integration using JavaScript frameworks, offering insights for both beginners and experienced Rails developers.

Red Hat Developer

Gemma 4, Google DeepMind's latest open model family, is available for immediate deployment via vLLM and Red Hat AI Inference Server. The family spans four models (2B to 31B parameters), all supporting multimodal input (text, image, video), with the two smallest also handling audio. The 26B A4B model uses a Mixture-of-Experts architecture, activating only 3.8B parameters per forward pass for efficient inference. All models support thinking mode, native function calling, long context windows (128K–256K tokens), and 140+ languages under Apache 2.0. The guide provides step-by-step instructions for deploying the 26B A4B model using Podman and Red Hat AI Inference Server, including examples for chat, reasoning, multimodal, and function calling via the OpenAI-compatible API.

Run Gemma 4 with Red Hat AI on Day 0: A step-by-step guide

Get started using Red Hat AI Inference Server