Discover how Red Hat OpenShift AI 3.4's Models-as-a-Service (MaaS) capability streamlines AI inference by acting as an integrated AI gateway within the platform, providing centralized governance and routing requests to both self-hosted models and external providers.

Rhdev is a blog and resource hub dedicated to Ruby on Rails development, a popular web application framework written in Ruby. Developers can explore tutorials, best practices, and case studies for building web applications with Ruby on Rails. Additionally, Rhdev covers topics such as ActiveRecord ORM, RESTful APIs, and frontend integration using JavaScript frameworks, offering insights for both beginners and experienced Rails developers.

Red Hat Developer

When AI applications call model provider APIs directly, switching models requires code changes and creates tight coupling. An LLM gateway solves this by acting as a unified routing layer between applications and multiple providers. Using Red Hat OpenShift AI 3.4's Models-as-a-Service (MaaS) capability alongside LiteLLM Proxy, teams can expose a single OpenAI-compatible endpoint that routes requests to OpenAI, Gemini, or self-hosted Llama models running via vLLM on OpenShift. The tutorial covers deploying LiteLLM as a Kubernetes deployment with a ConfigMap-based model list, deploying a Llama-3.1-8B model using KServe InferenceService and vLLM ServingRuntime, and testing all three backends through the same API endpoint with simple curl commands.

How to route external and local LLMs with Models-as-a-Service

The challenge of switching between model providers

Deploy the self-hosted model on OpenShift AI