How BMW is testing SLMs — not LLMs — for in-vehicle voice commands.

BMW Group and Google Cloud completed a proof of concept for deploying small language models (SLMs) in vehicles for voice commands. Unlike cloud-dependent LLMs, SLMs can run on-device, avoiding network latency issues. The team built an automated pipeline on Vertex AI to handle the full workflow: model compression (quantization, pruning, knowledge distillation), quality enhancement (LoRA fine-tuning, RL methods like DPO and GRPO), and rigorous evaluation using ROUGE/BLEU metrics and LLM-as-a-judge approaches. The pipeline tests models against BMW's 'Head unit in the cloud' — an AOSP-based infotainment system running natively on cloud compute instances — enabling scalable testing without physical hardware. Source code is published on GitHub.
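As a rough illustration of the ROUGE-style evaluation mentioned above, the sketch below computes a ROUGE-1 F1 score (unigram overlap) between a model's output and a reference answer. It is a minimal standalone approximation, not the code from the published pipeline: the function name `rouge1_f1`, the whitespace tokenization, and the example voice command are all assumptions made for illustration.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (ROUGE-1) between a model output and a reference."""
    # Simplifying assumption: lowercase and split on whitespace, no stemming.
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical voice-command pair (not taken from the source article).
reference = "set cabin temperature to 21 degrees"
candidate = "set the cabin temperature to 21"
score = rouge1_f1(candidate, reference)
print(f"ROUGE-1 F1: {score:.3f}")
```

Overlap metrics like this reward matching wording rather than correct intent, which is presumably one reason the pipeline pairs them with LLM-as-a-judge scoring.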

#deep-learning #vertex-ai #edge-ai

Mar 04 • 11m read time • From cloud.google.com

Table of contents

Small language models: small concept, big potential
Challenges of Integrating foundation models into vehicles
Converting LLMs to SLMs
Post-Compression Quality Enhancement
Evaluating Performance for Automotive Tasks
The Challenge of Finding the Optimal Configuration
Solution: An Automated Workflow for SLM Optimization
Implementation: An Automated Workflow with Vertex AI Pipelines
