Vespa now supports ONNX models that exceed the 2 GB protobuf limit and therefore keep their weights in external data files. Starting with Vespa 8.544, embedders automatically detect and download the external data files that a model references. This applies to models referenced by URL and to Vespa Model Hub models (e.g., EmbeddingGemma 300M, Multilingual-E5-large). Three limitations apply: external data is supported only for embedders, not for ranking expressions; only models referenced by URL or model-id are supported, not models bundled in the application package; and external data files must be co-located with the .onnx file.
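
The co-location requirement can be pictured with a small sketch. This is not Vespa's implementation; it only illustrates, under stated assumptions, how a data file name referenced by a model could be resolved against the directory of the model's URL. The function name `external_data_url`, the example host, and the file name `model.onnx_data` are hypothetical.

```python
from urllib.parse import urljoin


def external_data_url(model_url: str, location: str) -> str:
    """Resolve an external data file name next to the .onnx file.

    Hypothetical helper: `location` is assumed to be the relative file
    name that the model records for its external weights.
    """
    # Co-located means "same directory", so reject anything that is not
    # a plain file name (no sub-directories, no parent traversal).
    if not location or "/" in location or "\\" in location or location in (".", ".."):
        raise ValueError(f"external data file must sit next to the model: {location!r}")
    # urljoin replaces the last path segment of the model URL with `location`.
    return urljoin(model_url, location)


if __name__ == "__main__":
    # Example URL is made up for illustration.
    print(external_data_url("https://example.com/models/big/model.onnx", "model.onnx_data"))
```

Rejecting anything other than a bare file name is a deliberately strict reading of "co-located"; a real resolver may accept more than this sketch does.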

3m read time. From blog.vespa.ai
Table of contents
The 2 GB limitation
What changed
How to use it
Current limitations
