From 48 Seconds to 130 Milliseconds: Vector Search in Tinybird

A customer with 19.35 million social media posts and 768-dimensional embeddings needed fast semantic similarity search without adding a dedicated vector database. Running vector search in Tinybird (powered by ClickHouse) initially produced 15–48 second query times and frequent timeouts. Three key insights drove a ~1,800x improvement: (1) eliminating 58 fragmented HNSW graphs by removing the monthly partition key to create a single global graph; (2) increasing the vector similarity index cache from the default 5 GB to 40 GB so the entire 35 GB index stays resident in RAM, dropping latency from 170 seconds to 26 ms; (3) discovering that HNSW traversal time is nearly constant regardless of top-K, enabling retrieval of 1,000 results in under 200 ms. The final result: stable 80–200 ms queries without a dedicated vector search service.
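The figures in the summary can be sanity-checked with a little arithmetic. The sketch below is a back-of-the-envelope check, not the article's method: the helper names are hypothetical, gigabytes are taken as decimal (10^9 bytes), and the bytes-per-dimension values are assumptions, since the summary does not say which quantization the index used. It shows the raw vector footprint at a few candidate widths, why a 35 GB index cannot stay resident in a 5 GB cache but can in a 40 GB one, and which pair of latencies yields a ratio near 1,800x.

```python
# Back-of-the-envelope check of the numbers quoted in the summary.
# Helper names and the bytes-per-dimension candidates are illustrative
# assumptions; only the constants below come from the text.

GB = 10**9  # assumption: decimal gigabytes

NUM_VECTORS = 19_350_000   # posts, from the summary
DIMENSIONS = 768           # embedding width, from the summary
INDEX_SIZE_GB = 35         # reported index size
DEFAULT_CACHE_GB = 5       # reported default cache size
TUNED_CACHE_GB = 40        # reported cache size after tuning


def raw_vector_gb(num_vectors: int, dims: int, bytes_per_dim: int) -> float:
    """Size of the raw vectors alone, excluding any HNSW graph overhead."""
    return num_vectors * dims * bytes_per_dim / GB


def fits_in_cache(index_gb: float, cache_gb: float) -> bool:
    """True if the whole index can stay resident in the cache."""
    return index_gb <= cache_gb


def speedup(before_seconds: float, after_seconds: float) -> float:
    """Ratio of the old latency to the new latency."""
    return before_seconds / after_seconds


if __name__ == "__main__":
    # Candidate element widths (assumed, not stated in the summary).
    for label, width in [("4-byte", 4), ("2-byte", 2), ("1-byte", 1)]:
        size = raw_vector_gb(NUM_VECTORS, DIMENSIONS, width)
        print(f"{label} elements: {size:.2f} GB of raw vectors")

    print("fits in default cache:", fits_in_cache(INDEX_SIZE_GB, DEFAULT_CACHE_GB))
    print("fits in tuned cache:  ", fits_in_cache(INDEX_SIZE_GB, TUNED_CACHE_GB))

    # Two possible readings of the headline improvement.
    print(f"48 s -> 26 ms:  {speedup(48, 0.026):.0f}x")
    print(f"48 s -> 130 ms: {speedup(48, 0.130):.0f}x")
```

On this reading, the ~1,800x figure corresponds to the worst initial query time (48 seconds) against the 26 ms warm-cache latency, while the 48 seconds to 130 milliseconds in the title works out to a smaller ratio.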

#vector-search #clickhouse #tinybird

May 12 • 9m read time • From tinybird.co

Table of contents

What the customer needed
How vector similarity indexes work
The journey: three insights that changed everything
The results
What this means for you
Coming in Part 2: making it work in production
