A practical walkthrough of adding semantic search to a Rails blog using sqlite-vec (a lightweight SQLite vector extension), a self-hosted embedding service running Google's EmbeddingGemma-300m model in Docker, and Kamal for deployment. The author benchmarks three embedding models (EmbeddingGemma-300m, multilingual-E5-base, E5-large) on memory and latency, settles on Gemma for its balance of quality and resource usage, and describes the full Rails architecture: a Python/sentence_transformers microservice exposing a /embed endpoint, ArticleEmbedding upsert flow, and kNN search via sqlite-vec. The setup complements existing BM25/FTS5 full-text search rather than replacing it, with RRF hybrid fusion planned next.
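
The planned RRF (Reciprocal Rank Fusion) step could look roughly like the sketch below. It is not the author's implementation: the function name `rrf_fuse`, the sample article ids, and the constant `k = 60` (a commonly used default for RRF) are all assumptions for illustration. The idea is that each result list, one from BM25/FTS5 and one from sqlite-vec kNN, contributes `1 / (k + rank)` per article, and articles are re-sorted by the summed score.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several best-first lists of ids using Reciprocal Rank Fusion.

    `rankings` is a list of ranked id lists; `k` dampens the influence of
    top ranks (60 is a conventional default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list adds 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical article ids, best match first in each list.
bm25_ids = [3, 1, 7]    # e.g. from the existing FTS5/BM25 search
vector_ids = [1, 9, 3]  # e.g. from the sqlite-vec kNN search

fused = rrf_fuse([bm25_ids, vector_ids])
print(fused)
```

Because RRF only looks at rank positions, it does not need BM25 scores and vector distances to be on comparable scales, which is one reason it is a common choice for hybrid search.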