Amazon OpenSearch Service now supports backing knn_vector fields with Amazon S3 Vectors via a new s3vector engine, enabling cost-efficient hybrid search at scale. The tradeoff is higher query latency (50-300 ms versus 5-60 ms for hot HNSW) in exchange for dramatically lower vector storage costs: up to 90% cheaper for large datasets. The engine must be chosen at index creation time and cannot be changed afterward; once the index exists, standard OpenSearch hybrid queries combining BM25 lexical scoring and k-NN vector search work transparently. Key limitations include a maximum k of 100, post-filtering only, no snapshot support, and no recall-tuning knobs. The recommended production pattern is tiered: keep cold embeddings in S3 Vectors queried through OpenSearch for agentic/RAG workloads, and promote hot subsets to OpenSearch Serverless HNSW for latency-sensitive traffic.
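To make the setup concrete, here is a minimal sketch of the two request bodies involved. The s3vector engine name comes from the announcement above; the surrounding mapping fields and the hybrid query shape follow the standard OpenSearch k-NN and hybrid-search APIs, and the field names (title, embedding), dimension, and example values are illustrative assumptions, not confirmed parameters for S3 Vectors-backed indexes.

```python
# Index mapping: a knn_vector field backed by the s3vector engine.
# Field names, dimension, and settings here are illustrative assumptions.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 768,  # must match your embedding model
                "method": {"engine": "s3vector"},
            },
        }
    },
}

# Hybrid query combining BM25 lexical scoring with k-NN vector search.
# Note the k cap of 100 for s3vector-backed fields.
hybrid_query = {
    "query": {
        "hybrid": {
            "queries": [
                {"match": {"title": "cost-efficient vector search"}},
                {"knn": {"embedding": {"vector": [0.1] * 768, "k": 100}}},
            ]
        }
    }
}
```

These bodies would be sent with the usual index-creation and search APIs (for example via opensearch-py or the REST endpoints); the hybrid query additionally requires a search pipeline with a score-normalization processor, as in any OpenSearch hybrid-search setup.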
Table of contents
Why hybrid search, and why this matters for cost
The cost-latency tradeoff in practice
Setting up OpenSearch with S3 Vectors
Limitations to plan around
Promoting hot data to OpenSearch Serverless
Key takeaways