Google Cloud has launched the Vertex AI Ranking API, a semantic reranking service that improves search relevance and the quality of retrieval-augmented generation (RAG) systems. The API offers two models: semantic-ranker-default-004, optimized for accuracy, and semantic-ranker-fast-004, optimized for speed. Google reports state-of-the-art results for both on BEIR benchmarks. The API accepts up to 200k tokens per request and integrates with existing search systems, RAG engines, AlloyDB, and popular AI frameworks, so teams can add reranking to an existing pipeline in minutes rather than months.
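To make the reranking step concrete, here is a minimal Python sketch of how a client might prepare a rerank call and apply the returned scores. It sends nothing over the network. The endpoint path, the JSON field names (`model`, `query`, `records`, `topN`, `id`, `content`, `score`), and the helper names `build_rank_request` and `reorder` are assumptions for illustration, not details taken from the announcement. Check them against the current API reference before relying on them.

```python
import json

# Assumed endpoint shape for the rank method; verify against the current docs.
RANK_URL_TEMPLATE = (
    "https://discoveryengine.googleapis.com/v1/projects/{project}"
    "/locations/global/rankingConfigs/default_ranking_config:rank"
)


def build_rank_request(project, query, passages,
                       model="semantic-ranker-default-004", top_n=3):
    """Return (url, json_body) for a rerank call. Nothing is sent."""
    # Each candidate passage becomes one record, keyed by its list position.
    records = [{"id": str(i), "content": text}
               for i, text in enumerate(passages)]
    body = {"model": model, "query": query, "records": records, "topN": top_n}
    return RANK_URL_TEMPLATE.format(project=project), json.dumps(body)


def reorder(passages, ranked_records):
    """Reorder passages by descending score.

    `ranked_records` is a list of {"id": ..., "score": ...} dicts, the
    assumed shape of the records in a rank response.
    """
    ordered = sorted(ranked_records, key=lambda r: r["score"], reverse=True)
    return [passages[int(r["id"])] for r in ordered]
```

In a RAG pipeline, `passages` would be the candidates returned by a first-stage retriever, and the output of `reorder` would be the trimmed list passed to the generator. Swapping `model` to `semantic-ranker-fast-004` trades some accuracy for speed, per the description above.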