Meta Engineering introduces SilverTorch, a unified model-based retrieval system for recommendation that replaces a microservice mesh with a single PyTorch neural network. Under the 'Index as Model' paradigm, every retrieval component — ANN search, eligibility filtering, neural reranking, and multi-task scoring — becomes a tensor or nn.Module inside one model. Key innovations include a Bloom index filter (291–523× faster than CPU inverted index), fused Int8 ANN search (2.2–14.7× faster than FAISS-GPU), and probe-then-filter co-design. In an 80M-item production evaluation, SilverTorch achieves 23.7× higher throughput and 20.9× better compute cost efficiency versus a CPU baseline, while enabling neural reranking and multi-task scoring within sub-100ms latency. The architecture also supports streaming index updates for near-real-time content freshness and provides a natural integration point for LLMs as additional model modules.
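To make the "Index as Model" idea concrete, here is a minimal PyTorch sketch under stated assumptions; it is not SilverTorch's implementation. The class name `TinyRetrievalModel`, the per-item symmetric Int8 quantization, and the reranker shape are all hypothetical. It also simplifies two things. Scoring is a brute-force dot product rather than a fused ANN kernel, and the eligibility filter uses exact per-item bit signatures as a stand-in for a Bloom index, with no hashing and no false positives. The point is only that the index, the filter, and the reranker all live as buffers or submodules of one `nn.Module`.

```python
import torch
import torch.nn as nn


class TinyRetrievalModel(nn.Module):
    """Toy sketch: embedding index, eligibility filter, and reranker in one module."""

    def __init__(self, item_emb: torch.Tensor, item_attr_bits: torch.Tensor):
        super().__init__()
        # Assumed scheme: symmetric per-item Int8 quantization of the embedding index.
        scale = item_emb.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
        self.register_buffer("item_q", torch.round(item_emb / scale).to(torch.int8))
        self.register_buffer("item_scale", scale.squeeze(1))
        # One int64 bit signature per item (exact bitmask, standing in for a Bloom index).
        self.register_buffer("item_bits", item_attr_bits)
        # Placeholder reranker over concatenated [user, item] vectors.
        dim = item_emb.shape[1]
        self.rerank = nn.Sequential(nn.Linear(2 * dim, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, user: torch.Tensor, required_bits: int, k: int):
        # Stage 1: score every item with a dequantized dot product (brute force, not ANN).
        scores = (self.item_q.float() @ user) * self.item_scale
        # Stage 2: an item is eligible only if every required bit is set in its signature.
        eligible = (self.item_bits & required_bits) == required_bits
        scores = scores.masked_fill(~eligible, float("-inf"))
        # Stage 3: keep the top-k eligible items (never more than are eligible).
        k = min(k, int(eligible.sum()))
        _, top_ids = scores.topk(k)
        # Stage 4: rerank the survivors with the small neural network.
        cand = self.item_q[top_ids].float() * self.item_scale[top_ids].unsqueeze(1)
        pairs = torch.cat([user.expand(k, -1), cand], dim=1)
        rerank_scores = self.rerank(pairs).squeeze(1)
        order = rerank_scores.argsort(descending=True)
        return top_ids[order], rerank_scores[order]


torch.manual_seed(0)
items = torch.randn(8, 4)  # 8 toy items with 4-dimensional embeddings
bits = torch.tensor([0b01, 0b11, 0b10, 0b11, 0b00, 0b01, 0b11, 0b10])
model = TinyRetrievalModel(items, bits).eval()
with torch.no_grad():
    ids, scores = model(torch.randn(4), required_bits=0b11, k=2)
```

With `required_bits=0b11`, only items 1, 3, and 6 can be returned, so `ids` holds two of those indices ordered by reranker score. Because the index tensors are registered buffers, they move with the model on `model.to(device)` and are saved with its `state_dict`, which is one reading of what treating the index as part of the model buys.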
Table of contents
Moving From Microservice Mesh to One Integrated Neural Network
The Redesign: Pure PyTorch Modules for Every Stage
Benefits — What Shows Up Outside the System
Engineering for Scale and Freshness
The Evolution of SilverTorch and What’s Next
Looking Ahead
Read the Paper
Acknowledgments