Lyft built LyftLearn Serving, an ML serving platform that handles millions of predictions per second on a microservices architecture. Rather than running one shared monolithic system, the platform generates an independent microservice for each team from configuration templates. It separates data plane concerns (runtime performance, inference execution) from control plane concerns (deployment, versioning, testing). Key features include automated model self-tests, support for multiple ML libraries (such as TensorFlow and PyTorch), and separate interfaces for engineers and data scientists. The runtime uses Flask and Gunicorn for HTTP serving, Kubernetes for orchestration, and Envoy for load balancing. More than 40 teams migrated from the legacy system, gaining team autonomy while keeping platform-wide consistency.
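To make the idea of a model self-test concrete, here is a minimal sketch in plain Python. It is not Lyft's implementation: the names `SelfTestCase`, `SelfTestError`, `run_self_tests`, `load_model`, and the toy fare model are all hypothetical, and running the checks when the model is loaded is an assumption made for illustration. The sketch only shows the general pattern of shipping a model together with a few input/expected-output pairs and refusing to serve it if any pair fails.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Predictor = Callable[[Dict[str, float]], Any]


@dataclass(frozen=True)
class SelfTestCase:
    """One input paired with the prediction the model is expected to return."""
    features: Dict[str, float]
    expected: Any


class SelfTestError(RuntimeError):
    """Raised when a model's prediction disagrees with its own test data."""


def run_self_tests(predict: Predictor, cases: List[SelfTestCase],
                   tolerance: float = 1e-6) -> int:
    """Run every case through `predict`; raise on the first mismatch.

    Returns the number of cases that were checked.
    """
    for index, case in enumerate(cases):
        actual = predict(case.features)
        if isinstance(case.expected, float):
            # Compare floats with a tolerance rather than exact equality.
            passed = abs(actual - case.expected) <= tolerance
        else:
            passed = actual == case.expected
        if not passed:
            raise SelfTestError(
                f"case {index}: expected {case.expected!r}, got {actual!r}"
            )
    return len(cases)


def load_model(predict: Predictor, cases: List[SelfTestCase]) -> Predictor:
    """Hypothetical load hook: only hand back a model that passes its tests."""
    run_self_tests(predict, cases)
    return predict


# Toy stand-in for a trained model (assumption: a simple linear fare estimate).
def toy_fare_model(features: Dict[str, float]) -> float:
    return 2.0 + 1.5 * features["distance_km"]


CASES = [
    SelfTestCase(features={"distance_km": 0.0}, expected=2.0),
    SelfTestCase(features={"distance_km": 4.0}, expected=8.0),
]

if __name__ == "__main__":
    model = load_model(toy_fare_model, CASES)
    print(model({"distance_km": 10.0}))
```

In a setup like this, the same `run_self_tests` call could also run in a CI job, so a model whose behavior drifts from its recorded expectations is caught before it reaches a serving process.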
Table of contents
Two Planes of Complexity
The Requirements Problem
The Microservices Solution
The Runtime Architecture
The Configuration Generator
Model Self-Tests
How an Inference Request Flows Through the System
Development Workflow and Documentation
Conclusion