Uber uses a mix of open-source and closed-source large language models (LLMs) to power applications such as Uber Eats recommendations, customer support chatbots, and code development. Its training infrastructure combines PyTorch, Ray, DeepSpeed, and Kubernetes for distributed training on both on-premises and cloud-based NVIDIA GPUs. Through continued pre-training and fine-tuning, Uber adapts models to handle large-scale traffic efficiently, achieving performance comparable to industry-leading models such as GPT-4.
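To make the fine-tuning setup above concrete, the sketch below shows a minimal DeepSpeed configuration of the kind typically passed to a PyTorch training job. All values here are illustrative assumptions, not Uber's actual settings:

```json
{
  "train_batch_size": 256,
  "gradient_accumulation_steps": 8,
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" }
  }
}
```

A training script would hand this file to `deepspeed.initialize(...)`, which wraps the PyTorch model so that optimizer state is partitioned (ZeRO stage 2) across the GPU workers that Ray or Kubernetes schedules.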
Table of contents
- Infrastructure Stack
- Training Stack
- Distributed Training Pipeline
- Training Results
- Acknowledgements