Toto 2.0: Time series forecasting enters the scaling era

Datadog releases Toto 2.0, a family of five open-weights time series forecasting models ranging from 4M to 2.5B parameters. For the first time in the field, a time series foundation model (TSFM) demonstrates reliable improvement with scale: every size outperforms the one below it, with no saturation at 2.5B. Toto 2.0 achieves state-of-the-art results on the BOOM, GIFT-Eval, and TIME benchmarks despite being trained only on observability and synthetic data, with no public forecasting datasets. It is also 7× more parameter-efficient than Toto 1.0 and dramatically faster at inference thanks to contiguous patch masking (CPM). The post also discusses the open challenges that remain: closing the long-horizon gap with classical baselines, principled data curation, treating observability metrics as a distinct modality, and building multimodal world models for distributed systems. All model weights and the distributed u-μP training library are released under Apache 2.0.

#ai

#machine-learning

#observability

#time-series-forecasting

May 14 • 11m read time • From datadoghq.com

Table of contents

Results

What’s next for TSFMs?

Release

Quick start
