Alibaba's Tongyi Lab introduces Tongyi DeepResearch, the first fully open-source web agent matching OpenAI's DeepResearch performance. The 30B-parameter model achieves state-of-the-art results across multiple benchmarks through a complete training pipeline combining agentic continual pre-training, supervised fine-tuning, and on-policy reinforcement learning. The system uses fully synthetic data generation, including a novel knowledge graph-based approach and automated PhD-level question creation. It supports multiple inference modes: native ReAct for straightforward reasoning and Heavy Mode (IterResearch) for complex multi-step research tasks. The model is already deployed in production applications including Gaode's navigation assistant and legal research tools, with all code and models released publicly.
Table of contents
From Chatbot to Autonomous Agent
Continual Pre-training and Post-training Empowered by Fully Synthetic Data
Rollout Mode
End-to-End Agent Training Pipeline
Real-World Applications and Impact
Limitations
Series Work