Alibaba's Tongyi Lab introduces Tongyi DeepResearch, the first fully open-source web agent matching OpenAI's DeepResearch performance. The 30B-parameter model achieves state-of-the-art results across multiple benchmarks through a complete training pipeline combining agentic continual pre-training, supervised fine-tuning, and on-policy reinforcement learning. The system uses fully synthetic data generation, including a novel knowledge graph-based approach and automated PhD-level question creation. It supports multiple inference modes: native ReAct for straightforward reasoning and Heavy Mode (IterResearch) for complex multi-step research tasks. The model is already deployed in production applications including Gaode's navigation assistant and legal research tools, with all code and models released publicly.

13 min read. From tongyi-agent.github.io
Table of contents
From Chatbot to Autonomous Agent
Continual Pre-training and Post-training Empowered by Fully Synthetic Data
Rollout Mode
End-to-End Agent Training Pipeline
Real-World Applications and Impact
Limitations
Series Work
