German consulting firm TNG Technology Consulting has released DeepSeek-TNG R1T2 Chimera, a variant of DeepSeek's R1-0528 model that reportedly delivers about 90% of the original's benchmark performance while generating responses with roughly 60% fewer output tokens. Because responses are shorter, inference is faster (billed as 200% faster) and compute costs are lower. The improvement comes from TNG's Assembly-of-Experts (AoE) method, which merges weight tensors from multiple pre-trained models without additional training. The model combines reasoning capabilities from three DeepSeek variants and is available under the MIT license on Hugging Face, though it is currently limited for function calling and tool use.
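To make the idea of merging weight tensors without training concrete, here is a minimal sketch that takes a weighted average of same-named tensors from several parent models. It is an illustration under stated assumptions, not TNG's actual AoE procedure: the function `merge_weights`, the tensor name `layer0.expert`, the coefficients, and the use of plain Python lists in place of real tensors are all hypothetical.

```python
from typing import Dict, List

Tensor = List[float]  # stand-in for a real weight tensor (assumption)


def merge_weights(parents: List[Dict[str, Tensor]],
                  coeffs: List[float]) -> Dict[str, Tensor]:
    """Weighted average of same-named tensors across parent models.

    Hypothetical helper for illustration; no gradient steps are involved.
    """
    if len(parents) != len(coeffs):
        raise ValueError("need exactly one coefficient per parent model")
    total = sum(coeffs)
    norm = [c / total for c in coeffs]  # normalise so coefficients sum to 1

    merged: Dict[str, Tensor] = {}
    for name in parents[0]:
        tensors = [p[name] for p in parents]  # same tensor from each parent
        merged[name] = [
            sum(w * t[i] for w, t in zip(norm, tensors))
            for i in range(len(tensors[0]))
        ]
    return merged


# Three toy "parent models", each holding one two-element tensor.
parent_a = {"layer0.expert": [1.0, 2.0]}
parent_b = {"layer0.expert": [3.0, 4.0]}
parent_c = {"layer0.expert": [5.0, 6.0]}

merged = merge_weights([parent_a, parent_b, parent_c], [0.5, 0.25, 0.25])
print(merged)
```

The sketch only works when the parents share an architecture, so that tensors with the same name have the same shape. That is consistent with the summary's description of combining related DeepSeek variants.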

From venturebeat.com (7m read time)
Table of contents
How Assembly-of-Experts (AoE) Differs from Mixture-of-Experts (MoE)
Performance and Speed: What the Benchmarks Actually Show
Deployment Considerations and Availability
About TNG Technology Consulting GmbH
What It Means for Enterprise Technical Decision-Makers
