•kache reposted

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex
I am pessimistic on Taalas because I don't see how it scales to large contexts (read @zephyr_z9 on memory). 60 ms to 1K completion with an 8B model is… neat? A neat parlor trick. What *would* excite me is 17K t/s for 2M tokens. <2 minutes for 40 thoughts of 50K tokens each. https://t.co/pitENO6rzY https://t.co/SVO6Pbv0ZC
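
For anyone checking the numbers in the post, here is a rough back-of-the-envelope sketch. It assumes throughput stays constant regardless of context length, which is exactly the assumption the post doubts, and the variable names are illustrative only.

```python
# Figures quoted in the post; names are illustrative, not from any vendor spec.
completion_tokens = 1_000        # "1K completion"
completion_seconds = 0.060       # "60 ms"
implied_rate = completion_tokens / completion_seconds  # tokens per second

target_rate = 17_000             # "17K t/s"
thoughts = 40
tokens_per_thought = 50_000
total_tokens = thoughts * tokens_per_thought

# Assumption: the rate does not degrade as the context grows.
total_seconds = total_tokens / target_rate

print(f"implied rate: {implied_rate:,.0f} t/s")
print(f"total tokens: {total_tokens:,}")
print(f"time at 17K t/s: {total_seconds:.1f} s")
```

So 60 ms per 1K tokens is roughly the same 17K t/s figure, and 40 thoughts of 50K tokens add up to the 2M tokens that would take just under two minutes at that rate.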