Improved performance on complex chat, multilingual, reasoning, and agent use cases. Better utilization of available VRAM. Splitting the model between GPU and CPU on macOS. Fixed issues with hanging and errors. New contributors.
From github.com