Improved performance on complex chat, multilingual, reasoning, and agent use cases. Better utilization of available VRAM. Splitting the model between GPU and CPU on macOS. Fixed issues with hanging and errors. New contributors.

From github.com
