A developer describes replacing a ChatGPT Plus subscription with a locally run Qwen2.5-Coder 32B model for coding tasks. Run through Ollama and paired with the Continue extension, the specialized coding model gives focused, concise responses that the author finds better than ChatGPT's for programming work. The tradeoff is hardware: the 32B model needs 20GB of GPU VRAM (the 14B variant needs less). For developers whose AI usage is primarily code-related, the cost savings, privacy benefits, and coding-specific performance make the switch worthwhile.
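
To make the setup concrete, here is a minimal sketch of querying a locally served model over Ollama's HTTP API. It assumes Ollama is running on its default local port and that the model was pulled under the tag `qwen2.5-coder:32b`. The helper names `build_request` and `ask` are hypothetical, and the article itself talks to the model through the Continue extension rather than a script like this.

```python
import json
import urllib.request

# Assumptions: Ollama is serving on its default local address, and the model
# has been pulled under the tag below. Adjust both for your own setup.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "qwen2.5-coder:32b"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = MODEL_TAG, timeout: float = 300.0) -> str:
    """Send the prompt to the local server and return the model's text reply."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, the reply text arrives in a single "response" field.
    return body["response"]
```

With the server running, `ask("Write a function that reverses a linked list.")` returns the model's answer as a string. Swapping `MODEL_TAG` for a 14B tag is the simplest way to fit a smaller GPU.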

3m read time · From xda-developers.com
Table of contents
It earns its keep where I actually spend my time
The case for keeping your subscription
Free and almost as good beats expensive and slightly better
