A hands-on comparison of three local LLMs (Gemma 4 E4B, GPT-OSS 20B, and Qwen 3.5 9B) tested on real personal workflows using LM Studio on an RTX 3070 with 8GB of VRAM. GPT-OSS 20B excels at structured reasoning and content generation, but its context window is constrained on lower-VRAM hardware. Qwen 3.5 9B is the most versatile: it handles long context, knowledge tasks, and even image analysis well, making it the go-to general-purpose pick. Gemma 4 E4B stands out for detailed multimodal (visual) analysis, but has an unusual UX in which its reasoning and its response are blended together. The key takeaway: no single local model wins at everything, and rotating between models based on task type, just as with cloud AI, yields the best results.

5m read time · From xda-developers.com
Table of contents
Before we get into it
GPT-OSS 20B pulls ahead with structure
Qwen 3.5 9B is the knowledge and context generalist
Gemma 4 E4B is the multimodal specialist
