XDA Developers

A hands-on comparison of three local LLMs (Gemma 4 E4B, GPT-OSS 20B, and Qwen 3.5 9B), tested on real personal workflows using LM Studio on an RTX 3070 with 8GB of VRAM. GPT-OSS 20B excels at structured reasoning and content generation, but its usable context window is constrained on low-VRAM hardware. Qwen 3.5 9B is the most versatile: it handles long context, knowledge tasks, and even image analysis well, making it the go-to general-purpose pick. Gemma 4 E4B stands out for detailed visual and multimodal analysis, but has an unusual UX in which its reasoning and its response are blended together. The key takeaway: no single local model wins at everything, and rotating between models based on task type, just as with cloud AI, yields the best results.

I tested 3 local LLMs on my actual work — and each model won at something different

Qwen 3.5 9B is the knowledge and context generalist