A hands-on course exploring how to run open-source LLMs (Gemma 4, Kimi, GLM, Qwen, MiniMax) locally and in the cloud for use as coding assistants. The instructor walks through setting up Ollama on Windows/WSL with an RTX 4060, discovering hardware limitations (8 GB of VRAM is insufficient for the 32K context windows Claude Code requires), and evaluating multiple coding harnesses including Claude Code, Codex, Goose CLI, Kilo, and Pi Coding Agent. Key findings: Kimi 2.5 performed best among open models for agentic coding tasks; Claude Code worked surprisingly well with open models via Ollama Cloud; and most consumer hardware cannot run these models locally at the context sizes required. Ollama Cloud ($20-30/month subscription) is recommended as the most practical cloud serving option.
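To make the context-window limitation concrete, here is a minimal sketch of requesting a 32K context from a locally served model through Ollama's REST API (it listens on port 11434 by default). The model name `qwen2.5-coder` is illustrative, not prescribed by the course; substitute whatever model you have pulled. Enlarging `num_ctx` grows the KV cache, which is what overwhelms an 8 GB GPU like the RTX 4060 used in the course.

```python
# Sketch: request a 32K context window from a local Ollama server.
# Assumes Ollama is running and a model has been pulled
# (e.g. `ollama pull qwen2.5-coder`); the model name is illustrative.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",  # Ollama's default local endpoint
    json={
        "model": "qwen2.5-coder",
        "messages": [
            {"role": "user", "content": "Write a Python hello world."}
        ],
        # num_ctx asks for a 32K context window; on an 8 GB-VRAM card the
        # enlarged KV cache typically forces layers off the GPU, which is
        # the hardware limitation the course runs into.
        "options": {"num_ctx": 32768},
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```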