CyberSecQwen-4B is a 4B-parameter LLM fine-tuned for cybersecurity threat intelligence tasks (CWE classification, CVE-to-CWE mapping, CTI Q&A) and designed to run locally on a 12 GB consumer GPU. Trained on a single AMD Instinct MI300X using LoRA with FlashAttention-2 and ROCm 7, it outperforms Cisco's Foundation-Sec-Instruct-8B on CTI-MCQ by 8.7 percentage points with half the parameters. The core argument is that defensive security practitioners need small, specialized, locally runnable models: sensitive data cannot leave the premises, air-gapped environments are common, and per-call API costs are prohibitive. A companion 2B model (Gemma4Defense-2B) trained with the same recipe achieves similar results, which suggests the gains come from the recipe rather than from a particular substrate. The model is Apache 2.0 licensed and available on Hugging Face with a live demo.
Table of contents
Why this matters
Why a small specialized model, not just a small model
A 5-minute walkthrough
Why AMD MI300X
The training data
The recipe
Companion model: same recipe, different substrate
Challenges and fixes
Try it yourself
Intended use
What's next
Closing