Qwen3.5-9B scores 93.8% on 96 real home security AI tests, 4.1 points behind GPT-5.4, running entirely on Apple Silicon. Full benchmark results and methodology.


HomeSec-Bench is a domain-specific benchmark that evaluates LLMs on real home security assistant workflows, with 96 tests across 15 suites. Qwen3.5-9B, running locally on a MacBook Pro M5 via llama.cpp, scores 93.8%, only 4.1 points behind GPT-5.4, while using 13.8 GB of unified memory and generating 25 tok/s. The benchmark covers tool use, security classification, event deduplication, prompt injection resistance, privacy compliance, and more. The key finding is that a 9B local model can come within a few points of frontier cloud performance on specialized tasks, with zero API costs and full data privacy.
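As a rough sanity check on the headline numbers, the percentages can be converted back into test counts. The sketch below assumes every one of the 96 tests is scored pass/fail with equal weight, which the summary does not state. It also assumes the GPT-5.4 score is 93.8 + 4.1 = 97.9%, a figure derived from the stated gap and not quoted directly. The function names are illustrative only.

```python
TOTAL_TESTS = 96  # test count given in the summary


def pass_rate(passed: int, total: int = TOTAL_TESTS) -> float:
    """Percentage of tests passed, rounded to one decimal place."""
    return round(100 * passed / total, 1)


def passes_matching(score: float, total: int = TOTAL_TESTS) -> list[int]:
    """All pass counts whose rounded percentage equals the reported score."""
    return [k for k in range(total + 1) if pass_rate(k, total) == score]


local_score = 93.8                         # Qwen3.5-9B, from the summary
gap = 4.1                                  # stated gap to GPT-5.4
cloud_score = round(local_score + gap, 1)  # assumption: derived, not quoted

print("local:", passes_matching(local_score))
print("cloud:", passes_matching(cloud_score))
```

Under these assumptions, 93.8% corresponds to 90 of 96 tests and 97.9% to 94 of 96, so the 4.1-point gap would amount to four tests. If the benchmark uses weighted or partial-credit scoring, this mapping does not hold.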

HomeSec-Bench — Local AI vs Cloud Benchmark