Subquadratic, a Miami-based AI startup, has launched its first model, featuring a 12-million-token context window powered by a novel architecture called Subquadratic Selective Attention (SSA). Unlike standard transformer attention, whose cost scales quadratically with context length, SSA is claimed to scale linearly in both compute and memory. The model reportedly runs 52x faster than dense attention at 1M tokens, scores 83% on MRCR v2 (beating GPT-5.5's 74%), and achieves 92.1% on needle-in-a-haystack retrieval at 12M tokens. It also claims 82.4% on SWE-Bench Verified, edging out Anthropic's Opus 4.6 and Google's Gemini 3.1 Pro. The company is launching an API with the full 12M-token window and a CLI coding agent (SubQ Code), with a 50M-token window targeted for Q4. Subquadratic has raised $29M at a $500M valuation. The article notes caveats, including single-run benchmarks and a cautionary parallel with Magic.dev's 100M-token claims, which never materialized publicly.
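Subquadratic has not published SSA's internals, so the following is an illustration only: a minimal NumPy sketch of one common route to subquadratic attention, where each query keeps just its top-k strongest keys, so the softmax and value mix touch O(n·k) entries instead of the O(n²) of dense attention. The function names, the top-k selection rule, and all parameters here are assumptions for illustration, not SSA's actual mechanism.

```python
# Illustrative sketch only: SSA's real mechanism is unpublished.
# Contrasts dense attention (O(n^2) scores) with a "selective" variant
# where each query attends to a fixed top_k subset of keys (O(n * k)).
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dense_attention(q, k, v):
    # Full (n, n) score matrix: quadratic in sequence length n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def selective_attention(q, k, v, top_k=64):
    # Each query keeps only its top_k strongest keys, so the softmax
    # and weighted sum operate on n * top_k entries. (Note: this sketch
    # still computes the full q @ k.T pass for clarity; a genuinely
    # subquadratic method must avoid that too, e.g. via hashing or
    # learned routing.)
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.argpartition(scores, -top_k, axis=-1)[:, -top_k:]  # (n, top_k)
    sel = np.take_along_axis(scores, idx, axis=-1)
    w = softmax(sel)                                            # (n, top_k)
    return np.einsum("nk,nkd->nd", w, v[idx])                   # gather + mix

rng = np.random.default_rng(0)
n, d = 1024, 64
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
print(dense_attention(q, k, v).shape, selective_attention(q, k, v).shape)
```

Whether SSA uses anything like top-k selection is unknown; the sketch only makes the claimed compute/memory scaling gap concrete.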

7 min read · From thenewstack.io
Table of contents
- What came before
- What SSA says it does differently
- The benchmarks
- What Subquadratic is shipping now
- Funding
