Sub Quadratic Launches subQ — Big Architecture Bet, Thin Evidence
Sub Quadratic dropped its "subQ" model with a striking headline: a 12-million-token context window powered by sparse attention, claiming 52× better compute efficiency than FlashAttention at 1M tokens, priced at under 5% of Anthropic's Opus. The architecture thesis is technically legitimate — sparse attention pre-selects semantically relevant tokens from anywhere in the context rather than scoping to a local window, avoiding the quadratic compute blowup that caps dense attention. If the claims bear out, it would matter for both cloud inference costs and local model viability.
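To make the efficiency argument concrete, here is a minimal sketch of the idea — not subQ's actual implementation, whose details are unpublished. It contrasts dense attention, which scores every query against every key (n² scores), with a top-k sparse variant in which each query pre-selects only its k most relevant keys from anywhere in the sequence (n·k scores). All names and parameters are illustrative.

```python
import numpy as np

def dense_attention(Q, K, V):
    # Dense attention: every query attends to every key -> n x n score matrix.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def topk_sparse_attention(Q, K, V, k):
    # Sparse attention (illustrative): each query pre-selects its k most
    # similar keys anywhere in the sequence, then attends only to those,
    # so only n * k scores are computed instead of n * n.
    n, d = Q.shape
    out = np.empty_like(Q)
    for i in range(n):
        scores = Q[i] @ K.T / np.sqrt(d)
        idx = np.argpartition(scores, -k)[-k:]  # k most relevant tokens
        s = scores[idx]
        w = np.exp(s - s.max())
        w /= w.sum()
        out[i] = w @ V[idx]
    return out

rng = np.random.default_rng(0)
n, d, k = 64, 16, 8
Q, K, V = rng.standard_normal((3, n, d))
dense = dense_attention(Q, K, V)
sparse = topk_sparse_attention(Q, K, V, k)
```

With k held small as n grows, the score count scales linearly in sequence length rather than quadratically — the property that would make multi-million-token contexts tractable, assuming the pre-selection step itself is cheap.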
What the Source Actually Says
Tim Carambat (AnythingLLM) published a technical audit on launch day. His central finding: every published benchmark tests the 1M-preview model, not the headline 12M model. The 12M model had no public benchmarks and no early access — Carambat applied and expected to receive only the 1M-preview.
On SWE-Bench Verified, the 1M-preview scored 81.8 — competitive with frontier models, though Opus 4.7 scored higher. On MRCRv2 long-context retrieval at 1M tokens, Carambat spotted a direct inconsistency: the launch video shows 62% while the company's website shows 65.9% for the same test. The video also omitted the Opus 4.6 and GPT-5.5 comparison rows visible in the website's table — presenting the benchmark more favorably by leaving out higher-scoring rivals.
No technical report accompanied the launch. The "98% accuracy" framing amplified on social media is not traceable to any published benchmark artifact. Carambat's broader framing is cautious optimism: DeepSeek v4 shipped hybrid attention the prior week with similar long-context efficiency goals, confirming the direction is real even if subQ's specific claims remain unverified.
Strategic Take
Sparse attention is a credible long-context efficiency path — the convergence with DeepSeek v4's hybrid approach signals a genuine industry trend, not a one-off claim. But subQ's launch mixes real architectural ambition with unverified headline numbers and measurable benchmark inconsistencies. Hold off on roadmap commitments until an independent evaluation of the actual 12M model is available.


