Abstract Chain-of-Thought Paper Claims 11.6× Fewer Reasoning Tokens
A new paper introduces Abstract Chain-of-Thought (Abstract CoT), a two-stage training approach for efficient reasoning. In stage one, models learn to ground a small vocabulary of reserved abstract tokens in the semantics of real verbal reasoning chains. In stage two, reinforcement learning sharpens those abstract tokens into a model-invented private shorthand. Final answers remain in natural language; only the intermediate reasoning happens in the compressed token space. The method reportedly matches or beats verbal chain-of-thought on math benchmarks, multi-hop QA, and instruction-following tasks while using 11.6× fewer reasoning tokens, and the authors claim the approach generalizes across model families.
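To make the compression concrete, here is a minimal toy sketch of the stage-one idea: pairing a verbal reasoning chain with a much shorter sequence of reserved abstract tokens. This is not the paper's code; the `<ABS_i>` token names, the vocabulary size, and the use of the reported 11.6× ratio as a fixed budget are all illustrative assumptions — in the actual method the mapping is learned end to end, not computed by a rule.

```python
# Toy illustration (not the paper's implementation): decide how many
# reserved abstract tokens a verbal chain would compress into, assuming
# the reported 11.6x average ratio, and emit placeholder token ids.

NUM_ABSTRACT_TOKENS = 64  # size of the reserved abstract vocabulary (assumed)

def abstract_ids_for(verbal_chain: str, compression: float = 11.6) -> list[str]:
    """Allocate a compressed slot count for a verbal reasoning chain.

    A stand-in for the learned mapping: it only decides *how many*
    abstract tokens to use, then emits placeholder ids from the
    reserved vocabulary.
    """
    n_verbal = len(verbal_chain.split())  # crude whitespace token count
    n_abstract = max(1, round(n_verbal / compression))
    return [f"<ABS_{i % NUM_ABSTRACT_TOKENS}>" for i in range(n_abstract)]

chain = ("First compute 12 * 7 = 84. Then subtract 4 to get 80. "
         "Finally divide by 8, giving 10, which is the answer.")
abstract = abstract_ids_for(chain)
print(len(chain.split()), "verbal tokens ->", len(abstract), "abstract tokens")
print(abstract)  # e.g. ['<ABS_0>', '<ABS_1>']
```

The point of the sketch is the budget arithmetic: a 23-token verbal chain collapses to roughly two abstract tokens at the claimed ratio, which is where the order-of-magnitude inference-cost savings would come from.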
Why It Matters
If the 11.6× token reduction holds up across model families and task types, it amounts to an order-of-magnitude cost reduction for any reasoning-heavy agentic workflow: the kind of step change that makes previously cost-prohibitive reasoning chains commercially viable.