Hugging Face Releases ml-intern: Autonomous ML Research and Training Agent

Hugging Face has launched ml-intern as a public CLI and web app. It is an autonomous ML research agent that researches papers, walks citation graphs, implements ideas in GPU sandboxes, and iterates on training runs without step-by-step human direction. The agent's first output, nanowhale (a 100M-parameter MoE), raised the score on the GPQA scientific reasoning benchmark from 10% to 32% in under 10 hours and beat Codex on HealthBench by 60% using synthetically generated healthcare training data. First users receive $1,000 in GPU resources plus Anthropic credits.

Why It Matters

ml-intern's GPQA score of 32% exceeds the 22.99% that Claude Code reached on the same task, a gap of about 9 percentage points. This marks a concrete milestone: an autonomous AI research agent outscoring a human-guided coding assistant on structured scientific reasoning.
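
Because percentage comparisons can be read as either absolute (percentage points) or relative (percent change), the short sketch below works through the GPQA figures quoted above both ways. The helper names `point_gain` and `relative_gain` are illustrative only and do not come from ml-intern. The HealthBench "60%" figure is left out because the underlying scores are not given here.

```python
def point_gain(before: float, after: float) -> float:
    """Absolute difference, in percentage points."""
    return after - before


def relative_gain(before: float, after: float) -> float:
    """Change relative to the starting value, as a fraction."""
    return (after - before) / before


# GPQA scores quoted in the text, in percent.
baseline, ml_intern, claude_code = 10.0, 32.0, 22.99

print(f"vs. starting score: +{point_gain(baseline, ml_intern):.2f} points, "
      f"{ml_intern / baseline:.1f}x the starting score")
print(f"vs. Claude Code:    +{point_gain(claude_code, ml_intern):.2f} points, "
      f"{relative_gain(claude_code, ml_intern):.1%} relative")
```

Read this way, the lead over Claude Code is about 9 percentage points, or roughly 39% in relative terms, so which framing a headline number uses matters when comparing results.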