1 article

#distributed-training

Google DeepMind's Decoupled DiLoCo trains a 12B Gemma model across four US regions with mixed TPU generations and self-healing failure recovery.
