Meta FAIR Self-Improving Pretraining: 36.2% Factuality Gain at Training Time
Meta FAIR has published a Self-Improving Pretraining approach that uses a strong post-trained model as both a rewriter and an evaluator during pretraining of the next-generation model. Rather than optimizing for next-token prediction alone, the method uses RL-shaped sequence generation guided by the post-trained judge. Reported results: a 36.2% factuality gain, an 18.5% safety gain, and an 86.3% generation-quality win rate over standard pretraining baselines. The approach implements the self-improvement loop at the pretraining layer, where behavioral commitments actually lock in, rather than only at fine-tuning or RL post-training.
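The summary above does not specify the training objective, so what follows is a minimal sketch of the judge-guided, RL-shaped generation idea under stated assumptions: a tiny autoregressive model samples sequences, a placeholder judge_score function stands in for the strong post-trained evaluator, and a REINFORCE update with a batch-mean baseline reinforces high-scoring sequences. All names here (ToyLM, judge_score, the toy reward) are illustrative, not the paper's API, and the corpus-rewriting half of the loop is omitted for brevity.

```python
# Sketch only: REINFORCE-style pretraining shaped by a judge's reward,
# not Meta FAIR's actual implementation.
import torch
import torch.nn as nn

VOCAB, HID, SEQ_LEN = 64, 128, 32

class ToyLM(nn.Module):
    """Tiny autoregressive LM standing in for the model being pretrained."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HID)
        self.rnn = nn.GRU(HID, HID, batch_first=True)
        self.head = nn.Linear(HID, VOCAB)

    def sample(self, batch_size):
        """Sample token sequences; return them with per-sequence log-probs."""
        tokens = torch.zeros(batch_size, 1, dtype=torch.long)  # BOS = 0
        log_probs = torch.zeros(batch_size)
        hidden, seq = None, []
        for _ in range(SEQ_LEN):
            emb = self.embed(tokens[:, -1:])
            out, hidden = self.rnn(emb, hidden)
            dist = torch.distributions.Categorical(logits=self.head(out[:, -1]))
            nxt = dist.sample()
            log_probs = log_probs + dist.log_prob(nxt)
            seq.append(nxt)
            tokens = torch.cat([tokens, nxt.unsqueeze(1)], dim=1)
        return torch.stack(seq, dim=1), log_probs

def judge_score(sequences):
    """Placeholder for the strong post-trained judge. Here it merely rewards
    token diversity; the real judge scores factuality, safety, and quality."""
    return torch.tensor([seq.unique().numel() / SEQ_LEN for seq in sequences])

model = ToyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    seqs, logp = model.sample(batch_size=16)
    rewards = judge_score(seqs)
    advantage = rewards - rewards.mean()   # simple batch-mean baseline
    loss = -(advantage * logp).mean()      # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 20 == 0:
        print(f"step {step:3d}  mean reward {rewards.mean():.3f}")
```

The batch-mean baseline is just the simplest variance-reduction choice; the paper's actual RL algorithm, reward mixture, and how the judge's scores are combined with the rewritten corpus may differ.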
Why It Matters
Applying self-improvement at the pretraining stage is structurally different from RLHF or SFT corrections applied afterward. Factuality and safety improvements baked in during pretraining are more durable than post-hoc patches, and they propagate to every fine-tuned descendant of the base model.