"Memory Laundering": Toxic Context Survives AI Summarization Below Detector Thresholds

New research from Wang et al. demonstrates that toxic or adversarially hostile context injected into memory-augmented agents can survive the summarization step. The resulting compressed memory entries fall below standard toxicity detector thresholds, yet retain enough hostile framing to influence downstream generations. The paper introduces the sub-threshold propagation gap (SPG) as a formal metric. Critically, applying sanitization to the finished summary is insufficient: the hostile influence is preserved through the laundering process, so only sanitization applied before summarization prevents propagation.

Why It Matters

This directly affects any pipeline that compresses conversation history, news items, or agent outputs into memory stores before re-ingestion. For any production agent memory system, pre-compaction sanitization is now a required step, not an optional refinement.
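
As a rough illustration of where the sanitization step sits, the sketch below filters raw items before they reach the summarizer, rather than scoring the finished summary. It is not the method from the paper. The names toy_toxicity_score, toy_summarize, compact_with_presanitization, the keyword list, and the 0.1 threshold are all hypothetical placeholders; a real pipeline would substitute its own classifier and summarization model.

```python
from typing import Callable, List

# Hypothetical keyword list standing in for a real toxicity classifier.
HOSTILE_MARKERS = {"idiot", "worthless", "destroy"}


def toy_toxicity_score(text: str) -> float:
    """Placeholder scorer: fraction of words that match the keyword list."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in HOSTILE_MARKERS)
    return hits / len(words)


def toy_summarize(items: List[str]) -> str:
    """Placeholder summarizer: keeps the first six words of each item."""
    return " | ".join(" ".join(item.split()[:6]) for item in items)


def compact_with_presanitization(
    items: List[str],
    score: Callable[[str], float],
    summarize: Callable[[List[str]], str],
    threshold: float = 0.1,  # assumed cutoff, chosen only for this example
) -> str:
    """Drop items at or above the threshold, then summarize what remains."""
    kept = [item for item in items if score(item) < threshold]
    return summarize(kept)


if __name__ == "__main__":
    history = [
        "The deploy finished at noon without errors.",
        "You are a worthless idiot and I will destroy you.",
    ]
    memory_entry = compact_with_presanitization(
        history, toy_toxicity_score, toy_summarize
    )
    print(memory_entry)
```

The design point is the ordering: the scorer sees each raw item at full strength, before compression can dilute it below the detector threshold. Dropping flagged items outright is only one option; redacting or rewriting them before summarization would occupy the same position in the pipeline.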