Memori Claims 81.95% LoCoMo Accuracy at 4.97% of Full-Context Tokens
Memori hits 81.95% LoCoMo accuracy at just 1,294 tokens/query — 67% smaller prompts than Zep, 20x cheaper than full-context — with an MCP server and a multi-agent attribution model.
Skill-RAG predicts LLM failure via hidden-state probing, retrieves only when needed, and routes failure types to specialized skills — outperforming standard RAG baselines on both efficiency and accuracy.
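The gating idea behind that pipeline can be sketched in a few lines. This is an illustrative sketch, not Skill-RAG's actual implementation: the probe weights, function names, and threshold are all hypothetical, standing in for a classifier trained on hidden states to predict whether the model will answer incorrectly.

```python
import math

def failure_probability(hidden_state, weights, bias):
    """Logistic probe over a hidden-state vector: P(model answers incorrectly).
    In practice the probe would be trained on labeled model outputs; here the
    weights are toy values for illustration."""
    z = sum(w * h for w, h in zip(weights, hidden_state)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def should_retrieve(hidden_state, weights, bias, threshold=0.5):
    """Gate: trigger external retrieval only when failure is predicted,
    so confident queries skip the retrieval cost entirely."""
    return failure_probability(hidden_state, weights, bias) >= threshold

# Toy usage: a "confident" state skips retrieval, an "uncertain" one triggers it.
w, b = [1.0, -1.0], 0.0
print(should_retrieve([-2.0, 2.0], w, b))  # low predicted risk -> False
print(should_retrieve([2.0, -2.0], w, b))  # high predicted risk -> True
```

Routing to specialized skills would then branch on the predicted failure type rather than a single boolean, but the retrieve-only-when-needed gate is the core efficiency lever.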
OpenAI is deprecating text-embedding-3-small, prompting calls to open-source the model so the trillions of tokens already indexed with it remain queryable after retirement.
An analysis of context engineering patterns emerging from 50 production AI deployments — covering RAG architectures, knowledge graph integration, multi-layer memory systems, and the shift from prompt engineering to structured context pipelines.
How leading organizations combine knowledge graphs with LLMs to build AI systems that reason over structured relationships — covering GraphRAG architectures, entity resolution, and the emerging graph-native context engineering paradigm.