A scientific note on what the current evidence actually supports
This note separates three questions that are easy to conflate in prefix-cache work: exact-prefix serving fairness against SGLang's RadixCache, server-side long-context correctness, and tiered compression in the Hugging Face Qwen path. The current evidence supports a narrow but re...