Efficient KV-Cache Prompt Leakage
Research Paper
OptiLeak: Efficient Prompt Reconstruction via Reinforcement Learning in Multi-tenant LLM Services
Description: A vulnerability in multi-tenant LLM serving frameworks allows attackers to reconstruct the private prompts of other users via an active Key-Value (KV) cache side channel. Frameworks that share KV caches across users and employ prefix-aware scheduling policies, such as Longest Prefix Match (LPM), prioritize waiting requests by the length of their matched prefix tokens. An attacker can exploit this by iteratively sending batches of guessed tokens mixed with dummy queries. If a guessed token matches a victim's cached prompt, the cache hit causes the scheduling engine to prioritize the response, creating a measurable gap in response ordering or Time to First Token (TTFT). By using a locally optimized model to generate high-probability, domain-specific guesses, an attacker can efficiently reconstruct sensitive user prompts token by token.
Examples: To reconstruct a victim's prompt $s^{\text{victim}}$ at token position $t$, an attacker can execute the following exploit:
- Generate a set of candidate token guesses $Q_{\text{gen}}$ for the next token using a local adversarial model fine-tuned on the target domain.
- Construct a batch of queries $Q = \{\tilde{q}^1, \ldots, \tilde{q}^k\}$ consisting of the $Q_{\text{gen}}$ candidate queries placed between two groups of $m$ dummy queries. All generated queries share the known prefix $q_{<t}$ but differ in the final token $q_t$. Dummy queries use an intentionally low-probability token.
- Send the entire batch $Q$ to the targeted LLM server.
- Observe the response sequence position (or TTFT) of each query. Under an LPM scheduling policy, a correct token guess will hit the KV cache and be prioritized over the dummy queries.
- Identify a cache hit using the detection formula: $\left[\min_{\tilde{q}\in Q_{\text{gen}}\setminus\{\tilde{q}^{i}\}} \text{pos}(\tilde{q})\right] - \text{pos}(\tilde{q}^{i}) > \lceil\theta\cdot m\rceil$. If the positional gap exceeds the threshold, the final token of $\tilde{q}^i$ is confirmed as the victim's actual token.
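The detection step above can be sketched as follows. This is a minimal illustration, assuming the attacker has already sent the batch and recorded each candidate query's position in the response order; the function and variable names are illustrative, not taken from the paper:

```python
import math

def detect_cache_hit(positions, num_dummies, theta=0.5):
    """Return the candidate token whose response was prioritized far
    enough ahead of every other candidate to indicate a KV-cache hit,
    i.e. min over other candidates of pos minus pos(candidate)
    exceeds ceil(theta * m); return None if no gap clears it.
    `positions` maps each candidate's final-token guess to its observed
    position in the response sequence (0 = answered first).
    """
    threshold = math.ceil(theta * num_dummies)
    for token, pos in positions.items():
        # Earliest position among the *other* candidate guesses.
        others = [p for t, p in positions.items() if t != token]
        if not others:
            continue
        if min(others) - pos > threshold:
            return token
    return None

# Hypothetical run: "aspirin" hits the cache and is answered first,
# while the other candidates trail behind the m = 4 dummies.
positions = {"aspirin": 0, "ibuprofen": 9, "insulin": 10, "statin": 11}
hit = detect_cache_hit(positions, num_dummies=4)  # threshold = 2
```

In a real attack the positions would come from observing response order (or ranking TTFT measurements) for the batch, and the confirmed token would be appended to the known prefix $q_{<t}$ before the next iteration.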
Impact: Unauthorized disclosure of sensitive information. Attackers sharing a multi-tenant LLM infrastructure (e.g., in a cloud environment) can accurately reconstruct the proprietary or confidential queries (such as medical symptoms or financial data) submitted by other tenants. The use of reinforcement-learning-optimized models reduces the attack cost (Average Requests Per Token) by up to 12.48x, making real-world prompt extraction highly efficient and practical.
Affected Systems:
- Multi-tenant LLM inference and serving engines employing shared KV-cache pools across disparate users.
- Frameworks utilizing Longest Prefix Match (LPM) scheduling policies (e.g., SGLang).
- Frameworks that support token-level KV cache matching and exhibit observable Time-To-First-Token (TTFT) or First-Come-First-Served (FCFS) priority side channels (e.g., vLLM, LightLLM).
- Multi-tenant semantic cache implementations (e.g., GPTCache).
Mitigation Steps:
- User-level Cache Isolation: Restrict KV-cache sharing strictly to intra-tenant requests, completely eliminating cross-tenant cache sharing.
- Selective KV-cache Sharing: Implement fine-grained permission management for cache resources, only allowing cache sharing among explicitly authorized user groups.
- Response Timing Obfuscation: Inject random timing noise into LLM service responses to obscure TTFT and disrupt response ordering signals, neutralizing the attacker's ability to reliably detect cache hits.
- Proactive Cache Eviction: Deploy an automated risk-assessment simulator (e.g., an internal model running the OptiLeak pipeline) to evaluate newly cached entries, and automatically evict KV-cache entries that the simulator can reconstruct in fewer than a threshold number of request attempts.
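As a rough illustration of the user-level cache isolation mitigation, the sketch below keys a prefix cache on the tenant ID as well as the token prefix, so a lookup from one tenant can never hit another tenant's cached entries. The class and method names here are hypothetical and do not correspond to any real framework's API:

```python
import hashlib

class TenantIsolatedPrefixCache:
    """Toy prefix cache whose keys bind the tenant ID to the token
    prefix, eliminating cross-tenant KV-cache sharing entirely."""

    def __init__(self):
        self._entries = {}

    def _key(self, tenant_id, token_ids):
        # Derive the cache key from both tenant and prefix, so identical
        # prefixes from different tenants map to different entries.
        h = hashlib.sha256()
        h.update(tenant_id.encode("utf-8"))
        h.update(repr(tuple(token_ids)).encode("utf-8"))
        return h.hexdigest()

    def lookup(self, tenant_id, token_ids):
        # A miss here (None) forces full prefill for this tenant, even
        # if another tenant already cached the same prefix.
        return self._entries.get(self._key(tenant_id, token_ids))

    def insert(self, tenant_id, token_ids, kv_blocks):
        self._entries[self._key(tenant_id, token_ids)] = kv_blocks
```

Because a guessed prefix from the attacker's tenant can no longer hit the victim's cached blocks, the scheduling and TTFT signals the attack relies on carry no cross-tenant information, at the cost of lower overall cache-hit rates.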
© 2026 Promptfoo. All rights reserved.