Zero2Text: Zero-Training Cross-Domain Inversion Attacks on Textual Embeddings
Description: A privacy weakness exists in the assumptions underlying vector embeddings used in Retrieval-Augmented Generation (RAG) systems and vector databases. The vulnerability, designated "Zero2Text," allows an unauthenticated attacker to reconstruct raw text from captured vector embeddings without access to the victim model's parameters, gradients, or training data. Unlike prior embedding inversion attacks, which require training large decoders on domain-specific datasets, Zero2Text uses a training-free, recursive online alignment mechanism: the attacker runs a local pre-trained Large Language Model (LLM) to generate token candidates and iteratively refines a linear projection matrix via Ridge Regression, using a limited number of API queries to the victim embedding model. This enables high-fidelity recovery of sensitive cross-domain text (e.g., medical records recovered with only a general-purpose model) through black-box API interaction alone.
Examples: To reproduce the inversion attack against a target embedding $e_v$ (e.g., generated by OpenAI Text-Embedding-3-large):
- Initialization: The attacker instantiates a local LLM generator (e.g., Qwen3-0.6B) and a local embedder (e.g., all-mpnet-base-v2).
- Iterative Reconstruction: For each token step $t$:
- Generation: The local LLM predicts logits $y^t$ for the next token.
- Diversity Filtering: Select $K_S$ (e.g., 1000) candidates whose pairwise cosine similarity falls below a threshold $Th_w$ (e.g., 0.9), keeping the candidate pool diverse.
- Online Optimization:
- Send a subset of candidate sentences ($K_A \times \gamma^{t-1}$) to the victim API to obtain ground-truth embeddings $\tilde{E}^t$.
- Solve a Ridge Regression problem to update the projector matrix $W^t$: $$W^t = (E^{t\top}E^t + \lambda I)^{-1} E^{t\top}\tilde{E}^t$$
- Verification: Score candidates using a hybrid function of the LLM prior and the projected cosine similarity: $$S(e_i, t) = Z(y_i) + \text{conf}_t \times Z(\cos(e_i W^{t-1}, e_v))$$
- Selection: Apply Beam Search (e.g., beam size 10) to select the best partial sequence.
- Result: The process repeats until [EOS] is generated, outputting the reconstructed raw text.
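The online alignment and scoring steps above can be sketched as follows. This is a minimal, self-contained illustration, not the paper's implementation: the victim API and local embedder are mocked with deterministic per-sentence random vectors, and all dimensions, candidate strings, and hyperparameter values are assumptions chosen for the toy run.

```python
import numpy as np

def victim_api(sentences, d_victim=16):
    # Mock stand-in for the black-box victim embedding API (a real attack
    # would query e.g. OpenAI Text-Embedding-3-large over HTTPS).
    return np.stack([
        np.random.default_rng(abs(hash(s)) % 2**32).standard_normal(d_victim)
        for s in sentences
    ])

def local_embed(sentences, d_local=8):
    # Mock stand-in for the attacker's local embedder (e.g. all-mpnet-base-v2).
    return np.stack([
        np.random.default_rng((abs(hash(s)) + 1) % 2**32).standard_normal(d_local)
        for s in sentences
    ])

def update_projector(E, E_tilde, lam=1.0):
    # Ridge Regression closed form: W = (E^T E + lam*I)^(-1) E^T E_tilde,
    # mapping local embeddings into the victim embedding space.
    d = E.shape[1]
    return np.linalg.solve(E.T @ E + lam * np.eye(d), E.T @ E_tilde)

def zscore(x):
    return (x - x.mean()) / (x.std() + 1e-9)

def hybrid_score(logits, E_local, W, e_v, conf):
    # S = Z(y) + conf * Z(cos(e_i W, e_v)): LLM prior plus projected similarity.
    proj = E_local @ W
    cos = proj @ e_v / (np.linalg.norm(proj, axis=1) * np.linalg.norm(e_v) + 1e-9)
    return zscore(logits) + conf * zscore(cos)

# One toy iteration: embed candidates locally, query the "API" for ground
# truth, fit the projector, then rank candidates against the target vector.
candidates = [f"candidate sentence {i}" for i in range(50)]
E = local_embed(candidates)           # local embeddings (50 x 8)
E_tilde = victim_api(candidates)      # ground-truth embeddings (50 x 16)
W = update_projector(E, E_tilde)      # projector (8 x 16)

e_v = victim_api(["the secret target text"])[0]   # captured target embedding
logits = np.random.default_rng(0).standard_normal(len(candidates))  # mock LLM prior
scores = hybrid_score(logits, E, W, e_v, conf=0.5)
best = candidates[int(np.argmax(scores))]
```

In the full attack this single iteration sits inside a per-token beam search, with the API query budget shrinking geometrically ($K_A \times \gamma^{t-1}$) as the projector converges.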
See the methodology in "Zero2Text: Zero-Training Cross-Domain Inversion Attacks on Textual Embeddings".
Impact:
- Data Leakage: Attackers can recover Personally Identifiable Information (PII), trade secrets, and proprietary documents stored as "secure" embeddings in vector databases.
- Privacy Bypass: The attack demonstrates that vector embeddings do not function as non-invertible one-way functions, violating the privacy guarantees of RAG architectures.
- Cross-Domain Exploitation: Sensitive domain-specific data (e.g., medical or legal) can be recovered even if the attacker only possesses a general-purpose language model.
Affected Systems:
- Vector Databases and RAG pipelines exposing embedding vectors.
- Closed-source Embedding APIs (e.g., OpenAI Text-Embedding-3-small/large).
- Open-source Embedding Models (e.g., GTR-Base, Qwen3-Embedding).
Mitigation Steps:
- Access Control: Restrict public access to embedding generation APIs and vector database query endpoints; ensure only authorized applications can submit text for embedding or retrieve raw vectors.
- Traffic Analysis: Implement rate limiting and anomaly detection to identify the specific query patterns of the attack (iterative, recursive batch queries used for online alignment/Ridge Regression updates).
- Note on Standard Defenses: The research indicates that standard defenses such as Random Noise Injection and Local Differential Privacy (LDP) mechanisms (including Normalized Planar Laplace and the Purkayastha Mechanism) are insufficient: Zero2Text retains substantially higher reconstruction fidelity than prior baselines even under heavy noise (e.g., a BLEU-1 score of 20.36).
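For intuition on why noise-based defenses trade utility against protection, the simplest variant can be sketched as Gaussian perturbation of the embedding before release. This is an illustrative assumption, not one of the LDP mechanisms evaluated in the paper; the function name, `sigma`, and dimensions are hypothetical.

```python
import numpy as np

def noisy_embedding(e, sigma=0.05, rng=None):
    # Defense sketch: add isotropic Gaussian noise to the embedding, then
    # re-normalize to the unit sphere before returning it to the client.
    rng = rng or np.random.default_rng()
    noisy = e + rng.normal(0.0, sigma, size=e.shape)
    return noisy / np.linalg.norm(noisy)

rng = np.random.default_rng(0)
e = rng.standard_normal(256)
e /= np.linalg.norm(e)                 # clean unit-norm embedding
e_noisy = noisy_embedding(e, sigma=0.05, rng=rng)
utility = float(e @ e_noisy)           # cosine similarity retained after noising
```

Higher `sigma` lowers `utility` and degrades retrieval quality; the research referenced above finds that noise levels tolerable for RAG retrieval still leave enough signal for Zero2Text to reconstruct text.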
© 2026 Promptfoo. All rights reserved.