LMVD-ID: 4e219d50
Published March 1, 2026

LLM Universal Graph Subversion

Affected Models: Llama 3 8B, Llama 4 17B, Mistral 7B, DeepSeek-V3 671B, Qwen 2.5

Research Paper

Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs

View Paper

Description: An evasion vulnerability in Text-Attributed Graph (TAG) learning models allows attackers to induce targeted misclassifications via LLM-generated, coordinated perturbations to both graph topology and textual semantics. By identifying a semantically distant "influencer" node, an attacker can use a separate LLM to selectively delete highly relevant edges, insert a deceptive edge connecting the target to the influencer, and slightly modify the target node's text to include a keyword aligned with the influencer's category. This creates a stealthy "cross-modal shortcut" that enforces an incorrect label prediction. Because the perturbations are highly localized and budget-constrained (e.g., removing one edge, adding one edge, and shifting text semantics), the attack preserves global graph homophily, allowing it to bypass standard graph defenses and homophily-based anomaly detection in a strictly black-box setting.
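The coordinated edits described above can be sketched in a few lines of plain Python. This is an illustrative reconstruction, not code from the paper: the graph is a simple adjacency dict, and all names (`perturb_target`, `drop_neighbor`, `influencer`, `keyword`) are hypothetical.

```python
# Sketch of the budget-constrained perturbation: one edge deleted, one fake
# edge inserted, one category keyword injected into the target node's text.

def perturb_target(adj, texts, target, influencer, drop_neighbor, keyword):
    """Apply the three coordinated edits that build the cross-modal shortcut."""
    # 1. Delete the edge to the most relevant neighbor (budget: 1 removal).
    adj[target].discard(drop_neighbor)
    adj[drop_neighbor].discard(target)
    # 2. Insert a fake edge to the semantically distant influencer (budget: 1 insertion).
    adj[target].add(influencer)
    adj[influencer].add(target)
    # 3. Shift the target text toward the influencer's category.
    texts[target] = texts[target] + " " + keyword
    return adj, texts

# Toy example: target node 0, its strongest neighbor 1, influencer node 5.
adj = {0: {1, 2}, 1: {0}, 2: {0}, 5: set()}
texts = {0: "Graph convolutional networks for citation data",
         5: "Protein folding with deep learning"}
adj, texts = perturb_target(adj, texts, target=0, influencer=5,
                            drop_neighbor=1, keyword="protein")
```

Because only two edges and a few words change, global homophily statistics stay nearly intact, which is what lets the attack slip past homophily-based defenses.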

Examples: The attack coordinates structural and textual manipulation by prompting an LLM (such as DeepSeek or LLaMA) with the target node's local neighborhood and text.

Topology Manipulation (Identifying edges to delete and insert):

"Step 1: Analyze the target node and its neighboring set. Summarize why the nodes in the neighboring set are adjacent to the target node. Be sure to highlight the most prominent factors that guide their strong correlation. Step 2: From the neighboring set, choose the node that is most relevant to the target node. Let’s break it down step by step to ensure we accurately evaluate the correlation. (Note: The chosen node's edge is deleted) Step 3: Based on the inferred prominent factors from Step 1, exclude the node from the following Candidate List that is least related to the target node and analyze why. (Note: The chosen node is inserted as a new fake edge)"

Textual Manipulation (Aligning target text with the new fake edge):

"Step 1: Given the target node titled {influencer_node_title}, identify one keyword that reflects its category. Step 2: Given the paper P1 titled {target_node_title}, your task is to generate a new paper by modifying P1 title so that it meets the following requirements:

  1. It must retain some of the original words from the P1 title.
  2. It should include the keyword identified in Step 1 and be aligned with the target node category determined in Step 1."
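In practice an attacker would fill these prompts from the graph programmatically before sending them to the LLM. A minimal sketch of that templating step, assuming a hypothetical `build_text_prompt` helper (the wording mirrors the quoted prompt above):

```python
# Hypothetical prompt builder for the textual-manipulation step; the template
# text paraphrases the prompt quoted in this entry.

TEXT_PROMPT = (
    "Step 1: Given the target node titled {influencer_node_title}, identify one "
    "keyword that reflects its category. "
    "Step 2: Given the paper P1 titled {target_node_title}, generate a new paper "
    "by modifying the P1 title so that it retains some original words and "
    "includes the keyword identified in Step 1."
)

def build_text_prompt(influencer_title: str, target_title: str) -> str:
    """Fill the template with the influencer and target node titles."""
    return TEXT_PROMPT.format(
        influencer_node_title=influencer_title,
        target_node_title=target_title,
    )

prompt = build_text_prompt("Protein folding with deep learning",
                           "Graph convolutional networks for citation data")
```

The topology-manipulation prompt would be assembled the same way, with the target's neighbor set and a candidate list serialized into the template.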

Impact: This vulnerability causes severe degradation of node classification accuracy (drops of up to 76.3% in LLM-based reasoners and up to 52% in GNN-based models). It enables an attacker to deterministically control the predicted category of targeted nodes without requiring gradient access or model internals. The attack successfully defeats advanced graph defenses, including adversarial training and edge-reweighting mechanisms (like RUNG).

Affected Systems:

  • Graph Neural Network (GNN) pipelines operating on Text-Attributed Graphs (e.g., GCN, GIN, GraphSAGE, TAGCN, SGCN, R-GCN).
  • LLM-as-Reasoner frameworks (e.g., systems utilizing DeepSeek, Mistral, LLaMA) configured for zero-shot or few-shot node classification tasks on graph data.
  • Note: Nodes with lower degrees (fewer neighbors) are disproportionately susceptible to this vulnerability.

Mitigation Steps:

  • Implement Expressive Text Encoders: Transition from shallow node feature encoding (e.g., TF-IDF) to highly expressive semantic representations (e.g., SBERT, TAPE). The paper demonstrates that richer text features significantly reduce the impact of structural attacks on GNN-based reasoners.
  • Enforce Cross-Modal Consistency Checks: Deploy anomaly detection mechanisms that specifically monitor for suspicious localized alignment between topological shifts (new edge insertions) and semantic drift (keyword insertion in text), as this synergy is the core mechanism of the attack.
  • Apply Low-Degree Node Protections: Introduce stricter confidence thresholds and targeted robust aggregation techniques for low-degree nodes, which exhibit statistically higher vulnerability to localized edge manipulation.
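The cross-modal consistency check above can be approximated with a simple heuristic: flag a newly observed edge when the neighbor's category keywords appear in the target's text while the pair shares no other structural support. This is an illustrative detector under assumed inputs (`adj`, `texts`, `category_keywords` are hypothetical names), not a defense evaluated in the paper.

```python
# Heuristic cross-modal consistency check: a new edge (u, v) is suspicious if
# v's category keywords show up in u's text AND u and v have no common
# neighbors (no independent structural evidence for the edge).

def flag_suspicious_edge(adj, texts, category_keywords, u, v):
    """Return True when edge (u, v) looks like an injected cross-modal shortcut."""
    common_neighbors = (adj[u] & adj[v]) - {u, v}
    keyword_hit = any(kw in texts[u].lower()
                      for kw in category_keywords.get(v, []))
    return keyword_hit and not common_neighbors
```

A deployment would run this over edges added since the last snapshot and route hits to manual review or robust-aggregation fallbacks, with extra weight on low-degree targets per the final mitigation bullet.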

© 2026 Promptfoo. All rights reserved.