Optimal Transport Bot Cloaking
Research Paper
Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection
Description: Graph Neural Network (GNN)-based social bot detection systems are vulnerable to an Optimal Transport (OT)-guided evasion attack that manipulates local graph structures under realistic domain constraints. By modeling $k$-hop ego-neighborhoods as probability measures over spatio-temporal features, an attacker can compute an optimal transport plan to identify "cloak templates" (existing bots near the decision boundary that are misclassified as humans). The attacker then decodes this plan into a sparse set of edge additions and deletions, applicable in both node-editing and node-injection settings. Because this method strictly respects real-world constraints, such as tight edge budgets and the inability to force human "follow-backs", the resulting perturbations bypass graph-structure analysis and cause the GNN to misclassify adversarial bot accounts as legitimate human users.
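The neighborhood-matching step can be sketched with a small entropy-regularised Sinkhorn solver, a standard OT approximation; the paper's exact solver, feature space, and parameters are not specified here, so the features, regularisation strength, and sizes below are purely illustrative:

```python
import numpy as np

def sinkhorn_plan(X, Y, reg=1.0, n_iters=500):
    """Entropy-regularised OT plan between two empirical measures.

    X: (n, d) features of the target bot's ego-neighborhood.
    Y: (m, d) features of a candidate cloak template's neighborhood.
    Returns an (n, m) transport plan coupling the two uniform measures.
    """
    n, m = len(X), len(Y)
    a = np.full(n, 1.0 / n)  # uniform weight on each neighbor
    b = np.full(m, 1.0 / m)
    # Squared-Euclidean ground cost between neighborhood feature vectors.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    K = np.exp(-C / reg)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):  # Sinkhorn fixed-point iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Hypothetical 3-neighbor vs. 2-neighbor ego-neighborhoods with 2-D features.
rng = np.random.default_rng(0)
P = sinkhorn_plan(rng.normal(size=(3, 2)), rng.normal(size=(2, 2)))
```

High-mass entries of `P` indicate which template neighbors each target neighbor should be "moved" toward, which is what the decoding step then realises as concrete edge edits.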
Examples: A concrete case study from the TwiBot-22 dataset evaluated against a BotRGCN detector (detailed in Appendix U):
- Target: Bot node ID 81779 (total degree 5: 2 in-neighbors [bots], 3 out-neighbors [2 bots, 1 human], account age ~5.59 years). Initially correctly classified as a "bot".
- Cloak Template: Bot node ID 41575 (sparse 1-hop neighborhood, dominated by outgoing edges, account age ~2.64 years).
- Attack Execution: The attacker wipes the target's original incident edges and adds exactly 4 constraint-matched edges guided by the OT plan: 2 incoming edges from helper bots and 2 outgoing edges (1 to a bot, 1 to a human).
- Result: The minimal relative perturbation overwhelms the local neighborhood evidence, causing the BotRGCN detector to flip the prediction of node 81779 from "bot" to "human".
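The decoding step above can be sketched as follows, assuming hypothetical helper pools and node IDs; only node `81779` and the wipe-then-add 2-in/2-out edit pattern come from the case study, everything else is illustrative:

```python
# Hypothetical sketch of decoding a cloak template into edge edits.
# Edges are (src, dst) pairs; the additions mirror the template's
# in/out profile by neighbor class.

def decode_edits(target, old_edges, template_profile, helpers):
    """Return (deletions, additions) realising the template profile.

    template_profile: e.g. {"in": ["bot", "bot"], "out": ["bot", "human"]}
    helpers: dict mapping class -> available node IDs. Incoming edges are
    drawn only from attacker-controlled bots, so no human "follow-back"
    is ever required.
    """
    deletions = [e for e in old_edges if target in e]  # wipe incident edges
    additions = []
    pools = {k: list(v) for k, v in helpers.items()}
    for cls in template_profile["in"]:   # incoming edges toward the target
        additions.append((pools[cls].pop(), target))
    for cls in template_profile["out"]:  # outgoing edges from the target
        additions.append((target, pools[cls].pop()))
    return deletions, additions

dels, adds = decode_edits(
    target=81779,
    old_edges=[(10, 81779), (11, 81779), (81779, 12), (81779, 13), (81779, 14)],
    template_profile={"in": ["bot", "bot"], "out": ["bot", "human"]},
    helpers={"bot": [201, 202, 203], "human": [301]},
)
# 5 deletions and 4 additions, matching the case study's edit pattern.
```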
Impact: Adversaries can inject new bot accounts or cloak existing ones with near-perfect evasion success (reaching 90–99%+ misclassification rates on benchmarks like Cresci-15 and TwiBot-22) using minimal edge budgets (e.g., $\Delta \le 5$ edge edits). The attack successfully bypasses state-of-the-art (SOTA) bot detectors and prominent GNN adversarial defenses (including GNNGuard, GRAND, and RobustGCN), allowing malicious botnets to operate undetected, distort engagement, and amplify misinformation at scale.
Affected Systems:
- GNN-based node classification models deployed for social bot detection that rely on local neighborhood aggregation.
- Specific vulnerable architectures explicitly tested include Heterogeneous GNNs (BotRGCN, Simple-HGNN, Relational Graph Transformer [RGT]) and standard GNNs (GCN, GAT).
- Defense variants utilizing feature similarity pruning (GNNGuard), stochastic regularization (GRAND), or variance propagation (RobustGCN) remain vulnerable.
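To see why reliance on local neighborhood aggregation makes these models fragile, consider a toy one-layer mean-aggregation readout. This is not any of the listed architectures; the weights and features are invented for illustration. Rewriting a handful of incident edges replaces essentially all of the evidence the layer sees for a low-degree node:

```python
import numpy as np

# Toy GCN-style layer: mean over the node and its neighbors, then a
# linear readout. Scores > 0 read as "bot". Weights are hypothetical.
w = np.array([1.0, -1.0])

def score(node_feat, neighbor_feats):
    h = np.vstack([node_feat, neighbor_feats]).mean(axis=0)  # mean aggregation
    return float(w @ h)

bot_feat = np.array([0.9, 0.1])                            # bot-like features
bot_neighbors = np.array([[0.8, 0.2]] * 4 + [[0.2, 0.8]])  # original wiring
cloak_neighbors = np.array([[0.1, 0.9]] * 4)               # cloaked wiring

before = score(bot_feat, bot_neighbors)    # positive: flagged as bot
after = score(bot_feat, cloak_neighbors)   # negative: passes as human
```

The node's own features never change; only the incident edges do, yet the aggregated representation (and hence the prediction) flips.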
Mitigation Steps:
- Ego-Neighborhood Distribution Checking: Deploy explicit "cloak detectors" that compute the $k$-hop ego-neighborhood distribution of an account and measure its divergence against known human and bot reference distributions, rather than relying solely on standard GNN logit outputs.
- Constraint-Aware Adversarial Hardening: Train detection models using adversarial training that simulates feasible, domain-constrained perturbations (e.g., incident-edge edits bounded by strict budgets, directionality rules, and temporal/age plausibility) rather than unconstrained global rewiring.
- Signal Diversification: Complement topological graph structure analysis with temporally constant behavioral and content signals that are difficult for an adversary to manipulate or maintain consistently under long-lived evasion constraints.
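The first mitigation (ego-neighborhood distribution checking) can be sketched as follows, assuming a single 1-D per-neighbor feature (e.g., account age) and illustrative reference samples; a real deployment would compare full spatio-temporal feature distributions:

```python
import numpy as np

def w1(x, y, q=np.linspace(0.0, 1.0, 101)):
    """1-D Wasserstein-1 distance approximated via matched quantiles."""
    return float(np.abs(np.quantile(x, q) - np.quantile(y, q)).mean())

def flag_cloak(ego_feats, human_ref, bot_ref, margin=0.0):
    """Flag an account whose ego-neighborhood distribution sits closer
    to the bot reference than the human reference, independently of the
    GNN's own logit output."""
    return w1(ego_feats, bot_ref) + margin < w1(ego_feats, human_ref)

# Illustrative reference distributions (account ages in years).
rng = np.random.default_rng(1)
human_ref = rng.normal(5.0, 1.0, 500)
bot_ref = rng.normal(1.0, 0.5, 500)

cloaked_ego = rng.normal(1.2, 0.6, 20)  # bot-like neighborhood ages
benign_ego = rng.normal(4.8, 1.0, 20)   # human-like neighborhood ages
```

The `margin` parameter trades false positives for recall; the thresholds and feature choice here are assumptions, not values from the paper.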
© 2026 Promptfoo. All rights reserved.