mlflow-audit

Static analysis tool to detect missing MLFLOW_ALLOW_PICKLE_DESERIALIZATION guards in Python codebases using MLflow's pickle deserialization.

Background

MLflow uses MLFLOW_ALLOW_PICKLE_DESERIALIZATION as a security control to block unsafe pickle deserialization by default. Every pickle load in the codebase is supposed to check this guard before proceeding.
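To make the guard pattern concrete, here is a minimal sketch of what a guarded pickle load can look like. It is not MLflow's implementation: the helper names (pickle_allowed, guarded_pickle_load), the exception class, and the rule that only "1" or "true" enables loading are assumptions made for illustration. The sketch also assumes the flag is read from an environment variable of the same name.

```python
import os
import pickle


class PickleDeserializationBlocked(RuntimeError):
    """Hypothetical error raised when loading is attempted without opt-in."""


def pickle_allowed() -> bool:
    # Assumption: the guard is an environment variable that is off unless
    # explicitly set to a truthy string.
    value = os.environ.get("MLFLOW_ALLOW_PICKLE_DESERIALIZATION", "false")
    return value.strip().lower() in {"1", "true"}


def guarded_pickle_load(path):
    # Check the guard before the file is opened, so nothing is
    # deserialized when the opt-in flag is absent.
    if not pickle_allowed():
        raise PickleDeserializationBlocked(
            "Pickle deserialization is disabled; set "
            "MLFLOW_ALLOW_PICKLE_DESERIALIZATION to enable it."
        )
    with open(path, "rb") as f:
        return pickle.load(f)
```

A load function that skips the pickle_allowed() check and goes straight to pickle.load is the kind of code path this tool is meant to flag.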

This tool was built after discovering that MLflow's LangChain integration was missing this guard in _load_from_pickle(). The issue was reported through a GitHub private security advisory, GHSA-cxjq-35gw-4m9f.

The finding: the sklearn, tensorflow, pytorch, pmdarima, dspy, and evaluation artifact loaders all check the guard, but the LangChain integration did not. An attacker could craft a malicious LangChain model and achieve remote code execution (RCE) against anyone who loads it, bypassing the security control entirely.

This tool automates the audit so you can check your own MLflow-based codebase for the same class of issue.

Installation

# From the root of a local clone of this repository
pip install -e .

Usage

# Scan a directory
mlflow-audit ./your-mlflow-project

# Scan MLflow source itself
mlflow-audit ./mlflow

# Show guarded loads too
mlflow-audit ./mlflow --show-guarded

Example Output

[*] Scanning 847 Python files in ./mlflow...

mlflow-audit: Pickle Deserialization Guard Scanner
Total pickle calls found : 14
Unguarded (HIGH) : 1
Guarded (OK) : 13

[!] UNGUARDED PICKLE LOADS — REVIEW REQUIRED
[HIGH] ✗ UNGUARDED
File: mlflow/langchain/utils/logging.py:452
Call: cloudpickle.load
Context:
449 | def _load_from_pickle(path):
450 |     with open(path, "rb") as f:

452 |         return cloudpickle.load(f)
453 |

What It Detects

Pickle and cloudpickle load calls that have none of the following guard markers within a 10-line context window:

MLFLOW_ALLOW_PICKLE_DESERIALIZATION
allow_pickle
weights_only
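The check described above can be sketched roughly as follows. This is an illustrative approximation, not the tool's actual code: the function name find_pickle_loads, the regular expression, the decision to match both load and loads, and the reading of "10-line context window" as 10 lines on either side of the call are all assumptions.

```python
import re

# Guard markers listed in this README.
GUARD_MARKERS = (
    "MLFLOW_ALLOW_PICKLE_DESERIALIZATION",
    "allow_pickle",
    "weights_only",
)

# Assumption: match pickle.load(s) and cloudpickle.load(s) call sites.
LOAD_CALL = re.compile(r"\b(?:cloudpickle|pickle)\.loads?\s*\(")

# Assumption: the window spans 10 lines before and after the call.
WINDOW = 10


def find_pickle_loads(source: str):
    """Yield (line_number, guarded) for each pickle/cloudpickle load call."""
    lines = source.splitlines()
    for index, line in enumerate(lines):
        if not LOAD_CALL.search(line):
            continue
        start = max(0, index - WINDOW)
        end = min(len(lines), index + WINDOW + 1)
        context = "\n".join(lines[start:end])
        # A call counts as guarded if any marker appears in the window.
        guarded = any(marker in context for marker in GUARD_MARKERS)
        yield index + 1, guarded


sample = '''
def _load_from_pickle(path):
    with open(path, "rb") as f:
        return cloudpickle.load(f)
'''
print(list(find_pickle_loads(sample)))  # [(4, False)]
```

A text-window heuristic like this is cheap but imprecise: a marker that appears in a nearby comment or in an unrelated function would still count as a guard, so flagged and unflagged results both deserve a manual look.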

What It Does Not Do

This is not a malware scanner. It does not inspect pickle file contents for malicious payloads — tools like PickleScan do that.

This tool specifically audits whether your MLflow integration code consistently applies the security guard that MLflow provides.
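For contrast, content inspection works on the pickle bytes rather than on the source code that loads them. The sketch below uses the standard library's pickletools to list opcodes that import or call objects during unpickling, without ever unpickling the data. It is only an illustration of the idea, not how PickleScan is implemented; the function name risky_opcodes and the opcode set are assumptions.

```python
import pickle
import pickletools

# Opcodes that import names or invoke callables when a pickle is loaded.
SUSPICIOUS_OPCODES = {
    "GLOBAL",
    "STACK_GLOBAL",
    "REDUCE",
    "INST",
    "OBJ",
    "NEWOBJ",
    "NEWOBJ_EX",
}


def risky_opcodes(data: bytes) -> list:
    """Return the import/call opcode names found in a pickle byte string."""
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]


# Plain data structures need no imports or calls to rebuild.
print(risky_opcodes(pickle.dumps([1, 2, 3])))  # []
```

mlflow-audit asks a different question: whether the loading code checks the guard at all, regardless of what a particular pickle file contains.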

Vulnerability Reference

Advisory: GHSA-cxjq-35gw-4m9f
CWE: CWE-502 Deserialization of Untrusted Data
Severity: High
Reporter: Kartik Nair

Author

Kartik Nair — github.com/Carrtik — medium.com/@contact.kartikn
