fix(security): replace pickle-based torch.save/load with safetensors #9
Merged
Replace all torch.save/torch.load(weights_only=False) usage with the safetensors library to prevent arbitrary code execution via pickle deserialization of untrusted model files. The new checkpoint format stores tensors in safetensors binary format and non-tensor metadata as JSON in the safetensors header. No legacy .pth/.pkl loading is retained; all torch.load calls are removed.
- Add anomaly_match/data_io/checkpoint_io.py with save_checkpoint() and load_checkpoint() functions
- Convert test_model.pth fixture to test_model.safetensors
- Update all file extension references from .pth to .safetensors
- Add safetensors to dependencies
- Add unit tests for checkpoint_io (round-trip, security)
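For readers unfamiliar with the format, the following is a minimal standard-library sketch of the safetensors on-disk layout: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. It is not the code in checkpoint_io.py and does not use the safetensors library; the function names and the "checkpoint_meta" key are hypothetical. It assumes, as the format requires, that the `__metadata__` map holds string values only, so structured metadata is JSON-encoded into a string first.

```python
import json
import struct


def write_safetensors_like(path, tensors, metadata):
    """Write a file in the safetensors layout (illustrative helper, not the library).

    tensors:  name -> (dtype string, shape list, raw bytes)
    metadata: str -> str map stored under the "__metadata__" header key
    """
    header = {"__metadata__": metadata}
    payload = b""
    for name, (dtype, shape, raw) in tensors.items():
        start = len(payload)
        payload += raw
        # Offsets are relative to the start of the byte buffer after the header.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [start, len(payload)],
        }
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header size
        f.write(header_bytes)
        f.write(payload)


def read_header(path):
    """Read only the JSON header; nothing in the file is ever executed."""
    with open(path, "rb") as f:
        (size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(size).decode("utf-8"))


def read_checkpoint_meta(path, key="checkpoint_meta"):
    """Decode the JSON string stored under a (hypothetical) metadata key."""
    return json.loads(read_header(path)["__metadata__"][key])
```

A caller would pass something like `{"checkpoint_meta": json.dumps({"epoch": 3})}` as the metadata and get the dictionary back from `read_checkpoint_meta`. Loading involves only `struct.unpack` and `json.loads`, which is why this kind of layout avoids the pickle risk.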
giusgal
reviewed
Mar 25, 2026
Comment on lines 187 to 223
Collaborator
It seems like this function is not used anymore (session.py only uses save_model)
Collaborator
Author
Good catch. Removed both save_model_checkpoint and load_model_checkpoint — neither had production callers (session.py only uses save_model/load_model). Also removed their tests and vulture whitelist entries. See adcf233.
Remove save_model_checkpoint and load_model_checkpoint from SessionIOHandler — neither is called from production code (session.py only uses save_model/load_model). Remove their tests and vulture whitelist entries.
giusgal
approved these changes
Mar 27, 2026
Summary
- Replace all torch.save/torch.load(weights_only=False) usage with safetensors to eliminate arbitrary code execution risk from pickle deserialization
- New checkpoint_io.py module handles all checkpoint serialization: tensors in safetensors binary format, metadata as JSON in the header
- No legacy .pth/.pkl loading retained; all torch.load calls removed entirely
- Convert test_model.pth fixture to test_model.safetensors
Security
The safetensors format stores only raw tensor bytes and plain JSON strings: no pickle, no arbitrary object deserialization, and no code execution on load. The _checkpoint_object_hook restores enum values via name lookup only (not eval/exec).
Test plan
- Unit tests for checkpoint_io.py (round-trips, optimizer state, security)
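The name-lookup approach described under Security could look roughly like the sketch below. It is an illustration only, not the PR's _checkpoint_object_hook: the `Label` enum, the `"__enum__"` wire key, the allow-list, and both function names are assumptions made for the example. The point it shows is that an enum member is recovered by indexing a known enum class with a string, so untrusted input can at worst raise a KeyError.

```python
import json
from enum import Enum


class Label(Enum):  # hypothetical enum standing in for the project's own
    NORMAL = 0
    ANOMALY = 1


# Explicit allow-list: only these enum classes can ever be reconstructed.
_ALLOWED_ENUMS = {"Label": Label}


def encode_default(obj):
    """json.dumps fallback: store an enum as its class name and member name."""
    if isinstance(obj, Enum):
        return {"__enum__": type(obj).__name__, "name": obj.name}
    raise TypeError(f"not JSON serializable: {type(obj).__name__}")


def checkpoint_object_hook(d):
    """json.loads hook: rebuild enums by name lookup, never by eval/exec."""
    if "__enum__" in d:
        enum_cls = _ALLOWED_ENUMS[d["__enum__"]]  # KeyError for unknown classes
        return enum_cls[d["name"]]  # KeyError for unknown member names
    return d


def roundtrip(meta):
    """Encode metadata to a JSON string and decode it back."""
    text = json.dumps(meta, default=encode_default)
    return json.loads(text, object_hook=checkpoint_object_hook)
```

With these definitions, `roundtrip({"label": Label.ANOMALY, "epoch": 2})` returns an equal dictionary, while a payload naming a class outside the allow-list fails with KeyError instead of importing or calling anything.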