Security: Unsafe torch.load() without weights_only=True creates arbitrary code execution vulnerability #22

@rajpratham1

Summary

The codebase contains an unsafe use of torch.load() without the weights_only=True parameter in lingbot_map/aggregator/base.py at line 222, which creates a critical security vulnerability allowing arbitrary code execution when loading pretrained weights.

Vulnerability Details

Location: lingbot_map/aggregator/base.py, line 222

Current Code:

try:
    ckpt = torch.load(pretrained_path)  # ⚠️ UNSAFE
    del ckpt['pos_embed']
    logger.info("Loading pretrained weights for DINOv2")

Issue:
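PyTorch's torch.load() deserializes checkpoints with Python's pickle module, and unless weights_only=True is in effect, unpickling can execute arbitrary code. On PyTorch versions before 2.6, weights_only defaults to False, so the call above runs the unrestricted unpickler. This is a well-known security risk documented in the warning section of the torch.load() documentation.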
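To illustrate the mechanism, here is a minimal standard-library sketch of the idea behind a restricted load: an unpickler that only resolves globals on an allow-list. It is not PyTorch's implementation. `AllowListUnpickler`, `restricted_loads`, `ALLOWED_GLOBALS`, and `Payload` are hypothetical names, and the payload only calls print() as a harmless stand-in for an exploit.

```python
import collections
import io
import pickle

# Hypothetical allow-list for this sketch; PyTorch's real allow-list differs.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize data, raising UnpicklingError on disallowed globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()


class Payload:
    """Harmless stand-in for an exploit: unpickling it calls print()."""

    def __reduce__(self):
        return (print, ("this ran during unpickling",))


if __name__ == "__main__":
    blob = pickle.dumps(Payload())
    pickle.loads(blob)  # plain pickle calls print() while loading
    try:
        restricted_loads(blob)  # the allow-list refuses builtins.print
    except pickle.UnpicklingError as exc:
        print(f"rejected: {exc}")
    # Plain containers of numbers still load through the restricted path.
    state = collections.OrderedDict(weight=[1.0, 2.0])
    assert restricted_loads(pickle.dumps(state)) == state
```

The point of the sketch is that the dangerous step is resolving and calling arbitrary globals; a loader that refuses unknown globals cannot be steered into os.system or similar.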

Impact

Severity: HIGH

An attacker could:

  1. Create a malicious checkpoint file that executes arbitrary code when loaded
  2. Distribute it as a "pretrained model"
  3. Gain full system access when users load the checkpoint

This is especially dangerous because:

  • The code loads external pretrained weights from user-specified paths
  • Users may download checkpoints from untrusted sources
  • The vulnerability affects the core model initialization

Affected Code Locations

  1. lingbot_map/aggregator/base.py:222 (Critical)

    ckpt = torch.load(pretrained_path)

  2. demo.py:130 (Explicit opt-out, still unsafe for untrusted files)

    ckpt = torch.load(args.model_path, map_location=device, weights_only=False)

    Note: This call sets weights_only=False explicitly, which is intentional for backward compatibility. It does not mitigate the risk, so the choice should be documented next to the call.
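A rough way to find other call sites like these is to scan the source with Python's ast module. The sketch below is illustrative only: `find_unsafe_torch_load` is a hypothetical helper, not part of the repository, and it only recognizes calls spelled literally as torch.load(...), so aliased imports would be missed.

```python
import ast


def find_unsafe_torch_load(source, filename="<string>"):
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal spelling `torch.load(...)`.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # A call counts as safe only if weights_only is the literal True.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            hits.append(node.lineno)
    return sorted(hits)


if __name__ == "__main__":
    from pathlib import Path

    # Scan every Python file under the current directory.
    for path in Path(".").rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        try:
            for lineno in find_unsafe_torch_load(text, str(path)):
                print(f"{path}:{lineno}")
        except SyntaxError:
            continue  # skip files that do not parse
```

Run from the repository root, this would be expected to flag both locations listed above, since neither passes weights_only=True.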

Recommended Fix

Option 1: Use weights_only=True (Recommended)

try:
    ckpt = torch.load(pretrained_path, weights_only=True)
    del ckpt['pos_embed']
    logger.info("Loading pretrained weights for DINOv2")

Option 2: Add explicit warning and validation

try:
    # Security: Only load checkpoints from trusted sources
    # Using weights_only=False for backward compatibility with older checkpoints
    logger.warning(f"Loading checkpoint from {pretrained_path}. Only use trusted sources!")
    ckpt = torch.load(pretrained_path, weights_only=False)
    del ckpt['pos_embed']
    logger.info("Loading pretrained weights for DINOv2")

PyTorch Documentation Reference

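The torch.load() documentation carries a security warning to this effect: unless weights_only is set to True, the function uses pickle implicitly, pickle data can be crafted to execute arbitrary code during unpickling, and data from untrusted sources should never be loaded in that mode.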

See: https://pytorch.org/docs/stable/generated/torch.load.html

Reproduction Steps

  1. Create a malicious checkpoint:
import torch
import os

class Exploit:
    def __reduce__(self):
        return (os.system, ('echo "Arbitrary code executed!"',))

malicious_checkpoint = {'exploit': Exploit()}
torch.save(malicious_checkpoint, 'malicious.pt')
  2. Load it using the vulnerable code:
ckpt = torch.load('malicious.pt')  # Code executes here!

Suggested Changes

File: lingbot_map/aggregator/base.py

- ckpt = torch.load(pretrained_path)
+ ckpt = torch.load(pretrained_path, weights_only=True)

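If backward compatibility with older checkpoints is required, a fallback is possible. Note that an automatic fallback like this one still executes a malicious pickle, because such a file fails the weights_only=True load and is then loaded with weights_only=False. Treat it as a migration aid for trusted files only: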

  try:
+     # Try loading with weights_only=True first (secure)
+     try:
+         ckpt = torch.load(pretrained_path, weights_only=True)
+     except Exception:
+         # Fall back to unsafe loading for older checkpoints
+         logger.warning(
+             f"Loading {pretrained_path} with weights_only=False. "
+             "Only use checkpoints from trusted sources!"
+         )
+         ckpt = torch.load(pretrained_path, weights_only=False)
-     ckpt = torch.load(pretrained_path)
      del ckpt['pos_embed']
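If a fallback is kept, one option is to make the unsafe path an explicit opt-in instead of an automatic retry. The following is a sketch under stated assumptions, not the project's code: `load_with_opt_in` and `allow_unsafe` are hypothetical names, the loader is passed in so the logic can be exercised without PyTorch, and it assumes the loader raises pickle.UnpicklingError when a safe load is refused (worth verifying against the installed PyTorch version).

```python
import logging
import pickle

logger = logging.getLogger(__name__)


def load_with_opt_in(path, load_fn, allow_unsafe=False):
    """Load a checkpoint safely; use the unsafe path only on explicit opt-in.

    load_fn is expected to accept (path, weights_only=...) like torch.load.
    """
    try:
        return load_fn(path, weights_only=True)
    except pickle.UnpicklingError:
        if not allow_unsafe:
            # Refuse by default: a rejected file may be malicious.
            raise
        logger.warning(
            "Loading %s with weights_only=False. "
            "Only use checkpoints from trusted sources!",
            path,
        )
        return load_fn(path, weights_only=False)


# Hypothetical call site in lingbot_map/aggregator/base.py:
#   ckpt = load_with_opt_in(pretrained_path, torch.load, allow_unsafe=False)
```

With this shape, an untrusted file that fails the safe load raises an error by default, and only a caller who deliberately passes allow_unsafe=True reaches the pickle-based path.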

Additional Recommendations

  1. Add security documentation in README.md warning users to only download checkpoints from official sources
  2. Implement checkpoint verification using checksums/signatures
  3. Update all torch.load() calls to use weights_only=True by default
  4. Add security policy (SECURITY.md) for reporting vulnerabilities
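For recommendation 2, a checksum check before loading could look roughly like the sketch below. `sha256_of_file` and `verify_checkpoint` are hypothetical helpers, and the expected digest is assumed to be published by the maintainers through a trusted channel. A matching digest only shows the file is the one that was published; it does not make pickle-based loading safe by itself.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_sha256):
    """Raise ValueError if the file's digest differs from the expected one."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.strip().lower()):
        raise ValueError(
            f"Checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )


# Hypothetical use before loading:
#   verify_checkpoint(pretrained_path, EXPECTED_SHA256)
#   ckpt = torch.load(pretrained_path, weights_only=True)
```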

Environment

  • PyTorch version: 2.0+
  • Python version: 3.10+
  • Affected versions: All current versions

Priority: High
Type: Security Vulnerability
Effort: Low (simple fix)
Impact: High (prevents arbitrary code execution)

I'm happy to submit a PR with the fix if needed.
