[REVIEW] model-supply-chain: add remote-code and final-artifact provenance gates #1593

@sato820

Description

Skill Being Reviewed

Skill name: model-supply-chain
Skill path: skills/ai-security/model-supply-chain/

False Positive Analysis

Benign code that triggers a false positive:

from huggingface_hub import snapshot_download

PINNED_REVISION = "0f3c9b2d1b7f4b1d7b0f4e8e9b8e3a3bb4f9c012"
model_dir = snapshot_download(
    repo_id="acme-internal-mirror/mistral-7b-instruct-v0.3",
    revision=PINNED_REVISION,
    allow_patterns=["*.safetensors", "config.json", "tokenizer.json"],
)
# CI verifies a signed SLSA/in-toto import attestation whose subject digest
# matches every artifact in model_dir and links back to the upstream revision.

Why this is a false positive:
The skill classifies any model from a non-original publisher as High. That is right for arbitrary mirrors, but it over-flags an internal promotion registry that pins an immutable upstream revision, verifies signed provenance and manifest digests, and restricts write access. The risk should depend on attestation strength, not only on original-publisher status.
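
One way to express "risk depends on attestation strength" is a small grading function. The sketch below is illustrative only: the `MirrorEvidence` record, its field names, and the `info`/`medium`/`high` labels are assumptions made for this example, not part of the skill.

```python
import re
from dataclasses import dataclass

# Hypothetical evidence record; field names are illustrative assumptions.
@dataclass(frozen=True)
class MirrorEvidence:
    revision: str            # revision string passed to the downloader
    signed_provenance: bool  # a signed attestation was verified in CI
    digests_verified: bool   # every artifact digest matches the attestation subject
    restricted_write: bool   # only the promotion workflow can write to the mirror

# A full 40-character commit SHA is treated as an immutable pin.
_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")

def mirror_severity(evidence: MirrorEvidence) -> str:
    """Grade a non-original publisher by the strength of its evidence."""
    pinned = bool(_COMMIT_SHA.match(evidence.revision))
    if (pinned and evidence.signed_provenance
            and evidence.digests_verified and evidence.restricted_write):
        return "info"    # signed internal promotion path
    if pinned and evidence.digests_verified:
        return "medium"  # pinned and hashed, but nothing is signed
    return "high"        # arbitrary mirror or mutable revision
```

Under these assumptions the benign example above would grade as `info`, while the same mirror loaded with `revision="main"` would stay at `high`.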

Coverage Gaps

Missed variant 1:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "research-lab/custom-architecture-llm",
    revision="main",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "research-lab/custom-architecture-llm",
    revision="main",
    trust_remote_code=True,
)

Why it should be caught:
The skill searches for from_pretrained and unpinned revisions, but it does not explicitly classify trust_remote_code=True as a critical/high condition. In Transformers this can execute repository-provided Python modeling/tokenizer code during load, so it belongs next to pickle.load and unsafe torch.load.
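
A grep for `trust_remote_code` can be backed by a syntax-aware check. The following is a minimal sketch using only the standard-library `ast` module; the function name `find_remote_code_calls` is hypothetical. It only sees the literal `True` passed as a keyword, so values passed through variables or `**kwargs` would still need manual review.

```python
import ast

def find_remote_code_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, callee) for from_pretrained calls with trust_remote_code=True."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only consider attribute calls such as AutoTokenizer.from_pretrained(...)
        if not (isinstance(func, ast.Attribute) and func.attr == "from_pretrained"):
            continue
        for kw in node.keywords:
            if (kw.arg == "trust_remote_code"
                    and isinstance(kw.value, ast.Constant)
                    and kw.value.value is True):
                findings.append((node.lineno, ast.unparse(func)))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(findings)
```

Run against the snippet in this variant, it would report both the model and the tokenizer load.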

Missed variant 2:

# Dockerfile
FROM ollama/ollama:latest
RUN ollama pull hf.co/unverified-org/customer-support-model:Q4_K_M

# Modelfile
FROM hf.co/unverified-org/customer-support-model:Q4_K_M
TEMPLATE "{{ .System }}\n{{ .Prompt }}"

Why it should be caught:
The skill globs for *.gguf and mentions adapters, but it does not require provenance for the final served artifact: quantized GGUF/GGML files, Ollama Modelfiles, LoRA/PEFT adapters, tokenizer assets, and chat templates. A verified base model can still be subverted by an unverified quantization or adapter merge.
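
As a rough illustration of how the served-artifact references could be collected, the sketch below extracts `hf.co/...` model references from Dockerfile or Modelfile text and reports those missing from a set of references with verified provenance. The function name and the `attested_refs` set are assumptions for this example; how that set gets populated (for instance from verified attestations) is out of scope here.

```python
import re

# Matches "ollama pull hf.co/<ref>" (optionally behind RUN) and "FROM hf.co/<ref>".
_MODEL_REF = re.compile(
    r"^\s*(?:RUN\s+)?(?:ollama\s+pull|FROM)\s+(hf\.co/\S+)",
    re.IGNORECASE | re.MULTILINE,
)

def unattested_model_refs(text: str, attested_refs: set[str]) -> list[str]:
    """List unique hf.co model references in text that lack verified provenance."""
    found = {match.group(1) for match in _MODEL_REF.finditer(text)}
    return sorted(found - attested_refs)
```

The base image line `FROM ollama/ollama:latest` is deliberately not matched, since it names a container image rather than a model.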

Edge Cases

Most production deployments serve a bundle, not one upstream file: base weights plus adapter, tokenizer, chat template, quantization metadata, and runtime config. The review should verify the final artifact manifest and the signed promotion path. It should also distinguish three levels of evidence: a digest stored in the same untrusted repo, an unsigned internal manifest, and a signed attestation from a trusted builder.
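
A final-artifact manifest check could look roughly like the sketch below. It assumes a hypothetical manifest format, a mapping from relative file path to SHA-256 hex digest, and it covers only the digest comparison. Verifying the signature on the manifest itself (for example with Sigstore or in-toto tooling) is a separate step that this example does not attempt.

```python
import hashlib
from pathlib import Path

def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large weight files are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_bundle(bundle_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Compare every file in the served bundle against the manifest digests."""
    problems = []
    on_disk = {
        p.relative_to(bundle_dir).as_posix(): p
        for p in bundle_dir.rglob("*") if p.is_file()
    }
    # Files present in the bundle but absent from the manifest, e.g. a stray adapter.
    for name in sorted(on_disk.keys() - manifest.keys()):
        problems.append(f"unlisted file: {name}")
    for name, expected in sorted(manifest.items()):
        path = on_disk.get(name)
        if path is None:
            problems.append(f"missing file: {name}")
        elif sha256_file(path) != expected:
            problems.append(f"digest mismatch: {name}")
    return problems
```

Treating unlisted files as failures is a deliberate choice in this sketch: an adapter, tokenizer file, or chat template dropped into the bundle after promotion would then fail the gate even when every listed digest still matches.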

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add explicit gates for trust_remote_code=True, custom remote model/tokenizer classes, final-artifact manifests, and signed internal mirror attestations.

Suggested detection additions:

Grep: "trust_remote_code\s*=\s*True|AutoConfig\.from_pretrained|AutoTokenizer\.from_pretrained" in **/*.py
Grep: "PeftModel\.from_pretrained|LoraConfig|adapter_model|merge_and_unload" in **/*.py
Grep: "ollama pull|FROM hf\.co|Modelfile|gguf|ggml" in **/{Dockerfile,Modelfile}* and **/*.{Dockerfile,dockerfile,txt,md,yaml,yml}
Grep: "chat_template|tokenizer_config|special_tokens_map" in **/*.{json,jinja,txt}

Comparison to Other Tools

| Tool | Catches this? | Notes |
| --- | --- | --- |
| Semgrep | Partial | Can catch trust_remote_code=True, pickle, and unsafe torch.load with custom rules; weak on model/adapter/quantization lineage. |
| CodeQL | Partial | Good for Python deserialization/command execution flows; weak for Hugging Face/Ollama provenance without custom queries. |
| SLSA / in-toto / Sigstore | Partial | Strong for proving provenance when integrated, but they do not decide whether the final served bundle includes unverified adapters/tokenizers/templates. |

Overall Assessment

Strengths: Strong provenance, data-lineage, unsafe-deserialization, SLSA, PoisonGPT, and ShadowRay coverage.

Needs improvement: Treat the final composed inference bundle as the security boundary, and avoid treating signed internal mirrors like arbitrary third-party model repos.

Priority recommendations:

  1. Add a critical/high finding for trust_remote_code=True.
  2. Require manifests covering weights, adapters, tokenizers, chat templates, GGUF/GGML quantizations, and Modelfiles.
  3. Accept internal mirrors only when signed provenance links the mirror artifact to an immutable upstream revision and approved promotion workflow.

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: Crypto after maintainer acceptance
