Skip to content

feat: lockfile-driven reproducible installs for Artifactory proxies#401

Open
chkp-roniz wants to merge 2 commits intomicrosoft:mainfrom
chkp-roniz:feature/artifactory-enhancements
Open

feat: lockfile-driven reproducible installs for Artifactory proxies#401
chkp-roniz wants to merge 2 commits intomicrosoft:mainfrom
chkp-roniz:feature/artifactory-enhancements

Conversation

@chkp-roniz
Copy link
Contributor

Summary

This PR enhances the Artifactory VCS support added in #354 to make the lockfile the single source of truth for package provenance — ensuring reproducible, auditable installs in enterprise and air-gapped environments.

Why This Matters

The Lockfile Integrity Problem

PR #354 introduced JFrog Artifactory as a first-class package source. However, when a package was installed through an Artifactory proxy, the lockfile recorded host: github.com — the original host, not the actual download source. This created several problems:

  1. Broken reproducibility: A developer installs via Artifactory with ARTIFACTORY_BASE_URL. They commit the lockfile. A colleague runs apm install — but the lockfile says github.com, so APM tries to fetch directly from GitHub instead of Artifactory. In an air-gapped network, this fails silently or unexpectedly.

  2. Supply chain opacity: The lockfile couldn't answer "where did this package actually come from?" — a critical audit question in regulated environments. A package fetched through a corporate-approved proxy was indistinguishable from one fetched directly from the internet.

  3. Air-gap leaks: With ARTIFACTORY_ONLY=1, virtual subdirectory packages (e.g., github/awesome-copilot/skills/review-and-refactor) bypassed the enforcement check and fell through to direct git clone — breaking the air-gap guarantee.

  4. Stale lockfile ambiguity: If a team transitions from direct GitHub access to Artifactory-only, there was no detection of the mismatch between the lockfile (locked to github.com) and the new policy (ARTIFACTORY_ONLY=1). Installs would fail with confusing downloader errors instead of a clear remediation path.

The Principle

The lockfile must be self-contained. It should capture everything needed to reproduce the exact same install — including where each package was fetched from. No environment variables should be required for a lockfile-driven reinstall. This is the same principle that makes package-lock.json, Cargo.lock, and poetry.lock reliable in their ecosystems.

Changes

1. Lockfile records actual download host (lockfile.py, install.py)

When a package is installed through an Artifactory proxy, the lockfile now stores the full proxy path in the host field:

# Before (PR #354):
- repo_url: anthropics/skills
  host: github.com           # ← wrong: actual source was Artifactory

# After:
- repo_url: anthropics/skills
  host: artifactory.example.com/artifactory/apm   # ← actual source

The host field stores hostname/repo-path so that {host}/{repo_url} reconstructs the full download URL. The repo_url remains unchanged (owner/repo) for consistent identity and key matching.

2. Lockfile host drives re-installs (drift.py)

build_download_ref() now prefers the lockfile's host over the manifest's dep_ref.host. This means:

  • With lockfile: Fetches from the recorded host (Artifactory, GHE, etc.) — no env vars needed
  • Without lockfile (first install): Falls back to dep_ref.host + env var routing (existing behavior)
  • With --update: Ignores lockfile, re-resolves from manifest (existing behavior)

This also applies to the transitive dependency download callback in install.py.

3. ARTIFACTORY_ONLY conflict detection (install.py)

When ARTIFACTORY_ONLY=1 is set but the lockfile contains dependencies locked to github.com, APM now exits with a clear error:

ARTIFACTORY_ONLY is set but lockfile contains dependencies locked to direct sources:
  - anthropics/skills/skills/skill-creator (host: github.com)
  - org/some-package (host: github.com)
Re-run with 'apm install --update' to re-resolve through Artifactory, or unset ARTIFACTORY_ONLY.

Additionally, cached packages with github.com in the lockfile are not silently reused when ARTIFACTORY_ONLY is active — they are forced through the download path.

4. ARTIFACTORY_ONLY enforcement for virtual packages (github_downloader.py)

Closes a gap in #354 where virtual file, collection, and subdirectory packages bypassed the ARTIFACTORY_ONLY check:

  • is_virtual_file() — now blocked when ARTIFACTORY_ONLY is set without a proxy
  • is_virtual_collection() — same
  • is_virtual_subdirectory() — already had partial handling, now also blocks when proxy is unavailable

5. Public get_resolved_host() API (github_downloader.py)

New public method on GitHubPackageDownloader that returns the actual download host for a dependency (e.g., the Artifactory proxy path). This replaces direct access to private _parse_artifactory_base_url() / _should_use_artifactory_proxy() from install.py.

Files Changed

File Lines What
src/apm_cli/commands/install.py +79/-6 Lockfile host in tuples, conflict check, cache enforcement, display
src/apm_cli/deps/github_downloader.py +34 get_resolved_host() API, virtual package ARTIFACTORY_ONLY enforcement
src/apm_cli/deps/lockfile.py +12/-7 host_override param in from_dependency_ref and from_installed_packages
src/apm_cli/drift.py +12/-7 build_download_ref prefers lockfile host, works without resolved_commit
tests/unit/test_artifactory_support.py +193 11 new tests across 3 test classes
CHANGELOG.md +11 Unreleased entries

Test Plan

  • All 2131 unit tests pass (3 pre-existing Windows symlink failures unrelated to this PR)
  • 11 new tests covering lockfile host override, build_download_ref lockfile preference, conflict detection, backward compatibility
  • E2E: Install via ARTIFACTORY_BASE_URL → lockfile records Artifactory host
  • E2E: Delete apm_modules, run apm install without env vars → fetches from Artifactory using lockfile host
  • E2E: Lockfile with github.com + ARTIFACTORY_ONLY=1 → clear error with remediation
  • E2E: ARTIFACTORY_ONLY=1 without ARTIFACTORY_BASE_URL → virtual subdirectory packages blocked (not silently cloned)
  • E2E: Non-Artifactory installs unaffected — lockfile records github.com, no regressions
  • Backward compatible: 4-element installed_packages tuples still work, older APM can read new lockfiles

Backward Compatibility

  • Lockfile format: The host field can now contain hostname/path values (e.g., art.example.com/artifactory/apm). Older APM versions read this as a plain string — LockedDependency.from_dict() stores whatever value is there. The field is not used for path computation, so older versions are unaffected.
  • Tuple protocol: from_installed_packages() accepts both 4-element and 5-element tuples via entry[:4] / len(entry) > 4 pattern.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 21, 2026 15:41
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes installs reproducible in Artifactory-proxied / air-gapped environments by treating the lockfile as the source of truth for dependency provenance (host), and tightening ARTIFACTORY_ONLY enforcement.

Changes:

  • Record the actual resolved download host in the lockfile (including Artifactory proxy path) and prefer that host during re-installs.
  • Add ARTIFACTORY_ONLY lockfile conflict detection and prevent “direct source” cached reuse under ARTIFACTORY_ONLY.
  • Close ARTIFACTORY_ONLY enforcement gaps for virtual packages; add unit tests and changelog entries.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/apm_cli/commands/install.py Uses lockfile host for transitive downloads, adds ARTIFACTORY_ONLY conflict detection + cache enforcement, improves output.
src/apm_cli/deps/github_downloader.py Adds get_resolved_host() and enforces ARTIFACTORY_ONLY for virtual file/collection/subdir packages.
src/apm_cli/deps/lockfile.py Adds host_override plumbing so lockfile can store resolved hosts; backward-compatible tuple parsing.
src/apm_cli/drift.py Makes build_download_ref() prefer lockfile host when rebuilding download refs.
tests/unit/test_artifactory_support.py Adds unit tests for lockfile host override, build_download_ref host preference, and conflict detection logic.
CHANGELOG.md Adds Unreleased entries describing the new behavior.

…ackages

The lockfile now records the actual download host (including Artifactory
proxy path) so that subsequent installs fetch from the exact same source
without requiring ARTIFACTORY_BASE_URL to be set. This makes the lockfile
the single source of truth for package provenance.

Key changes:
- Lockfile host field stores the resolved proxy host+path (e.g.
  art.example.com/artifactory/apm) instead of the original github.com
- build_download_ref prefers lockfile host over manifest host
- ARTIFACTORY_ONLY conflict detection: hard error when lockfile has
  github.com deps but ARTIFACTORY_ONLY=1 is set
- ARTIFACTORY_ONLY enforcement for all virtual package types (files,
  collections, subdirectories) — closes a gap where subdirectory
  packages bypassed the check and fell through to direct git clone
- Public get_resolved_host() API on GitHubPackageDownloader

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chkp-roniz chkp-roniz force-pushed the feature/artifactory-enhancements branch from e1e669b to f705aaf Compare March 21, 2026 16:01
- build_download_ref preserves locked_dep.resolved_ref when no commit SHA
  is available (Artifactory downloads)
- Add tests for no-commit + pinned ref path and host-only override
- Update authentication docs with lockfile reproducibility section
- CHANGELOG entries include PR number (microsoft#401)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants