From 3b391ff817fd999cf0d16101f1e4c58ec2266805 Mon Sep 17 00:00:00 2001 From: Daniel Meppiel Date: Sun, 10 May 2026 18:31:17 +0200 Subject: [PATCH 1/5] refactor(tests): retire script enumeration; pytest discovers tests/integration/ PR2 of #1166. Per acceptance criteria: - scripts/test-integration.sh no longer enumerates individual pytest files; it invokes pytest tests/integration/ once and lets the marker registry (PR1, #1167) handle per-test gating. Script shrinks 778 -> 402 lines. - New test files dropped into tests/integration/ are picked up automatically; the only contract is to add the right requires_* marker (or hermetic = no marker). - Adds requires_apm_binary / requires_github_token / requires_runtime_copilot pytestmarks to 11 files that needed them now that pytest discovers the full directory (test-coverage-expert audit). - Adds 'live' to the addopts deselect list so test_skill_bundle_live.py preserves its current 'opt-in only' behaviour. - Adds APM_RUN_INTEGRATION_TESTS=1 to ci-integration.yml so network-integration tests (transport selection) actually run in CI. - CI integration-tests job timeout 20 -> 30 min to absorb the newly-discovered tests. - Docs (integration-testing.md) reframe the script as the CI orchestrator, not legacy. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/ci-integration.yml | 3 +- CHANGELOG.md | 1 + .../docs/contributing/integration-testing.md | 23 +- pyproject.toml | 2 +- scripts/test-integration.sh | 474 ++---------------- tests/integration/test_apm_dependencies.py | 2 + tests/integration/test_global_scope_e2e.py | 2 + .../test_install_subdir_dedup_e2e.py | 2 + .../test_install_verbose_redaction_e2e.py | 2 + .../integration/test_intra_package_cleanup.py | 2 + tests/integration/test_link_rewrite_e2e.py | 2 + tests/integration/test_local_content_audit.py | 2 + tests/integration/test_local_install.py | 2 + tests/integration/test_marketplace_e2e.py | 2 + .../test_mcp_env_var_copilot_e2e.py | 5 + tests/integration/test_skill_integration.py | 7 +- .../integration/test_uninstall_dry_run_e2e.py | 5 +- tests/integration/test_uninstall_multi_e2e.py | 5 +- 18 files changed, 104 insertions(+), 439 deletions(-) diff --git a/.github/workflows/ci-integration.yml b/.github/workflows/ci-integration.yml index 78fc55a64..7f0673630 100644 --- a/.github/workflows/ci-integration.yml +++ b/.github/workflows/ci-integration.yml @@ -148,12 +148,13 @@ jobs: - name: Run integration tests env: APM_E2E_TESTS: "1" + APM_RUN_INTEGRATION_TESTS: "1" GITHUB_APM_PAT: ${{ secrets.GH_CLI_PAT }} ADO_APM_PAT: ${{ secrets.ADO_APM_PAT }} run: | chmod +x scripts/test-integration.sh uv run ./scripts/test-integration.sh - timeout-minutes: 20 + timeout-minutes: 30 release-validation: name: Release Validation (Linux) diff --git a/CHANGELOG.md b/CHANGELOG.md index c14c14ced..78b986773 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,6 +21,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Registry proxy now warns when `PROXY_REGISTRY_TOKEN` is set and `PROXY_REGISTRY_URL` uses `http://`, since the bearer token would be transmitted in plaintext; set `PROXY_REGISTRY_ALLOW_HTTP=1` to silence the warning for trusted internal proxies. (#1149) - Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167) - Integration test apm-binary resolution now prefers the local build (`./dist/apm--/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167) +- `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#PR_NUMBER) ### Fixed diff --git a/docs/src/content/docs/contributing/integration-testing.md b/docs/src/content/docs/contributing/integration-testing.md index c58d93f0b..284eab1e9 100644 --- a/docs/src/content/docs/contributing/integration-testing.md +++ b/docs/src/content/docs/contributing/integration-testing.md @@ -98,14 +98,21 @@ system install: skip reason) and declare the marker in `pyproject.toml`. That is the only place the precondition needs to live. -### Legacy: `scripts/test-integration.sh` - -`scripts/test-integration.sh` is the legacy wrapper that built a -binary, set up runtimes, and shelled out to pytest. It is being -retired (see `microsoft/apm#1166`); prefer the direct `pytest` -invocations above. The script is still wired into CI for the moment -and continues to work, but new test plumbing belongs in the marker -registry, not in the bash script. +### CI orchestrator: `scripts/test-integration.sh` + +`scripts/test-integration.sh` is the thin orchestrator the CI +integration job invokes. Its sole responsibilities are: resolve +GitHub / ADO tokens, detect platform, locate or build the apm +PyInstaller binary, install runtimes (codex / copilot / llm), +install python test dependencies, and run +`pytest tests/integration/` once. All per-test gating lives in the +marker registry described above; the script no longer enumerates +individual test files. New integration tests dropped into +`tests/integration/` are picked up automatically. + +For local iteration prefer the direct `pytest` invocations earlier +on this page; the orchestrator script is mainly intended for +reproducing the full CI environment end-to-end. ## CI/CD Integration diff --git a/pyproject.toml b/pyproject.toml index b84885fc5..01808a14b 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -128,7 +128,7 @@ warn_return_any = true warn_unused_configs = true [tool.pytest.ini_options] -addopts = "-m 'not benchmark'" +addopts = "-m 'not benchmark and not live'" markers = [ "integration: marks tests as integration tests that may require network access", "live: marks tests that hit real GitHub repos (requires network + optional GITHUB_TOKEN)", diff --git a/scripts/test-integration.sh b/scripts/test-integration.sh index 84babe7fd..23fba0df7 100755 --- a/scripts/test-integration.sh +++ b/scripts/test-integration.sh @@ -1,24 +1,32 @@ #!/bin/bash -# Integration testing script for both CI and local environments +# Integration testing orchestrator for both CI and local environments. # -# DEPRECATED (PR2 of #1166): This script is being retired in favour of -# direct ``pytest tests/integration/`` invocations gated by the -# marker-driven discovery system introduced in PR1 -# (microsoft/apm#1167). Once PR2 lands the canonical commands will be -# documented in CONTRIBUTING.md and -# docs/src/content/docs/contributing/integration-testing.md. Avoid -# adding new logic here -- new test plumbing belongs in -# ``tests/integration/conftest.py`` ``_MARKER_CHECKS``. +# This script is intentionally a thin wrapper. Per-test gating +# (tokens, runtimes, binary, network) lives in +# tests/integration/conftest.py via the marker registry shipped in +# PR1 of #1166 (microsoft/apm#1167). PR2 of #1166 retired the +# per-file pytest enumeration that previously lived here in favour of +# a single ``pytest tests/integration/`` invocation. New integration +# test files dropped into tests/integration/ are picked up +# automatically; add the right ``requires_*`` marker (see +# pyproject.toml [tool.pytest.ini_options].markers) and the registry +# will skip the test when its precondition is missing. # -# Tests comprehensive runtime scenarios and edge cases: -# - Both Codex AND LLM runtime setup and interoperability -# - Complex pytest-based scenarios with error handling -# - Template bundling verification -# - Authentication matrix testing +# This script's responsibilities are now narrow: +# - resolve GitHub / ADO tokens (via scripts/github-token-helper.sh) +# - detect platform and execution environment (CI vs local) +# - locate or build the apm PyInstaller binary +# - install runtimes the binary needs (codex / copilot / llm) +# - install python test deps (uv preferred) +# - invoke pytest tests/integration/ exactly once # -# - CI mode: Uses pre-built artifacts from build job, runs integration tests -# - Local mode: Builds binary, runs comprehensive integration tests -# This ensures robust implementation testing before release validation +# To run a focused subset locally, invoke pytest directly: +# APM_E2E_TESTS=1 pytest tests/integration/test_X.py -v +# (the marker registry will still auto-skip preconditions that the +# local env doesn't satisfy) +# +# - CI mode: Uses pre-built artifacts from build job. +# - Local mode: Builds the binary up-front. set -euo pipefail @@ -297,27 +305,29 @@ install_test_dependencies() { log_success "Test dependencies installed" } -# Run integration tests (exactly like CI does) +# Run integration tests via marker-driven discovery (issue #1166). +# +# All per-test gating (tokens, runtimes, binary, network) lives in +# tests/integration/conftest.py via the _MARKER_CHECKS registry shipped +# in PR1 (#1167). This function is intentionally a thin wrapper: pytest +# discovers test files, the marker registry skips what the env can't +# satisfy, and one exit code reports the result. run_e2e_tests() { - log_info "=== Running integration tests (mirroring CI) ===" - log_info "Testing comprehensive runtime scenarios:" - log_info " - Zero-config auto-install (NEW HERO SCENARIO 1)" - log_info " - 2-minute guardrailing (NEW HERO SCENARIO 2)" - log_info " - MCP registry integration" - log_info " - APM Dependencies with real repositories" - log_info " - Environment variable handling" - log_info " - Docker args processing" - - # Set environment variables (like CI does) + log_info "=== Running integration tests (pytest tests/integration/) ===" + + # Set environment variables (mirrors what CI does) export APM_E2E_TESTS="1" - - # Only export GITHUB_TOKEN if it's set (avoid unbound variable error) if [[ -n "${GITHUB_TOKEN:-}" ]]; then export GITHUB_TOKEN="$GITHUB_TOKEN" fi - + log_info "Environment:" echo " APM_E2E_TESTS: $APM_E2E_TESTS" + if [[ -n "${APM_RUN_INTEGRATION_TESTS:-}" ]]; then + echo " APM_RUN_INTEGRATION_TESTS: $APM_RUN_INTEGRATION_TESTS" + else + echo " APM_RUN_INTEGRATION_TESTS: (not set; network-integration tests will be skipped)" + fi if [[ -n "${GITHUB_TOKEN:-}" ]]; then echo " GITHUB_TOKEN: (set)" else @@ -335,383 +345,20 @@ run_e2e_tests() { fi echo " PATH contains: $(dirname "$(which apm)")" echo " APM binary: $(which apm)" - + echo " APM_BINARY_PATH: ${APM_BINARY_PATH:-(unset)}" + # Activate virtual environment if it exists if [[ -f ".venv/bin/activate" ]]; then source .venv/bin/activate fi - - # Run NEW hero scenario test (zero-config auto-install) - log_info "Running NEW HERO SCENARIO 1: Zero-config auto-install test..." - echo "Command: pytest tests/integration/test_auto_install_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_auto_install_e2e.py -v -s --tb=short; then - log_success "Zero-config auto-install tests passed!" - else - log_error "Zero-config auto-install tests failed!" - exit 1 - fi - - # Run NEW hero scenario test (2-minute guardrailing) - log_info "Running NEW HERO SCENARIO 2: 2-minute guardrailing test..." - echo "Command: pytest tests/integration/test_guardrailing_hero_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_guardrailing_hero_e2e.py -v -s --tb=short; then - log_success "2-minute guardrailing tests passed!" - else - log_error "2-minute guardrailing tests failed!" - exit 1 - fi - - # NOTE: Legacy golden scenario tests removed - replaced by faster auto-install tests above - # The auto-install tests cover the same hero scenario but with early termination for speed - - # Run MCP registry E2E tests (new - covers our implemented functionality) - log_info "Running MCP registry E2E tests..." - echo "Command: pytest tests/integration/test_mcp_registry_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_mcp_registry_e2e.py -v -s --tb=short; then - log_success "MCP registry tests passed!" - else - log_error "MCP registry tests failed!" - exit 1 - fi - - # Run MCP env-var headers E2E tests (regression guard for ${VAR} -> ${env:VAR}) - log_info "Running MCP env-var headers E2E tests..." - echo "Command: pytest tests/integration/test_mcp_env_var_headers_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_mcp_env_var_headers_e2e.py -v -s --tb=short; then - log_success "MCP env-var headers tests passed!" - else - log_error "MCP env-var headers tests failed!" - exit 1 - fi - - # #1212 anti-regression: ADO --update preflight must fall back from a - # stale ADO_APM_PAT to an az-cli AAD bearer when az is logged in. Uses - # PATH-injected fake `git` and `az` binaries so the test is hermetic. - log_info "Running #1212 ADO preflight bearer-fallback E2E..." - echo "Command: pytest tests/integration/test_ado_preflight_bearer_fallback_e2e.py -v --tb=short" - - if pytest tests/integration/test_ado_preflight_bearer_fallback_e2e.py -v --tb=short; then - log_success "#1212 ADO preflight bearer-fallback tests passed!" - else - log_error "#1212 ADO preflight bearer-fallback tests failed!" - exit 1 - fi - - # Run APM Dependencies integration tests (NEW - Task 8A) - log_info "Running APM Dependencies integration tests with real repositories..." - echo "Command: pytest tests/integration/test_apm_dependencies.py -v -s --tb=short -m integration" - - if pytest tests/integration/test_apm_dependencies.py -v -s --tb=short -m integration; then - log_success "APM Dependencies integration tests passed!" - else - log_error "APM Dependencies integration tests failed!" - exit 1 - fi - - # Subdirectory dedup race E2E (#1126): two sibling subdirs of the - # same upstream repo+ref must install in parallel without the - # "Subdirectory ... not found" race the v1 cache produced. - log_info "Running #1126 parallel subdir dedup E2E..." - echo "Command: pytest tests/integration/test_install_subdir_dedup_e2e.py -v -s --tb=short -m integration" - - if pytest tests/integration/test_install_subdir_dedup_e2e.py -v -s --tb=short -m integration; then - log_success "#1126 subdir dedup E2E passed!" - else - log_error "#1126 subdir dedup E2E failed!" - exit 1 - fi - - # Branch-ref drift + lockfile self-heal regression E2E (#1158). - # Defends the heal pipeline (BranchRefDriftHeal, - # BuggyLockfileRecoveryHeal) and the supply-chain interlock against - # the 3-way drift bug. Uses the public danielmeppiel/apm-update-repro - # fixture with mutable refs. - log_info "Running #1158 branch-ref drift + heal pipeline E2E..." - echo "Command: pytest tests/integration/test_diff_aware_install_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_diff_aware_install_e2e.py -v -s --tb=short; then - log_success "#1158 branch-ref drift + heal pipeline E2E passed!" - else - log_error "#1158 branch-ref drift + heal pipeline E2E failed!" - exit 1 - fi - - # Target resolution overhaul E2E (#1154 + 10 sister issues). - # Offline tests: exercises detection whitelist, resolution priority, - # provenance line, error renderer, dry-run, apm targets command. - # NO GitHub token required (uses local bundles). - log_info "Running #1154 target resolution E2E..." - echo "Command: pytest tests/integration/test_target_resolution_e2e.py -v -s --tb=short -m integration" - - if pytest tests/integration/test_target_resolution_e2e.py -v -s --tb=short -m integration; then - log_success "#1154 target resolution E2E passed!" - else - log_error "#1154 target resolution E2E failed!" - exit 1 - fi - - # apm deps update CLI E2E -- defends the explicit update workflow - # (lockfile bump across all packages, selective package update, - # global-scope update, unknown-package error). - log_info "Running apm deps update CLI E2E..." - echo "Command: pytest tests/integration/test_deps_update_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_deps_update_e2e.py -v -s --tb=short; then - log_success "apm deps update CLI E2E passed!" - else - log_error "apm deps update CLI E2E failed!" - exit 1 - fi - - # Run Transport Selection integration tests (issue #778) - # Always-on cases use HTTPS against a public repo. SSH cases auto-skip - # when no usable SSH key is available for git@github.com. - log_info "Running Transport Selection integration tests..." - echo "Command: APM_RUN_INTEGRATION_TESTS=1 pytest tests/integration/test_transport_selection_integration.py -v -s --tb=short" - - if APM_RUN_INTEGRATION_TESTS=1 pytest tests/integration/test_transport_selection_integration.py -v -s --tb=short; then - log_success "Transport Selection integration tests passed!" - else - log_error "Transport Selection integration tests failed!" - exit 1 - fi - - # Run global-scope (--global / -g) E2E tests -- offline, no tokens needed - log_info "Running global-scope E2E tests..." - echo "Command: pytest tests/integration/test_global_scope_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_global_scope_e2e.py -v -s --tb=short; then - log_success "Global-scope E2E tests passed!" - else - log_error "Global-scope E2E tests failed!" - exit 1 - fi - - # Run Claude Code MCP schema-fidelity tests -- offline, golden fixtures - # captured from the upstream `claude` CLI (see fixtures/README.md) - log_info "Running Claude Code MCP schema-fidelity tests..." - echo "Command: pytest tests/integration/test_claude_mcp_schema_fidelity.py -v -s --tb=short" - - if pytest tests/integration/test_claude_mcp_schema_fidelity.py -v -s --tb=short; then - log_success "Claude Code MCP schema-fidelity tests passed!" - else - log_error "Claude Code MCP schema-fidelity tests failed!" - exit 1 - fi - - # Run local-bundle install E2E tests -- offline, no tokens needed - log_info "Running local-bundle install E2E tests..." - echo "Command: pytest tests/integration/test_install_local_bundle_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_install_local_bundle_e2e.py -v -s --tb=short; then - log_success "Local-bundle install E2E tests passed!" - else - log_error "Local-bundle install E2E tests failed!" - exit 1 - fi - - # Run cache lockfile-parity test (requires GITHUB_APM_PAT or GITHUB_TOKEN). - # Asserts byte-identical apm.lock.yaml across cold / warm / no-cache - # regimes -- the worst silent regression the cache layer could introduce. - log_info "Running cache lockfile-parity E2E test..." - echo "Command: pytest tests/integration/test_cache_lockfile_parity.py -v -s --tb=short" - - if pytest tests/integration/test_cache_lockfile_parity.py -v -s --tb=short; then - log_success "Cache lockfile-parity E2E test passed!" - else - log_error "Cache lockfile-parity E2E test failed!" - exit 1 - fi - - # Run Azure DevOps E2E tests (requires ADO_APM_PAT) - if [[ -n "${ADO_APM_PAT:-}" ]]; then - log_info "Running Azure DevOps E2E tests..." - echo "Command: pytest tests/integration/test_ado_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_ado_e2e.py -v -s --tb=short; then - log_success "Azure DevOps E2E tests passed!" - else - log_error "Azure DevOps E2E tests failed!" - exit 1 - fi - else - log_info "Skipping Azure DevOps E2E tests (ADO_APM_PAT not set)" - fi - - # Run agent-skills target E2E tests -- offline, no tokens needed - log_info "Running agent-skills target E2E tests..." - echo "Command: pytest tests/integration/test_agent_skills_target.py -v -s --tb=short" - - if pytest tests/integration/test_agent_skills_target.py -v -s --tb=short; then - log_success "Agent-skills target E2E tests passed!" - else - log_error "Agent-skills target E2E tests failed!" - exit 1 - fi - - # Run skill install E2E tests -- requires GITHUB_APM_PAT (pytestmark skips otherwise). - # Guards skill install idempotency and .apm-pin no-leak invariant on reinstall. - log_info "Running skill install E2E tests..." - echo "Command: pytest tests/integration/test_skill_install.py -v -s --tb=short" - - if pytest tests/integration/test_skill_install.py -v -s --tb=short; then - log_success "Skill install E2E tests passed!" - else - log_error "Skill install E2E tests failed!" - exit 1 - fi - - # Run unified pack format E2E tests -- offline, no tokens needed - # Guards the 0.12.0 default flip from --format apm to --format plugin. - log_info "Running unified pack format E2E tests..." - echo "Command: pytest tests/integration/test_pack_unified.py -v -s --tb=short" - - if pytest tests/integration/test_pack_unified.py -v -s --tb=short; then - log_success "Unified pack format E2E tests passed!" - else - log_error "Unified pack format E2E tests failed!" - exit 1 - fi - - # Run Copilot compile target E2E tests -- offline, no tokens needed - # Guards .github/copilot-instructions.md generation + idempotent cleanup. - log_info "Running Copilot compile target E2E tests..." - echo "Command: pytest tests/integration/test_compile_copilot_root_instructions.py -v -s --tb=short" - - if pytest tests/integration/test_compile_copilot_root_instructions.py -v -s --tb=short; then - log_success "Copilot compile target E2E tests passed!" - else - log_error "Copilot compile target E2E tests failed!" - exit 1 - fi - - # Run transitive local-path chain E2E tests -- offline, no tokens needed - # Guards local_path anchoring across multi-level local dependency chains. - log_info "Running transitive local-path chain E2E tests..." - echo "Command: pytest tests/integration/test_transitive_chain_e2e.py -v -s --tb=short" - if pytest tests/integration/test_transitive_chain_e2e.py -v -s --tb=short; then - log_success "Transitive local-path chain E2E tests passed!" + log_info "Invoking pytest tests/integration/ (marker registry handles per-test gating)" + if pytest tests/integration/ -v --tb=short; then + log_success "Integration test suite passed (collected and ran via pytest discovery)" else - log_error "Transitive local-path chain E2E tests failed!" + log_error "Integration test suite reported failures" exit 1 fi - - # Run drift-detection integration tests -- offline, no tokens needed - # Guards `apm audit` drift replay (Phase D) across all 9 drift cases, - # multi-target, --no-drift opt-out, and false-positive guards - # (CRLF, BOM, Build ID line). Pinning these tests prevents silent - # regression of the drift contract. - log_info "Running drift detection integration tests..." - echo "Command: pytest tests/integration/test_drift_check.py -v -s --tb=short" - - if pytest tests/integration/test_drift_check.py -v -s --tb=short; then - log_success "Drift detection integration tests passed!" - else - log_error "Drift detection integration tests failed!" - exit 1 - fi - - # Run drift-detection E2E tests -- offline, no tokens needed - # Verifies the no-write contract, air-gap proof, performance smoke, - # and JSON/SARIF output shapes for the `apm audit` drift surface. - log_info "Running drift detection E2E tests..." - echo "Command: pytest tests/integration/test_drift_check_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_drift_check_e2e.py -v -s --tb=short; then - log_success "Drift detection E2E tests passed!" - else - log_error "Drift detection E2E tests failed!" - exit 1 - fi - - # Run #1147 in-package link rewrite E2E -- offline, no tokens needed - # Defends the install-time link rewriter against the .agents/.github - # split regression: instructions/prompts/skills with relative links - # to in-package siblings must resolve on disk after `apm install`. - # Covers happy path, mixed link types, path-traversal escape - # (security), in-bundle skill links, and multi-target installs. - log_info "Running #1147 in-package link rewrite E2E..." - echo "Command: pytest tests/integration/test_link_rewrite_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_link_rewrite_e2e.py -v -s --tb=short; then - log_success "#1147 in-package link rewrite E2E passed!" - else - log_error "#1147 in-package link rewrite E2E failed!" - exit 1 - fi - - # Run #1159 audit silent-skip E2E -- offline, no tokens needed - # Defends the audit --ci CI gate against silent fall-through when - # auto-discovery hits no_git_remote / absent / empty / disabled - # outcomes. Real `git init`, real CliRunner. Covers exit codes, - # stderr cleanliness for both JSON and SARIF formats, and the - # policy.fetch_failure_default=block enforcement contract. - log_info "Running #1159 audit silent-skip E2E..." - echo "Command: pytest tests/integration/test_audit_silent_skip_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_audit_silent_skip_e2e.py -v -s --tb=short; then - log_success "#1159 audit silent-skip E2E passed!" - else - log_error "#1159 audit silent-skip E2E failed!" - exit 1 - fi - - # Run #1159 install silent-skip parity E2E -- offline, no tokens - # Defends the install pipeline parity for #1159: real `git init` - # with no remote configured + project policy.fetch_failure_default=block - # must raise PolicyViolationError through the policy_gate phase. - # Mirrors the audit-side block contract on the install codepath. - log_info "Running #1159 install silent-skip parity E2E..." - echo "Command: pytest tests/integration/test_install_silent_skip_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_install_silent_skip_e2e.py -v -s --tb=short; then - log_success "#1159 install silent-skip parity E2E passed!" - else - log_error "#1159 install silent-skip parity E2E failed!" - exit 1 - fi - - # Run #1159 SCP/EMU + ADO v3 SSH URL parsing E2E -- offline - # Defends the shared SCP_LIKE_RE regex against regressions on its - # three consumers: cache.url_normalize, policy.discovery, and - # models.dependency.reference. Real `git init` + real `git remote - # add origin` for EMU (enterprise-user@), GHE custom hosts, and - # ADO v3 SSH (git@ssh.dev.azure.com:v3//...). Also exercises - # APMPackage.from_apm_yml on the same URL forms. - log_info "Running #1159 dep URL parsing E2E..." - echo "Command: pytest tests/integration/test_dep_url_parsing_e2e.py -v -s --tb=short" - - if pytest tests/integration/test_dep_url_parsing_e2e.py -v -s --tb=short; then - log_success "#1159 dep URL parsing E2E passed!" - else - log_error "#1159 dep URL parsing E2E failed!" - exit 1 - fi - - # Run #1149 GitLab install E2E -- offline (mocked HTTP, no network) - # Exercises GitHubPackageDownloader.download_package end-to-end against - # a host=gitlab.com virtual file dep, asserting GitLab REST v4 routing, - # PRIVATE-TOKEN header (sourced from GITLAB_APM_PAT), absence of an - # Authorization header (cross-host leakage trap), and the resulting - # LockedDependency entry preserving host=gitlab.com. - log_info "Running #1149 GitLab install E2E..." - echo "Command: pytest tests/integration/test_gitlab_install_e2e.py -v -s --tb=short -m integration" - - if pytest tests/integration/test_gitlab_install_e2e.py -v -s --tb=short -m integration; then - log_success "#1149 GitLab install E2E passed!" - else - log_error "#1149 GitLab install E2E failed!" - exit 1 - fi - - log_success "All integration test suites completed successfully!" - - } # Main execution @@ -735,34 +382,11 @@ main() { log_success "All integration tests completed successfully!" echo "" if [[ "$USE_EXISTING_BINARY" == "true" ]]; then - echo "✅ CI mode: Used pre-built artifacts and validated integration workflow" + echo "CI mode: Used pre-built artifacts and validated integration workflow" else - echo "✅ Local mode: Built binary and validated full integration process" + echo "Local mode: Built binary and validated full integration process" fi echo "" - echo "Integration validation complete - COMPREHENSIVE TESTING:" - echo " 1. Prerequisites (GITHUB_TOKEN) ✅" - echo "" - echo " HERO SCENARIO 1: 30-Second Zero-Config ✨" - echo " - Run virtual package directly ✅" - echo " - Auto-install on first run ✅" - echo " - Use cached package on second run ✅" - echo "" - echo " HERO SCENARIO 2: 2-Minute Guardrailing ✨" - echo " - Project initialization ✅" - echo " - Install multiple APM packages ✅" - echo " - Compile to AGENTS.md with combined guardrails ✅" - echo " - Run prompts from installed packages ✅" - echo "" - echo " 3. MCP registry search & show ✅" - echo " 4. Registry-based installation ✅" - echo " 5. APM Dependencies integration ✅" - echo " 6. Environment variable handling ✅" - echo " 7. Docker args with -e flags ✅" - echo " 8. Empty string & defaults logic ✅" - echo " 9. Cross-adapter consistency ✅" - echo " 10. Duplication prevention ✅" - echo "" log_success "Ready for release validation!" } diff --git a/tests/integration/test_apm_dependencies.py b/tests/integration/test_apm_dependencies.py index 63337da66..884d0656a 100644 --- a/tests/integration/test_apm_dependencies.py +++ b/tests/integration/test_apm_dependencies.py @@ -31,6 +31,8 @@ from apm_cli.deps.github_downloader import GitHubPackageDownloader from apm_cli.models.apm_package import APMPackage, DependencyReference +pytestmark = pytest.mark.requires_github_token + class TestAPMDependenciesIntegration: """Integration tests for APM Dependencies using real GitHub repositories.""" diff --git a/tests/integration/test_global_scope_e2e.py b/tests/integration/test_global_scope_e2e.py index 782f20b38..7d9998b7d 100644 --- a/tests/integration/test_global_scope_e2e.py +++ b/tests/integration/test_global_scope_e2e.py @@ -23,6 +23,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + # --------------------------------------------------------------------------- # Fixtures # --------------------------------------------------------------------------- diff --git a/tests/integration/test_install_subdir_dedup_e2e.py b/tests/integration/test_install_subdir_dedup_e2e.py index 2719e4499..0c6bce911 100644 --- a/tests/integration/test_install_subdir_dedup_e2e.py +++ b/tests/integration/test_install_subdir_dedup_e2e.py @@ -36,6 +36,8 @@ from apm_cli.deps.shared_clone_cache import SharedCloneCache from apm_cli.models.dependency.reference import DependencyReference +pytestmark = pytest.mark.requires_github_token + # Two sibling subdirs under the same upstream repo+ref. Both are # present on github/awesome-copilot at the time of writing; if either # is removed upstream, swap with another pair from diff --git a/tests/integration/test_install_verbose_redaction_e2e.py b/tests/integration/test_install_verbose_redaction_e2e.py index bd8c79e63..a2cbd1bb0 100644 --- a/tests/integration/test_install_verbose_redaction_e2e.py +++ b/tests/integration/test_install_verbose_redaction_e2e.py @@ -20,6 +20,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + CANARY = "github_pat_BOGUS_REDACTION_CANARY_DO_NOT_LEAK" CANARY_CORE = "BOGUS_REDACTION_CANARY_DO_NOT_LEAK" diff --git a/tests/integration/test_intra_package_cleanup.py b/tests/integration/test_intra_package_cleanup.py index 2cfbfdcd0..0d4ea1466 100644 --- a/tests/integration/test_intra_package_cleanup.py +++ b/tests/integration/test_intra_package_cleanup.py @@ -15,6 +15,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + @pytest.fixture def apm_command(): diff --git a/tests/integration/test_link_rewrite_e2e.py b/tests/integration/test_link_rewrite_e2e.py index 082b9c030..cb67741cd 100644 --- a/tests/integration/test_link_rewrite_e2e.py +++ b/tests/integration/test_link_rewrite_e2e.py @@ -39,6 +39,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + # --------------------------------------------------------------------------- # Fixtures # --------------------------------------------------------------------------- diff --git a/tests/integration/test_local_content_audit.py b/tests/integration/test_local_content_audit.py index 1a3d48776..4e818fdd8 100644 --- a/tests/integration/test_local_content_audit.py +++ b/tests/integration/test_local_content_audit.py @@ -23,6 +23,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + # --------------------------------------------------------------------------- # Fixtures # --------------------------------------------------------------------------- diff --git a/tests/integration/test_local_install.py b/tests/integration/test_local_install.py index 1b80aea3d..70c6d834a 100644 --- a/tests/integration/test_local_install.py +++ b/tests/integration/test_local_install.py @@ -14,6 +14,8 @@ import pytest import yaml +pytestmark = pytest.mark.requires_apm_binary + # --------------------------------------------------------------------------- # Fixtures # --------------------------------------------------------------------------- diff --git a/tests/integration/test_marketplace_e2e.py b/tests/integration/test_marketplace_e2e.py index ff53aa787..da9323d1a 100644 --- a/tests/integration/test_marketplace_e2e.py +++ b/tests/integration/test_marketplace_e2e.py @@ -26,6 +26,8 @@ import pytest +pytestmark = pytest.mark.requires_apm_binary + SAMPLE_MARKETPLACE_NAME = "test-mkt" diff --git a/tests/integration/test_mcp_env_var_copilot_e2e.py b/tests/integration/test_mcp_env_var_copilot_e2e.py index f21236ca7..eb46a9213 100644 --- a/tests/integration/test_mcp_env_var_copilot_e2e.py +++ b/tests/integration/test_mcp_env_var_copilot_e2e.py @@ -25,6 +25,11 @@ import pytest import yaml +pytestmark = [ + pytest.mark.requires_apm_binary, + pytest.mark.requires_runtime_copilot, +] + @pytest.fixture def apm_command(): diff --git a/tests/integration/test_skill_integration.py b/tests/integration/test_skill_integration.py index 63fb3a449..91b03e308 100644 --- a/tests/integration/test_skill_integration.py +++ b/tests/integration/test_skill_integration.py @@ -13,8 +13,11 @@ import pytest -# Skip all tests if GITHUB_APM_PAT is not set -pytestmark = pytest.mark.requires_github_token +# Skip all tests if GITHUB_APM_PAT is not set or apm binary missing +pytestmark = [ + pytest.mark.requires_github_token, + pytest.mark.requires_apm_binary, +] @pytest.fixture diff --git a/tests/integration/test_uninstall_dry_run_e2e.py b/tests/integration/test_uninstall_dry_run_e2e.py index c2e513700..ca8433919 100644 --- a/tests/integration/test_uninstall_dry_run_e2e.py +++ b/tests/integration/test_uninstall_dry_run_e2e.py @@ -14,7 +14,10 @@ import pytest import yaml -pytestmark = pytest.mark.requires_github_token +pytestmark = [ + pytest.mark.requires_github_token, + pytest.mark.requires_apm_binary, +] @pytest.fixture diff --git a/tests/integration/test_uninstall_multi_e2e.py b/tests/integration/test_uninstall_multi_e2e.py index bcd4f1a8d..7e8f12b4f 100644 --- a/tests/integration/test_uninstall_multi_e2e.py +++ b/tests/integration/test_uninstall_multi_e2e.py @@ -17,7 +17,10 @@ import pytest import yaml -pytestmark = pytest.mark.requires_github_token +pytestmark = [ + pytest.mark.requires_github_token, + pytest.mark.requires_apm_binary, +] PKG_A = "microsoft/apm-sample-package" From 2406b043edc92449fc4e752c410fdc47dcb23651 Mon Sep 17 00:00:00 2001 From: Daniel Meppiel Date: Sun, 10 May 2026 18:33:40 +0200 Subject: [PATCH 2/5] chore(changelog): wire real PR number for PR2 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 78b986773..57ab88b9b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,7 +21,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Registry proxy now warns when `PROXY_REGISTRY_TOKEN` is set and `PROXY_REGISTRY_URL` uses `http://`, since the bearer token would be transmitted in plaintext; set `PROXY_REGISTRY_ALLOW_HTTP=1` to silence the warning for trusted internal proxies. (#1149) - Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167) - Integration test apm-binary resolution now prefers the local build (`./dist/apm--/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167) -- `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#PR_NUMBER) +- `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#1247) ### Fixed From 3eea12426599248a793ce94d4d68297e9480d23e Mon Sep 17 00:00:00 2001 From: Daniel Meppiel Date: Sun, 10 May 2026 18:42:41 +0200 Subject: [PATCH 3/5] fix(tests): address apm-review-panel findings on PR2 - test_registry_client_integration.py: add requires_network_integration marker (was relying on runtime self-skip; live HTTP to api.mcp.github.com would fire on any dev pytest invocation). Sec panel finding. - test_runtime_smoke.py: add requires_e2e_mode marker (was unmarked; downloads real codex/llm binaries). Test-coverage panel finding. - docs/integration-testing.md: drop pre-fix/post-fix phrasing per docs-current-behaviour-only convention; document the live marker in the registry table. Doc-writer + devx-ux findings. - scripts/test-integration.sh: export APM_RUN_INTEGRATION_TESTS for symmetry with APM_E2E_TESTS; update banner to reflect thin-orchestrator identity; drop vestigial 'Integration Testing Coverage!' comment. Py-arch + cli-log nits. - ci-integration.yml: add inline comment on the 20->30 timeout bump. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/ci-integration.yml | 3 +++ .../docs/contributing/integration-testing.md | 16 +++++++++------- scripts/test-integration.sh | 9 ++++++--- .../test_registry_client_integration.py | 3 +++ tests/integration/test_runtime_smoke.py | 2 ++ 5 files changed, 23 insertions(+), 10 deletions(-) diff --git a/.github/workflows/ci-integration.yml b/.github/workflows/ci-integration.yml index 7f0673630..12624ac17 100644 --- a/.github/workflows/ci-integration.yml +++ b/.github/workflows/ci-integration.yml @@ -154,6 +154,9 @@ jobs: run: | chmod +x scripts/test-integration.sh uv run ./scripts/test-integration.sh + # Bumped from 20 to 30 minutes when test discovery widened from + # the 28 enumerated files to the full tests/integration/ suite + # (PR2 of #1166). timeout-minutes: 30 release-validation: diff --git a/docs/src/content/docs/contributing/integration-testing.md b/docs/src/content/docs/contributing/integration-testing.md index 284eab1e9..7bd82379f 100644 --- a/docs/src/content/docs/contributing/integration-testing.md +++ b/docs/src/content/docs/contributing/integration-testing.md @@ -56,6 +56,7 @@ what the test family you want actually requires. | `requires_runtime_codex` | The `codex` runtime installed under `~/.apm/runtimes/` | `apm runtime setup codex` | | `requires_runtime_copilot` | The GitHub Copilot CLI runtime installed under `~/.apm/runtimes/` | `apm runtime setup copilot` | | `requires_runtime_llm` | The `llm` runtime installed under `~/.apm/runtimes/` | `apm runtime setup llm` | +| `live` | Tests that hit real GitHub repos via cloning; deselected by default | Override the deselect: `pytest -m live tests/integration -v` | Without any of those env vars or runtimes a `pytest tests/integration` invocation is silent rather than red: every test is collected and @@ -106,13 +107,14 @@ GitHub / ADO tokens, detect platform, locate or build the apm PyInstaller binary, install runtimes (codex / copilot / llm), install python test dependencies, and run `pytest tests/integration/` once. All per-test gating lives in the -marker registry described above; the script no longer enumerates -individual test files. New integration tests dropped into -`tests/integration/` are picked up automatically. - -For local iteration prefer the direct `pytest` invocations earlier -on this page; the orchestrator script is mainly intended for -reproducing the full CI environment end-to-end. +marker registry described above. New integration tests dropped into +`tests/integration/` are picked up automatically; add the right +`requires_*` marker and the registry will skip the test when its +precondition is missing. + +The orchestrator is mainly intended for reproducing the full CI +environment end-to-end; for local iteration prefer the direct +`pytest` invocations earlier on this page. ## CI/CD Integration diff --git a/scripts/test-integration.sh b/scripts/test-integration.sh index 23fba0df7..5aa2b303e 100755 --- a/scripts/test-integration.sh +++ b/scripts/test-integration.sh @@ -317,6 +317,9 @@ run_e2e_tests() { # Set environment variables (mirrors what CI does) export APM_E2E_TESTS="1" + if [[ -n "${APM_RUN_INTEGRATION_TESTS:-}" ]]; then + export APM_RUN_INTEGRATION_TESTS + fi if [[ -n "${GITHUB_TOKEN:-}" ]]; then export GITHUB_TOKEN="$GITHUB_TOKEN" fi @@ -366,8 +369,8 @@ main() { echo "APM CLI Integration Testing - Unified CI/Local Script" echo "=====================================================" echo "" - echo "This script adapts to CI (using artifacts) or local (building) environments" - echo "Tests comprehensive runtime scenarios and implementation robustness" + echo "This script adapts to CI (using artifacts) or local (building) environments." + echo "Resolves tokens, builds/locates the apm binary, sets up runtimes, then invokes pytest tests/integration/ once." echo "" check_prerequisites @@ -375,7 +378,7 @@ main() { detect_environment build_binary setup_binary_for_testing - setup_runtimes # Integration Testing Coverage! + setup_runtimes install_test_dependencies run_e2e_tests diff --git a/tests/integration/test_registry_client_integration.py b/tests/integration/test_registry_client_integration.py index c429b9124..4980e1b60 100644 --- a/tests/integration/test_registry_client_integration.py +++ b/tests/integration/test_registry_client_integration.py @@ -3,10 +3,13 @@ import os # noqa: F401 import unittest +import pytest import requests from apm_cli.registry.client import SimpleRegistryClient +pytestmark = pytest.mark.requires_network_integration + class TestRegistryClientIntegration(unittest.TestCase): """Integration test cases for the MCP registry client with the GitHub MCP Registry.""" diff --git a/tests/integration/test_runtime_smoke.py b/tests/integration/test_runtime_smoke.py index 48d71b397..80bbfd705 100644 --- a/tests/integration/test_runtime_smoke.py +++ b/tests/integration/test_runtime_smoke.py @@ -14,6 +14,8 @@ import pytest +pytestmark = pytest.mark.requires_e2e_mode + # Test fixtures and utilities @pytest.fixture(scope="module") From 8f6a7f19388a40859c5e66aac75b4b4706ae94cd Mon Sep 17 00:00:00 2001 From: Daniel Meppiel Date: Sun, 10 May 2026 20:01:32 +0200 Subject: [PATCH 4/5] docs(tests): codify integration-test marker procedure as APM instructions primitive Issue #1166 / follow-up to PR #1247. PR #1247 retired the manual per-file enumeration in scripts/test-integration.sh and shifted gating to declarative pytestmarkers, but the procedure ("where to drop a new integration test, which marker to use, what NOT to do") lived only in PR comments and the test-coverage-expert agent's tribal knowledge. This PR extracts that procedure into a versioned APM primitive that applies to both human contributors (via .github/instructions/) and the test-coverage-expert persona (via a cross-reference at step 5 of its review procedure), AND adds a hermetic regression-trap test that guarantees the rule's pointers do not silently drift. Changes: - .apm/instructions/tests.instructions.md: new H2 "Integration tests: placement and markers" with procedure, the canonical marker quick-map (7 rows), and anti-patterns. Mirrored to .github/. - .apm/agents/test-coverage-expert.agent.md: step 5 now links to the instructions file and explicitly instructs the persona to flag ungated live-network/runtime-binary calls. Mirrored to .github/. - tests/integration/test_marker_registry_sync.py: 7 hermetic assertions verifying that pyproject.toml markers, tests/integration/conftest.py::_MARKER_CHECKS, the docs registry table, and the instructions rule all stay in sync; plus a static lint forbidding new os.getenv-based runtime self-skips on gate env vars in tests/integration/. - CHANGELOG.md: one line under [Unreleased] Changed. - .github/copilot-instructions.md: auto-refreshed by apm compile. Lint (uv run --extra dev ruff check + format --check): silent. Tests (uv run pytest tests/integration/test_marker_registry_sync.py tests/integration/test_drift_check.py): 40 passed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .apm/agents/test-coverage-expert.agent.md | 7 +- .apm/instructions/tests.instructions.md | 65 +++++ .github/agents/test-coverage-expert.agent.md | 7 +- .github/copilot-instructions.md | 6 +- .github/instructions/tests.instructions.md | 65 +++++ CHANGELOG.md | 1 + .../integration/test_marker_registry_sync.py | 232 ++++++++++++++++++ 7 files changed, 378 insertions(+), 5 deletions(-) create mode 100644 tests/integration/test_marker_registry_sync.py diff --git a/.apm/agents/test-coverage-expert.agent.md b/.apm/agents/test-coverage-expert.agent.md index fd30e55d5..d0fa720cb 100644 --- a/.apm/agents/test-coverage-expert.agent.md +++ b/.apm/agents/test-coverage-expert.agent.md @@ -148,7 +148,12 @@ it via tool calls before emitting it as a finding. The procedure: 5. **Probe the test tree** with `view` / `grep` / `glob`: - Look in `tests/unit//` for unit tests on the touched module. - Look in `tests/integration/` for integration tests on the touched - command or flow. + command or flow. New integration tests must follow the marker + placement contract in + [`.apm/instructions/tests.instructions.md`](../instructions/tests.instructions.md); + flag ungated live-network or runtime-binary calls in + `tests/integration/` as `recommended` regardless of whether the + test self-skips at runtime. - Search for the specific symbol, error string, or flag name being changed. Absence of ANY hit on the changed symbol is a strong signal of a coverage gap. diff --git a/.apm/instructions/tests.instructions.md b/.apm/instructions/tests.instructions.md index 1378efb63..3a2b40949 100644 --- a/.apm/instructions/tests.instructions.md +++ b/.apm/instructions/tests.instructions.md @@ -107,3 +107,68 @@ production code must follow (see - **Targeted runs during iteration.** Run the specific test file first (`uv run pytest tests/unit/install/test_X.py -x`) before running the full suite (`uv run pytest tests/unit tests/test_console.py`). + +## Integration tests: placement and markers + +The integration suite uses **declarative gating** via pytest markers, +not per-file orchestrator enumeration. Adding a new integration test +is two steps. + +### Procedure + +1. Drop the file under `tests/integration/test_.py`. +2. At the top of the module, declare the runtime / network / E2E + prerequisites as a single `pytestmark`: + + ```python + import pytest + + pytestmark = pytest.mark.requires_network_integration + # OR for multiple prerequisites: + pytestmark = [ + pytest.mark.requires_e2e_mode, + pytest.mark.requires_runtime_codex, + ] + ``` + +That is it. The orchestrator (`scripts/test-integration.sh`) and the +CI integration job collect everything under `tests/integration/` in +a single `pytest` invocation; markers are honored automatically. + +### Marker selection + +Pick the marker that matches the **strongest** prerequisite the test +has. The full registry lives in `pyproject.toml` under +`[tool.pytest.ini_options].markers` and is documented (with the +opt-in commands) in +[`docs/src/content/docs/contributing/integration-testing.md`](../../docs/src/content/docs/contributing/integration-testing.md). +Quick map for the common cases: + +| Test prerequisite | Marker | +|----------------------------------------------|---------------------------------| +| Real HTTP to APM-owned services | `requires_network_integration` | +| Real codex / copilot / llm runtime binary | `requires_runtime_` | +| Downloads runtimes; full E2E flow | `requires_e2e_mode` | +| GitHub / ADO token required | `requires_github_token` / `requires_ado_pat` | +| Paid or third-party external service | `live` (deselected by default) | +| Performance measurement | `benchmark` (deselected by default) | +| Hermetic (mocks all I/O) | *no marker required* | + +Need a marker that does not exist yet? Register it in +`pyproject.toml` AND add a row to the docs registry table in the +same PR. Both must stay in sync. + +### Anti-patterns (will land as `recommended` findings on review) + +- **Editing `scripts/test-integration.sh` per file.** The orchestrator + enumerates the directory, not the files. Per-file blocks are drift + by construction. +- **Runtime self-skips inside the test body.** A bare + `if not os.getenv("APM_E2E_TESTS"): pytest.skip(...)` runs before + collection-time gating and weakens the contract. Use + module-level `pytestmark` instead -- declarative gating is the + single source of truth. +- **Reading the gate env var inside test logic.** If your test + reads `APM_RUN_INTEGRATION_TESTS` to branch behaviour, the marker + is wrong (or missing). The marker is the gate; the test body + should assume the gate already passed. diff --git a/.github/agents/test-coverage-expert.agent.md b/.github/agents/test-coverage-expert.agent.md index fd30e55d5..d0fa720cb 100644 --- a/.github/agents/test-coverage-expert.agent.md +++ b/.github/agents/test-coverage-expert.agent.md @@ -148,7 +148,12 @@ it via tool calls before emitting it as a finding. The procedure: 5. **Probe the test tree** with `view` / `grep` / `glob`: - Look in `tests/unit//` for unit tests on the touched module. - Look in `tests/integration/` for integration tests on the touched - command or flow. + command or flow. New integration tests must follow the marker + placement contract in + [`.apm/instructions/tests.instructions.md`](../instructions/tests.instructions.md); + flag ungated live-network or runtime-binary calls in + `tests/integration/` as `recommended` regardless of whether the + test self-skips at runtime. - Search for the specific symbol, error string, or flag name being changed. Absence of ANY hit on the changed symbol is a strong signal of a coverage gap. diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 8b37a0200..f72f6c294 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -1,6 +1,6 @@ - - + + # Linting (canonical contract) @@ -49,4 +49,4 @@ skills; honor this instruction before invoking them. --- *This file was generated by APM CLI. Do not edit manually.* -*To regenerate: `specify apm compile`* +*To regenerate: `apm compile`* diff --git a/.github/instructions/tests.instructions.md b/.github/instructions/tests.instructions.md index 1378efb63..3a2b40949 100644 --- a/.github/instructions/tests.instructions.md +++ b/.github/instructions/tests.instructions.md @@ -107,3 +107,68 @@ production code must follow (see - **Targeted runs during iteration.** Run the specific test file first (`uv run pytest tests/unit/install/test_X.py -x`) before running the full suite (`uv run pytest tests/unit tests/test_console.py`). + +## Integration tests: placement and markers + +The integration suite uses **declarative gating** via pytest markers, +not per-file orchestrator enumeration. Adding a new integration test +is two steps. + +### Procedure + +1. Drop the file under `tests/integration/test_.py`. +2. At the top of the module, declare the runtime / network / E2E + prerequisites as a single `pytestmark`: + + ```python + import pytest + + pytestmark = pytest.mark.requires_network_integration + # OR for multiple prerequisites: + pytestmark = [ + pytest.mark.requires_e2e_mode, + pytest.mark.requires_runtime_codex, + ] + ``` + +That is it. The orchestrator (`scripts/test-integration.sh`) and the +CI integration job collect everything under `tests/integration/` in +a single `pytest` invocation; markers are honored automatically. + +### Marker selection + +Pick the marker that matches the **strongest** prerequisite the test +has. The full registry lives in `pyproject.toml` under +`[tool.pytest.ini_options].markers` and is documented (with the +opt-in commands) in +[`docs/src/content/docs/contributing/integration-testing.md`](../../docs/src/content/docs/contributing/integration-testing.md). +Quick map for the common cases: + +| Test prerequisite | Marker | +|----------------------------------------------|---------------------------------| +| Real HTTP to APM-owned services | `requires_network_integration` | +| Real codex / copilot / llm runtime binary | `requires_runtime_` | +| Downloads runtimes; full E2E flow | `requires_e2e_mode` | +| GitHub / ADO token required | `requires_github_token` / `requires_ado_pat` | +| Paid or third-party external service | `live` (deselected by default) | +| Performance measurement | `benchmark` (deselected by default) | +| Hermetic (mocks all I/O) | *no marker required* | + +Need a marker that does not exist yet? Register it in +`pyproject.toml` AND add a row to the docs registry table in the +same PR. Both must stay in sync. + +### Anti-patterns (will land as `recommended` findings on review) + +- **Editing `scripts/test-integration.sh` per file.** The orchestrator + enumerates the directory, not the files. Per-file blocks are drift + by construction. +- **Runtime self-skips inside the test body.** A bare + `if not os.getenv("APM_E2E_TESTS"): pytest.skip(...)` runs before + collection-time gating and weakens the contract. Use + module-level `pytestmark` instead -- declarative gating is the + single source of truth. +- **Reading the gate env var inside test logic.** If your test + reads `APM_RUN_INTEGRATION_TESTS` to branch behaviour, the marker + is wrong (or missing). The marker is the gate; the test body + should assume the gate already passed. diff --git a/CHANGELOG.md b/CHANGELOG.md index 57ab88b9b..3dd564e1d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,6 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Integration tests now use marker-driven discovery: 21 `pytestmark = pytest.mark.skipif(...)` chains across `tests/integration/` are replaced with declarative `requires_*` markers, with precondition logic centralized in `tests/integration/conftest.py` and auto-skipping at collection time. PR1 of #1166. (#1167) - Integration test apm-binary resolution now prefers the local build (`./dist/apm--/apm`) over a system-wide `apm` on `PATH`, so contributors validating the binary under test are not silently shadowed by a global install; the bearer-token marker (`requires_ado_bearer`) discards the captured JWT immediately and persists only the boolean outcome. (#1167) - `scripts/test-integration.sh` is now a thin orchestrator: it builds/locates the apm binary, sets up runtimes and tokens, then invokes `pytest tests/integration/` exactly once. The 28 per-file pytest enumerations were removed; the marker registry handles per-test gating, and new test files dropped into `tests/integration/` are picked up automatically. PR2 of #1166. (#1247) +- Integration-test marker procedure codified as `.apm/instructions/tests.instructions.md` (wired into `test-coverage-expert` persona) and guarded by a regression-trap test that asserts `pyproject.toml`, `tests/integration/conftest.py::_MARKER_CHECKS`, the docs registry table, and the instructions rule stay in sync. (#1166) ### Fixed diff --git a/tests/integration/test_marker_registry_sync.py b/tests/integration/test_marker_registry_sync.py new file mode 100644 index 000000000..ab3fca581 --- /dev/null +++ b/tests/integration/test_marker_registry_sync.py @@ -0,0 +1,232 @@ +"""Marker-registry sync invariants for the integration suite. + +This test asserts the integrity contract surfaced in +``.apm/instructions/tests.instructions.md`` (section "Integration tests: +placement and markers"): the marker names referenced by the rule, declared +in ``pyproject.toml``, documented in +``docs/src/content/docs/contributing/integration-testing.md``, and wired +into ``tests/integration/conftest.py::_MARKER_CHECKS`` MUST stay in sync. + +Why this matters: the rule tells future contributors and agents that + +* ``pyproject.toml`` is the marker source of truth, +* the docs table is the canonical registry, +* the conftest predicate map is the gate logic. + +If any of those drifts, the rule's pointer becomes misleading and a future +PR that adds an ungated network/runtime test will not be caught by the +collection-time gate it claims to honour. + +The tests are hermetic (read-only on repo files); no marker required. +""" + +from __future__ import annotations + +import re +import sys +from pathlib import Path + +import tomllib + +# --------------------------------------------------------------------------- +# Resolve repo root from this file location -- the test relies on relative +# layout, not on a cwd assumption. +# --------------------------------------------------------------------------- + +REPO_ROOT = Path(__file__).resolve().parents[2] +PYPROJECT = REPO_ROOT / "pyproject.toml" +DOCS_REGISTRY = ( + REPO_ROOT / "docs" / "src" / "content" / "docs" / "contributing" / "integration-testing.md" +) +APM_RULE = REPO_ROOT / ".apm" / "instructions" / "tests.instructions.md" +CONFTEST = REPO_ROOT / "tests" / "integration" / "conftest.py" +INTEGRATION_DIR = REPO_ROOT / "tests" / "integration" + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _pyproject_marker_names() -> set[str]: + """Return the set of marker names declared in pyproject.toml.""" + with PYPROJECT.open("rb") as fh: + data = tomllib.load(fh) + raw = data["tool"]["pytest"]["ini_options"]["markers"] + out: set[str] = set() + for line in raw: + name = line.split(":", 1)[0].strip() + if name: + out.add(name) + return out + + +def _docs_registry_marker_names() -> set[str]: + """Parse the marker table in the docs registry and return marker names. + + Table rows have shape: ``| `marker_name` | precondition | how |``. + """ + text = DOCS_REGISTRY.read_text(encoding="utf-8") + out: set[str] = set() + # Match the first cell content if it is a backtick-quoted identifier. + row_re = re.compile(r"^\|\s*`([a-z_][a-z0-9_]*)`\s*\|", re.MULTILINE) + for m in row_re.finditer(text): + out.add(m.group(1)) + return out + + +def _conftest_marker_names() -> set[str]: + """Return marker names that have a predicate registered in conftest.""" + # Static parse to avoid importing the conftest module under test. + text = CONFTEST.read_text(encoding="utf-8") + # Look for entries inside the _MARKER_CHECKS dict: ' "name": (' + key_re = re.compile(r'^\s*"(requires_[a-z0-9_]+)"\s*:\s*\(', re.MULTILINE) + return set(key_re.findall(text)) + + +def _gating_markers_in_pyproject() -> set[str]: + """Subset of pyproject markers that gate execution (requires_* + live). + + ``integration``, ``slow``, ``benchmark`` are taxonomy markers, not + gates, and are intentionally excluded from the docs registry. + """ + names = _pyproject_marker_names() + return {n for n in names if n.startswith("requires_") or n == "live"} + + +# --------------------------------------------------------------------------- +# Tests +# --------------------------------------------------------------------------- + + +def test_every_pyproject_gating_marker_has_conftest_predicate() -> None: + """A declared ``requires_*`` marker without a predicate is dead config. + + If pyproject declares ``requires_X`` but ``_MARKER_CHECKS`` has no + entry for it, the marker never auto-skips and the gate is a lie. + """ + declared = {n for n in _pyproject_marker_names() if n.startswith("requires_")} + wired = _conftest_marker_names() + missing = sorted(declared - wired) + assert not missing, ( + f"requires_* markers in pyproject.toml missing a predicate in " + f"tests/integration/conftest.py::_MARKER_CHECKS: {missing}" + ) + + +def test_every_conftest_predicate_is_declared_in_pyproject() -> None: + """A wired predicate without a marker declaration warns with PytestUnknownMarkWarning.""" + declared = _pyproject_marker_names() + wired = _conftest_marker_names() + orphans = sorted(wired - declared) + assert not orphans, ( + f"_MARKER_CHECKS entries with no matching pyproject.toml marker " + f"declaration (would emit PytestUnknownMarkWarning): {orphans}" + ) + + +def test_every_gating_marker_is_documented_in_registry() -> None: + """Every ``requires_*`` / ``live`` marker MUST appear in the docs table. + + The instructions rule tells agents and humans to consult the docs + table as the canonical registry. A marker missing from the table is + invisible to anyone following the rule. + """ + gating = _gating_markers_in_pyproject() + documented = _docs_registry_marker_names() + missing = sorted(gating - documented) + assert not missing, ( + f"Gating markers declared in pyproject.toml but missing from " + f"docs/src/content/docs/contributing/integration-testing.md " + f"registry table: {missing}" + ) + + +def test_docs_registry_only_names_declared_markers() -> None: + """The docs table must not advertise a marker that does not exist.""" + declared = _pyproject_marker_names() + documented = _docs_registry_marker_names() + phantom = sorted(documented - declared) + assert not phantom, ( + f"Markers documented in the docs registry but not declared in pyproject.toml: {phantom}" + ) + + +def test_apm_rule_only_names_declared_markers() -> None: + """Marker names cited in the APM instructions rule must really exist. + + Scans for ``requires_*`` tokens in the rule body and asserts each one + matches a declared marker (or the documented ``requires_runtime_`` + placeholder). + """ + declared = _pyproject_marker_names() + # Allow the literal placeholder used in the docs/quick-map. + declared_with_placeholder = declared | {"requires_runtime_"} + body = APM_RULE.read_text(encoding="utf-8") + # Match identifiers OR the placeholder shape inside backticks / plain. + token_re = re.compile(r"\brequires_(?:runtime_|[a-z][a-z0-9_]*)") + cited = set(token_re.findall(body)) + # Re-prefix because the regex captured only the suffix; rebuild full names. + # (The regex above intentionally returns the full token thanks to \b + # anchoring and the group inside; re-derive with findall on full pattern.) + full_re = re.compile(r"\brequires_(?:runtime_|[a-z][a-z0-9_]+)") + cited = set(full_re.findall(body)) + unknown = sorted(cited - declared_with_placeholder) + assert not unknown, ( + f"Marker names cited in .apm/instructions/tests.instructions.md " + f"that are not declared in pyproject.toml: {unknown}" + ) + + +def test_integration_tests_use_pytestmark_not_runtime_self_skip() -> None: + """Tests must declare gates via ``pytestmark``, not runtime ``os.getenv``. + + The rule says: never write ``if not os.getenv("APM_E2E_TESTS"): + pytest.skip(...)`` inside a test body. Use module-level ``pytestmark = + pytest.mark.requires_e2e_mode`` instead. This guard catches future + regressions of that pattern in ``tests/integration/test_*.py``. + + ``conftest.py`` is intentionally exempt: the marker registry itself + must read the env vars to implement the gate. + """ + gate_env_vars = ( + "APM_E2E_TESTS", + "APM_RUN_INTEGRATION_TESTS", + "APM_RUN_INFERENCE_TESTS", + "APM_TEST_ADO_BEARER", + ) + # Two-line window pattern: an os.getenv on a gate var followed by + # pytest.skip on the next non-blank line (or same line). + offenders: list[str] = [] + this_file = Path(__file__).resolve() + for path in sorted(INTEGRATION_DIR.glob("test_*.py")): + if path.resolve() == this_file: + # The lint itself names the env var strings literally; skip. + continue + text = path.read_text(encoding="utf-8") + for var in gate_env_vars: + # Heuristic: an os.getenv(...gate var...) call AND a pytest.skip + # in the SAME function body. We check for the var name appearing + # within 200 chars of pytest.skip. + for m in re.finditer(rf'os\.getenv\(\s*[\'"]{var}[\'"]', text): + window = text[m.start() : m.start() + 400] + if "pytest.skip" in window or "pytest.exit" in window: + offenders.append(f"{path.relative_to(REPO_ROOT)}: gates on {var}") + break + assert not offenders, ( + "Integration test files use runtime os.getenv self-skip on a gate " + "env var. Replace with a module-level pytestmark = pytest.mark." + "requires_* marker (see .apm/instructions/tests.instructions.md):\n " + + "\n ".join(offenders) + ) + + +# --------------------------------------------------------------------------- +# Sanity: confirm tomllib is available (Python 3.11+). All CI matrix +# entries currently use 3.12+; this test is the canary if that changes. +# --------------------------------------------------------------------------- + + +def test_tomllib_available() -> None: + """tomllib was added in 3.11; CI runs on 3.12. Guard the assumption.""" + assert sys.version_info >= (3, 11), "This suite needs Python 3.11+ for tomllib" From c30a9609e6b50da421ae2e9eab5ea714f3741db3 Mon Sep 17 00:00:00 2001 From: Daniel Meppiel Date: Sun, 10 May 2026 20:36:01 +0200 Subject: [PATCH 5/5] fix: regenerate compiled mirror with correct relative paths apm install rewrites cross-primitive links from .apm/ source paths to the equivalent paths from the .github/ deployment root. The manual mirror in the previous commit shipped the unrewritten path, which caused APM Self-Check drift. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/agents/test-coverage-expert.agent.md | 2 +- apm.lock.yaml | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.github/agents/test-coverage-expert.agent.md b/.github/agents/test-coverage-expert.agent.md index d0fa720cb..c6e9cfb40 100644 --- a/.github/agents/test-coverage-expert.agent.md +++ b/.github/agents/test-coverage-expert.agent.md @@ -150,7 +150,7 @@ it via tool calls before emitting it as a finding. The procedure: - Look in `tests/integration/` for integration tests on the touched command or flow. New integration tests must follow the marker placement contract in - [`.apm/instructions/tests.instructions.md`](../instructions/tests.instructions.md); + [`.apm/instructions/tests.instructions.md`](../../.apm/instructions/tests.instructions.md); flag ungated live-network or runtime-binary calls in `tests/integration/` as `recommended` regardless of whether the test self-skips at runtime. diff --git a/apm.lock.yaml b/apm.lock.yaml index a690b45c4..1c7666e82 100644 --- a/apm.lock.yaml +++ b/apm.lock.yaml @@ -45,7 +45,7 @@ local_deployed_file_hashes: .github/agents/oss-growth-hacker.agent.md: sha256:1cd56bb78ab37d52c50e45ab69d759f775cd49cdf35981b3dc6c4004315c6b83 .github/agents/python-architect.agent.md: sha256:7587ee7c684c61046a83dfa1b7e39d1345f2f119c3395478e3ca2dbbaaaff0e9 .github/agents/supply-chain-security-expert.agent.md: sha256:8fb8cc426d6af17ba084a28b3f026c2b475b62e3ca63ed2f88b83bd823f877af - .github/agents/test-coverage-expert.agent.md: sha256:4df107de6179b5237fa9300582921e7fcb439928649601f0fef2d3b9275cea40 + .github/agents/test-coverage-expert.agent.md: sha256:bc588d89530362469502bfbea788df892a9a0b00e630cd0f3926d3dfd2c2a9e2 .github/instructions/changelog.instructions.md: sha256:1e51ec4c74e847967962bd279dc4c6e582c5d3578490b3c28d5f3acd3e05f73e .github/instructions/cicd.instructions.md: sha256:9c0fafc74f743aa97e5adba2168d66c9e3a327b135065e3b804bdbb5f04cda5d .github/instructions/cli.instructions.md: sha256:8e39e8d5047ce88575cb02f87c2bcede584dfef258bd86f7466c7badf136541a @@ -54,4 +54,4 @@ local_deployed_file_hashes: .github/instructions/integrators.instructions.md: sha256:b151e0438088d2c0b636dfc28532ecf43c3b51e5f1070a354b8d5b57c345e335 .github/instructions/linting.instructions.md: sha256:312acd32353567834ec9f4f246710a47a991729a11c0380aa6a010b63de607eb .github/instructions/python.instructions.md: sha256:45173f778eddc126c37c7ace96acd0e17adb1895031eec134ec0754638d3ba37 - .github/instructions/tests.instructions.md: sha256:4c6335e3373f9735778a05913f2d8ef250d118f8c5305e70ba407e578a525ef7 + .github/instructions/tests.instructions.md: sha256:b527ccaecf0e92f74d300fc9027f1bc49bb43d8ddcdd36338c1556fcde0d8b2d