Skip to content

Add OSV 1.7.5 parsing support#6

Merged
TooFastTooCurious merged 1 commit intomainfrom
osv-parsing
Apr 16, 2026
Merged

Add OSV 1.7.5 parsing support#6
TooFastTooCurious merged 1 commit intomainfrom
osv-parsing

Conversation

@TooFastTooCurious
Copy link
Copy Markdown
Contributor

Summary

Updates pkg/advisory to parse and match OSV-shaped advisory data. Pairs with the data migration in JulietSecurity/abom-advisories#6. See #3 for the design discussion.

Struct layout change

BEFORE (legacy ABOM format)           AFTER (OSV 1.7.5)
┌─────────────────────────┐          ┌─────────────────────────────┐
│ Advisory                │          │ Advisory                    │
│   ID                    │          │   SchemaVersion             │
│   Title                 │          │   ID                        │
│   CVE                   │          │   Modified                  │
│   CVSS (float64)        │          │   Published                 │
│   Published             │          │   Withdrawn                 │
│   Updated               │          │   Aliases []                │
│   Status                │   ====>  │   Summary                   │
│   Description           │          │   Details                   │
│   References []string   │          │   Severity   []Severity     │
│   AffectedActions []    │          │   Affected   []Affected     │
│     .Uses               │          │     .Package{Ecosystem,Name}│
│     .AffectedRefs       │          │     .Ranges[].Events[]      │
│       .TagRange         │          │     .Versions[]             │
│       .SafeTags         │          │     .EcosystemSpecific.ABOM │
│       .Tags             │          │   References []Reference    │
│     .ToolNames          │          │   DatabaseSpecific.ABOM     │
│     .AffectedPeriod     │          └─────────────────────────────┘
│   Indicators            │
│   RecommendedActions    │
└─────────────────────────┘

ABOM-specific extensions preserved under two OSV-sanctioned namespaces:

  • ecosystem_specific.abom (per-package, GitHub Actions flavor): tool_names, affected_period
  • database_specific.abom (advisory-wide): indicators, recommended_actions

Matching logic change

The old matcher parsed a string like ">=v0.0.1 <=v0.34.2" into bounds. The new matcher walks OSV events in declaration order, tracking whether the version is inside an affected window:

events: [{introduced: v0.0.1}, {fixed: v0.35.0}]

walk for version v0.20.0:
  event 1: introduced = v0.0.1, v0.20.0 >= v0.0.1 => affected = true
  event 2: fixed      = v0.35.0, v0.20.0 >= v0.35.0 false, affected stays true
  result: true (compromised)

walk for version v0.35.0:
  event 1: introduced = v0.0.1, v0.35.0 >= v0.0.1 => affected = true
  event 2: fixed      = v0.35.0, v0.35.0 >= v0.35.0 => affected = false
  result: false (not affected)

Same shape works for multi-cycle ranges (introduced, fixed, introduced, fixed), the 0 sentinel for "from the beginning," and the last_affected event type. All covered by new tests.

Public API is stable

External callers use only three things:

  • advisory.NewDatabase(LoadOptions) *Database
  • advisory.LoadOptions{Offline, NoCache, Quiet, Token}
  • (*Database).CheckAll(*model.ABOM)

None of those signatures changed. cmd/scan.go and cmd/check.go compile and run unchanged.

Tests

All existing tests ported to the new API plus new coverage for OSV-specific semantics:

  • TestMatchesRange for core introduced/fixed walking
  • TestMatchesRange_IntroducedZero for the OSV "0" sentinel
  • TestMatchesRange_LastAffected for the last_affected event type
  • TestBuiltinDataIsOSV as a shape sanity check on the embedded data
  • TestCheck_TrivyActionVulnerable now also exercises tj-actions entries

SHA-pinned refs

Same policy as before: flagged as verify-sha since ordinal comparison against tag ranges is not meaningful for SHAs. The affected_period in ecosystem_specific.abom is available for time-based correlation, but wiring that in is a follow-up.

Coordination with advisories#6

Merging this before the advisories PR means abom fetches legacy JSON from the remote, the parser rejects it, and users fall back to the built-in snapshot (now OSV). Graceful degradation, not ideal.

Merging advisories#6 before this means abom fetches OSV JSON, the current parser rejects it, users fall back to the old built-in snapshot. Also graceful degradation.

Either order works. Merging close together minimizes the degraded window. I would suggest advisories#6 first since Ben is reviewing it, then this.

Follow-ups not in this PR

  • CVSS vector backfill is already in the data migration (NVD-authoritative). No abom-side work needed.
  • SHA-pinned-ref matching against affected_period for time correlation: possible future enhancement.
  • Dropping legacy format handling from older abom releases: no-op since we never had formal legacy compat.

Files changed

pkg/advisory/advisory.go              +376 / -228  (rewrite)
pkg/advisory/advisory_test.go         +148 / -22   (updated + new cases)
pkg/advisory/builtin_advisories.json  +176 / -35   (format migration)

Rewrite pkg/advisory to consume OSV-shaped advisory data rather than the
legacy ABOM format. Advisory, Affected, Range, Event, and related structs
match the OSV schema. ABOM-specific signal lives in ecosystem_specific.abom
(tool_names, affected_period) and database_specific.abom (indicators,
recommended_actions). Matching logic walks range events
(introduced/fixed/last_affected) instead of parsing a tag_range string.

Builtin advisories snapshot updated to OSV format to match
JulietSecurity/abom-advisories#6.

Agreed approach in #3.
@TooFastTooCurious
Copy link
Copy Markdown
Contributor Author

@funnelfiasco heads-up that the abom-side parser update is open here, pairs with advisories#6. Not asking for a formal review given the Go-heavy internals, just wanted you in the loop.

Copy link
Copy Markdown

@funnelfiasco funnelfiasco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TooFastTooCurious TooFastTooCurious merged commit a4406f3 into main Apr 16, 2026
2 checks passed
@TooFastTooCurious TooFastTooCurious deleted the osv-parsing branch April 16, 2026 12:32
digorgonzola added a commit to digorgonzola/abom that referenced this pull request Apr 23, 2026
CollectActions populates abom.Actions from the workflow walk, so every
entry is already reached by the first loop. The second loop rechecked
the same refs with a misleading comment about unreachable entries.

Addresses PR review item JulietSecurity#6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TooFastTooCurious pushed a commit that referenced this pull request Apr 23, 2026
…omised (#8)

* fix: resolve SHA-pinned refs to upstream tags before advisory matching

SHA-pinned action refs (e.g., `owner/repo@<sha>`) that match an
advisory's affected package are currently flagged as compromised with
"verify manually", even when the SHA corresponds to a fixed version.
This is because SHAs cannot be ordinally compared against version
ranges, so `matchAffected` short-circuits with `"verify-sha"`.

This change adds a tag resolution step that uses the GitHub API to
determine which tag a verified SHA actually points to. Once the
upstream tag is known, the advisory checker can compare it against
affected version ranges and correctly clear the compromised flag
for refs pinned to fixed versions.

Changes:
- Add `ResolvedTag` field to `ActionRef` model
- Add `TagResolver` interface and `GitHubTagResolver` implementation
  that resolves a SHA to its tag via the GitHub tags API
- Add `ResolveABOMTags` function to populate `ResolvedTag` for all
  SHA-pinned actions after SHA verification
- Update `matchAffected` to use `ResolvedTag` for version comparison
  when available, falling back to `"verify-sha"` only when the tag
  is unknown
- Add `RecheckSHARefs` method to re-evaluate previously flagged SHA
  refs after tag resolution, clearing false positives
- Wire tag resolution and re-checking into the scan pipeline after
  `--verify-shas`

Fixes false positives where `--verify-shas --check` flags SHA-pinned
refs at fixed versions (e.g., setup-trivy@<sha> # v0.2.6).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: skip detected-tool match for actions with resolved tag outside affected range

The detected-tool pathway in Check() was re-flagging SHA-pinned actions
(like setup-trivy) even after matchAffected correctly determined they
were outside the affected version range via ResolvedTag. This happened
because detected-tool matching only checks tool names, not versions.

Also fix tag propagation: CollectActions deduplicates by Raw, so when
the same action appears in multiple workflows, only one ActionRef gets
ResolvedTag set by ResolveABOMTags. RecheckSHARefs now propagates
resolved tags to all duplicate workflow-level refs before rechecking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: only resolve tags for advisory-flagged SHA refs

ResolveABOMTags was calling the GitHub tags API for every SHA-pinned
action (~20-30 per repo), when only the 0-2 refs flagged by advisory
checking actually need tag resolution. This triggered GitHub's
secondary rate limit in CI, causing tag resolution to be skipped
entirely and leaving false positives unresolved.

Add a Compromised filter to ResolveABOMTags so it only resolves tags
for refs that CheckAll flagged with "verify manually". This reduces
API calls from ~N to ~0-2, avoiding rate limits in practice.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: run tag resolution before SHA verification to avoid rate limits

Move advisory-flagged tag resolution ahead of VerifyABOMShas in the
scan pipeline. SHA verification makes ~30 HEAD requests in a tight
loop and can trigger GitHub's secondary rate limit, leaving no API
budget for the 1-2 tag resolution calls that clear false positives.

Also decouple tag resolution from --verify-shas: it now runs whenever
--check is used, since advisory false positives need clearing
regardless of whether full SHA verification was requested.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use git ls-remote for tag resolution instead of REST API

The GitHub REST API tags endpoint shares rate limit quota with all
other API calls. In CI environments, the GITHUB_TOKEN budget is often
exhausted by other workflow steps or concurrent runs, causing tag
resolution to fail with 403 even on its very first request.

Switch GitHubTagResolver to use `git ls-remote --tags` which uses the
git protocol instead of the REST API. This doesn't consume API rate
limit quota and works in CI where git credentials are already
configured by actions/checkout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: clean up dead code and auth path in GitHubTagResolver.ResolveTag

Remove fake GIT_ASKPASS_TOKEN env var (not a real git variable), eliminate
the confusing cmd.Args rebuild pattern, and set GIT_TERMINAL_PROMPT=0
unconditionally. Document the http.extraHeader trade-off (token visible
in process list).

Addresses PR review blocker #1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove unreachable ErrVerifyRateLimit branch from ResolveABOMTags

git ls-remote never returns ErrVerifyRateLimit (only GitHubSHAVerifier
does, for HTTP 403/429). Remove the dead rateLimited flag and sentinel
check, keeping the general error path for actual git ls-remote failures.

Addresses PR review blocker #2.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: eliminate duplicate advisory DB load in runScan

Hoist the Database instance above both advisory blocks so CheckAll and
RecheckSHARefs share a single load. The second NewDatabase call was
re-fetching from remote (NoCache: true skips cache, not uses it) and
could return a different advisory version than the first.

Addresses PR review blocker #3.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify that tag resolution runs with --check, not --verify-shas

Expand the comment to explain why tag resolution is gated on --check
independently of --verify-shas: git ls-remote doesn't consume REST API
quota, making it safe to run without explicit opt-in to SHA verification.

Addresses PR review item #4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove redundant second loop in RecheckSHARefs

CollectActions populates abom.Actions from the workflow walk, so every
entry is already reached by the first loop. The second loop rechecked
the same refs with a misleading comment about unreachable entries.

Addresses PR review item #6.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants