feat(backend): implement reachability analysis for vulnerable dependencies#28

Merged
ionfwsrijan merged 7 commits into
ionfwsrijan:mainfrom
lakshay122007:feat/issue-22-reachability-analysis
Jun 3, 2026

Conversation

@lakshay122007
Contributor

Before opening: make sure there is an issue tracking this work, and link it below. PRs without a linked issue may be closed without review.

Linked issue

Closes #22

What this PR does

This PR introduces a reachability analysis step that runs after osv-scanner finishes. It scans the source code directory to detect whether flagged vulnerable packages are actually imported (e.g., require('pkg') or import 'pkg'), helping developers distinguish unused dependencies from vulnerable packages that the code really loads.

Type of change

  • Bug fix
  • New feature
  • ML model / training pipeline
  • Refactor (no behaviour change)
  • Documentation
  • Tests only

ML tier (if applicable)

  • Tier 1 — Triage
  • Tier 2 — Predictive
  • Tier 3 — Autonomous
  • Not ML-related

Changes

Backend

  • app/models.py: Extended the Finding schema with a new Reachability Pydantic model (reachable boolean and evidence string).
  • app/utils/fs.py: Added check_reachability, a regex-based import scanner. It uses os.walk with in-place pruning of heavy directories (node_modules, .git, venv, etc.), so it finds active imports in target source files without descending into dependency or VCS trees.
  • app/scanners/osv.py: Implemented a post-scan hook that iterates over the OSV findings, runs the reachability check for each vulnerable package, and mutates the finding object before returning it.
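
The pruning walk described for app/utils/fs.py can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the skip list, the file extensions, the helper build_import_pattern, the dict return shape, and the "not reachable" value are all assumptions. Only the evidence wording follows the payload quoted in the Testing section.

```python
import os
import re

# Assumed values: the PR names node_modules, .git and venv; the rest are guesses.
SKIP_DIRS = {"node_modules", ".git", "venv", ".venv", "__pycache__", "dist", "build"}
SOURCE_EXTS = (".js", ".jsx", ".ts", ".tsx", ".mjs", ".cjs")


def build_import_pattern(package: str) -> "re.Pattern[str]":
    """Hypothetical helper: match require()/import of `package` or a subpath of it."""
    quote = "['\"]"
    # 'lodash' or 'lodash/merge', but not 'lodash-es'.
    spec = quote + re.escape(package) + r"(?:/[^'\"]*)?" + quote
    return re.compile(
        r"\brequire\s*\(\s*" + spec          # require('pkg')
        + r"|\bimport\s*\(\s*" + spec        # import('pkg')
        + r"|\bimport\s+(?:[^'\";]*?\bfrom\s+)?" + spec  # import x from 'pkg' / import 'pkg'
    )


def check_reachability(root: str, package: str) -> dict:
    """Return the first import of `package` found under `root`, if any."""
    pattern = build_import_pattern(package)
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to the slice mutates the list os.walk holds, so the
        # skipped directories are never entered (requires the default topdown=True).
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for filename in sorted(filenames):
            if not filename.endswith(SOURCE_EXTS):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            rel = os.path.relpath(path, root)
                            return {
                                "reachable": True,
                                "evidence": f"Imported in {rel}: line {lineno}",
                            }
            except OSError:
                continue  # unreadable file: skip it rather than fail the scan
    return {"reachable": False, "evidence": None}
```

A line-by-line regex like this is a heuristic: it will miss multi-line import statements and dynamically built module names, and it can match imports that appear inside comments or strings.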

Testing

How did you test this?

  • Created a local, isolated dummy Node.js project.
  • Installed an outdated version of lodash (4.17.20) with known vulnerabilities.
  • Created an index.js file containing const _ = require('lodash');.
  • Ran run_osv_scanner directly against the dummy project and verified that the final JSON payload included: "reachability": { "reachable": true, "evidence": "Imported in index.js: line 1" }.

See the screenshot:

[Image: Screenshot 2026-06-03 at 5 24 46 PM]

Checklist

  • Tested locally end-to-end (upload ZIP or GitHub URL → scan → findings returned correctly)
  • New ML model falls back gracefully when model file is absent
  • No new console.error or unhandled Python exceptions introduced
  • Added or updated tests where applicable
  • requirements.txt / package.json updated if new dependencies added
  • New model files (.pkl, .pt, etc.) are gitignored, not committed

@lakshay122007
Contributor Author

Hi @ionfwsrijan, kindly review it and let me know if any changes are required. I have also attached the screenshot for verification; it shows reachability: true. Thanks!

@lakshay122007 lakshay122007 changed the title feat(backend): implement reachability analysis for vulnerable depende… feat(backend): implement reachability analysis for vulnerable dependencies Jun 3, 2026
@ionfwsrijan ionfwsrijan added enhancement New feature or request backend Backend issues medium Medium difficulty SSoC26 labels Jun 3, 2026
@ionfwsrijan ionfwsrijan requested a review from Copilot June 3, 2026 12:56
@ionfwsrijan
Owner

@lakshay122007 The code looks clean and good to me.

Contributor

Copilot AI left a comment

Pull request overview

Implements a post-processing “reachability analysis” step for OSV dependency findings, enriching each finding with whether the vulnerable package appears to be imported/required in the scanned source tree.

Changes:

  • Extends the backend finding schema with an optional reachability object (reachable, evidence).
  • Adds a filesystem-based reachability check that scans source files for import / require(...) patterns while pruning heavy directories.
  • Hooks reachability computation into the OSV scanner output so findings are returned with reachability metadata.
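
As a rough illustration of the post-processing hook summarized above, the sketch below attaches a reachability object to each finding in place. It is not the code in backend/app/scanners/osv.py: the function name enrich_findings, the "package" key, the injected checker callable, and the per-package cache are assumptions made for the example, and a dataclass stands in for the PR's Pydantic Reachability model so the snippet needs only the standard library.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Reachability:
    # Stand-in for the PR's Pydantic model; field names come from the PR description.
    reachable: bool
    evidence: Optional[str] = None


def enrich_findings(
    findings: List[dict],
    root: str,
    checker: Callable[[str, str], Reachability],
) -> List[dict]:
    """Attach a 'reachability' entry to every finding that names a package.

    `checker(root, package)` is a hypothetical callable that performs the
    filesystem scan. Results are cached per package so that several
    advisories against the same dependency trigger only one scan.
    """
    cache: Dict[str, Reachability] = {}
    for finding in findings:
        package = finding.get("package")  # assumed key name
        if not package:
            continue  # nothing to look up; leave the finding untouched
        if package not in cache:
            cache[package] = checker(root, package)
        finding["reachability"] = asdict(cache[package])
    return findings  # same list object, mutated in place
```

Passing the checker in as a parameter keeps the hook testable without touching the filesystem; whether the merged code caches per package is not stated in this thread.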

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| backend/app/utils/fs.py | Adds check_reachability() to scan repo source files for import/require evidence while skipping heavy directories. |
| backend/app/scanners/osv.py | Enriches OSV findings with reachability results after the scanner JSON is parsed. |
| backend/app/models.py | Introduces Reachability model and adds optional reachability field to Finding. |

Comment thread backend/app/scanners/osv.py Outdated
Comment thread backend/app/utils/fs.py Outdated
@ionfwsrijan
Owner

@lakshay122007 Kindly review copilot comments

lakshay122007 and others added 2 commits June 3, 2026 18:36
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@lakshay122007
Contributor Author

lakshay122007 commented Jun 3, 2026

Done! @ionfwsrijan made the changes and they were valid also.

@ionfwsrijan
Owner

> Done! @ionfwsrijan made the changes and they were valid also.

Yes. Thanks for fixing them up.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comment thread backend/app/utils/fs.py
Comment thread backend/app/utils/fs.py Outdated
Comment thread backend/app/scanners/osv.py Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@lakshay122007
Contributor Author

let me fix the rest of the changes

@ionfwsrijan
Owner

@lakshay122007 Can you please look into the suggestions again? They seem important to me.

@lakshay122007
Contributor Author

Wait, I forgot ruff formatting, just a moment!

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Comment thread backend/app/utils/fs.py Outdated
Comment thread backend/app/utils/fs.py Outdated
Comment thread backend/app/scanners/osv.py
Comment thread backend/app/scanners/osv.py
@lakshay122007 lakshay122007 force-pushed the feat/issue-22-reachability-analysis branch from 0c3e1d0 to 17264e0 Compare June 3, 2026 15:03
@lakshay122007
Contributor Author

lakshay122007 commented Jun 3, 2026

Why so many suggestions 😭? I guess they are all just ruff formatting issues? I have fixed them.

@lakshay122007
Contributor Author

Done @ionfwsrijan

@ionfwsrijan ionfwsrijan merged commit 2efec31 into ionfwsrijan:main Jun 3, 2026
6 checks passed

Labels

backend Backend issues enhancement New feature or request medium Medium difficulty SSoC26

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reachability Analysis for Vulnerable Dependencies

3 participants