Add CRA/EN 18031 threat model: asset register, trust boundaries, security controls#1149
Add CRA/EN 18031 threat model: asset register, trust boundaries, security controls#1149spoorcc wants to merge 28 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an executable pytm-based threat model and Sphinx tooling to render it, introduces comprehensive security documentation (CRA/EN 18031 alignment, STRIDE), publishes structured threats JSON, updates docs/toctree and changelog, and adds a ChangesSecurity model + docs + build
Sequence Diagram(s)sequenceDiagram
participant Contributor as Contributor
participant Repo as GitHub Repository
participant CI as GitHub Actions Runner
participant Df as dfetch CLI
participant VCS as Remote VCS / Archive
participant PyPI as PyPI/TestPyPI
participant Consumer as Consumer Build System
Contributor->>Repo: Push/PR (manifest, source, workflows)
Repo->>CI: Trigger workflow (OIDC, pinned Actions)
CI->>Df: Run dfetch (parse manifest, fetch, validate)
Df->>VCS: Clone / fetch archives
VCS-->>Df: Serve archives / source
Df->>Df: Validate archives (paths, symlinks, decompression, integrity optional)
CI->>PyPI: Publish package (OIDC trusted publish)
Consumer->>PyPI: Install package
Consumer->>Df: Execute fetched sources during build
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/explanation/security.rst`:
- Line 205: Replace the repo path string `/.github/workflows/*.yml` with the
repo-relative form `.github/workflows/*.yml` in the documentation so it does not
imply an absolute filesystem path; search for the exact token
`/.github/workflows/*.yml` in doc/explanation/security.rst and update it to
`.github/workflows/*.yml`.
- Around line 166-172: Add the missing asset declaration for SA-01 in the threat
model so the executable model matches the RST: create an Asset (or call the
model's register_asset / add_asset helper) with id "SA-01", name "dfetch
Process", type/category matching other assets, and a description that mirrors
the RST text (including "Restricted" and the subprocess injection/dependency
confusion concerns); ensure the new Asset is included in the model collection
used by the report generator (e.g., ThreatModel.assets or the register/export
path) so python -m security.threat_model --report emits the same SA-01 entry as
the documentation.
- Around line 393-492: Replace the hardcoded count "17 supply-chain health
checks" with a non-static description so it won't go stale: update the doc text
that mentions "17 supply-chain health checks" to say something like "the full
set of OpenSSF Scorecard checks" (or "the current set of OpenSSF Scorecard
checks") and optionally add a parenthetical note pointing to the Scorecard
project for the authoritative list; also update or remove the corresponding
enumeration in threat_model.py (the list of check names) so the docs and
threat_model.py remain consistent when the Scorecard checks change.
- Line 17: The reference link for `EN 18031` points to an incorrect ETSI URL and
should be replaced with a correct source; update the link target used in the
reference label `EN 18031` to point to the CEN-CENELEC information page or an
authorized distributor (for example the CEN-CENELEC standards overview or an
iTeh/national standards body preview) and adjust the link text if needed so the
doc accurately identifies EN 18031 as a CEN/CENELEC standard rather than linking
to ETSI.
- Around line 408-410: Replace the hardcoded line-number reference in the
security.rst text with the function name only: update the fragment that
currently reads "dfetch/util/util.py:265–285" to reference
"check_no_path_traversal() in dfetch/util/util.py" (or simply
"check_no_path_traversal()") so the docs cite the function by name and remove
the specific line numbers; locate the mention in doc/explanation/security.rst
and change the referenced text accordingly.
In `@pyproject.toml`:
- Line 110: The security.rst "Regenerate analysis output" section currently
shows the diagram generation commands (--seq and --dfd) but omits that PlantUML
and Graphviz are external system dependencies; update that section to add a
short note stating that running the diagram commands requires PlantUML and
Graphviz to be installed on the system (they are not installed via pip and pip
install .[security] alone will not produce diagrams), and mention alternatives
or links if desired so maintainers know the extra prerequisites before running
those commands.
In `@security/threat_model.py`:
- Around line 376-384: The EA-06..EA-08 assets are missing from the
environmental-assets block; add mapping comments so the module matches
doc/explanation/security.rst by annotating EA-06 → developer and/or
gh_actions_runner (Developer Workstation / CI Runner), EA-07 → consumer_build
(Consumer Build System), and EA-08 → network_transport (Network Transport).
Insert these concise comment lines next to the existing EA-01..EA-05 block using
the exact symbol names developer, gh_actions_runner, consumer_build, and
network_transport so the executable model and RST asset register stay aligned.
- Around line 242-374: The supporting-asset numbering is inconsistent: SA-01
("dfetch Process") is documented but missing from the model; either add SA-01 in
this file mapping to the existing dfetch_cli Process or remove SA-01 from the
RST. Fix by adding a Data/Process entry named "SA-01: dfetch Process" (matching
fields used for other assets) and set its identifier/attributes to reflect the
existing dfetch_cli Process (or alternatively remove the SA-01 row from
doc/explanation/security.rst); ensure the symbol dfetch_cli is referenced/used
so the model and documentation remain consistent.
- Around line 211-213: The hasWriteAccess flag for integrity_hash_record is
incorrect: update integrity_hash_record.hasWriteAccess to True (since
developers/processes write the dfetch.yaml field) and similarly review and set
pypi_package.hasWriteAccess = True where the publish pipeline writes to it;
locate the Datastore objects named integrity_hash_record and pypi_package in
threat_model.py and set their hasWriteAccess to reflect actual write capability
by processes/pipelines while keeping integrity_hash_record.classification =
Classification.SENSITIVE unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0131831c-88e8-40e8-800c-522279fc431b
📒 Files selected for processing (6)
CHANGELOG.rstdoc/explanation/security.rstdoc/index.rstpyproject.tomlsecurity/__init__.pysecurity/threat_model.py
8948021 to
c5d1217
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
doc/explanation/security.rst (1)
17-17:⚠️ Potential issue | 🟡 MinorEN 18031 link still generic.
https://www.cenelec.eu/is the CEN-CENELEC homepage, not a page about EN 18031. Consider linking to the CEN-CENELEC standards portal entry or an authorised distributor preview (e.g. iTeh) so readers can reach the actual standard in one click.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@doc/explanation/security.rst` at line 17, The EN 18031 anchor target currently points to the generic CEN-CENELEC homepage; update the reST reference for "`EN 18031`" in security.rst so it links directly to the standard’s specific entry (for example the CEN/CENELEC standards portal page or an authorised distributor preview such as iTeh) instead of https://www.cenelec.eu/, ensuring the anchor text remains "`EN 18031`" and the URL points to the specific standard resource.security/threat_model.py (1)
256-259: 🧹 Nitpick | 🔵 TrivialNit: consider declaring SA-01 as a first-class model element instead of comment-only.
The comment here maps SA-01 →
dfetch_cli, but the pytm--reportoutput won't emit an "SA-01" label next to thedfetch CLIprocess. If you want the executable model to surface the same SA-XX identifiers as the RST asset register, consider renamingdfetch_cli = Process("SA-01: dfetch CLI")(mirroring the naming convention used for PA-0x/SA-02..SA-10 datastores). Same applies to EA-01..EA-08 mapping comments at Lines 394–401. Non-blocking.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@security/threat_model.py` around lines 256 - 259, The SA-01 comment should be represented as a first-class pytm model element: rename the existing dfetch_cli Process creation to include the SA identifier (e.g. dfetch_cli = Process("SA-01: dfetch CLI")) so the --report output shows the SA-01 label; similarly update the model elements referenced by the EA-01..EA-08 comments so their Process/Entity constructors include the EA-0x prefix (e.g. Process("EA-01: ...") or DataStore("EA-02: ...")) to mirror the RST asset register and surface those identifiers in reports.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/explanation/security.rst`:
- Around line 303-307: Replace the empty classification marker '—' for the EA-08
/ Network Transport table row with a clear label (e.g., "Public" or "N/A") and,
if choosing "N/A", add a brief footnote explaining why the transport channel is
not confidential; specifically update the EA-08 row that currently contains '—'
(the Network Transport entry listing "HTTPS, SSH, SVN, and plain HTTP carrying
all remote fetch traffic") so the classification column matches the ISO/IEC
27005 taxonomy used elsewhere.
---
Duplicate comments:
In `@doc/explanation/security.rst`:
- Line 17: The EN 18031 anchor target currently points to the generic
CEN-CENELEC homepage; update the reST reference for "`EN 18031`" in security.rst
so it links directly to the standard’s specific entry (for example the
CEN/CENELEC standards portal page or an authorised distributor preview such as
iTeh) instead of https://www.cenelec.eu/, ensuring the anchor text remains "`EN
18031`" and the URL points to the specific standard resource.
In `@security/threat_model.py`:
- Around line 256-259: The SA-01 comment should be represented as a first-class
pytm model element: rename the existing dfetch_cli Process creation to include
the SA identifier (e.g. dfetch_cli = Process("SA-01: dfetch CLI")) so the
--report output shows the SA-01 label; similarly update the model elements
referenced by the EA-01..EA-08 comments so their Process/Entity constructors
include the EA-0x prefix (e.g. Process("EA-01: ...") or DataStore("EA-02: ..."))
to mirror the RST asset register and surface those identifiers in reports.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: acb6cfa3-58a6-4dd6-91aa-2838010a0ecf
📒 Files selected for processing (3)
.devcontainer/Dockerfiledoc/explanation/security.rstsecurity/threat_model.py
c5d1217 to
7b284be
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pyproject.toml (1)
110-116:⚠️ Potential issue | 🟠 MajorInclude the
securitypackage in setuptools discovery or document source-checkout-only usage.The new
securityextra installspytm, butsecurity/threat_model.pyis excluded from the configured package include list (dfetch,dfetch.*). Thepython -m security.threat_model --reportworkflow will only work from a repository checkout, not from a standard wheel install of the project with the security extra.Suggested fix
[tool.setuptools.packages.find] -include = ["dfetch", "dfetch.*"] +include = ["dfetch", "dfetch.*", "security", "security.*"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pyproject.toml` around lines 110 - 116, The security extra currently installs pytm but the security package (security/threat_model.py) is not included by setuptools discovery because the include list only has "dfetch" and "dfetch.*"; update the package discovery in pyproject.toml to also include "security" and "security.*" so the security module is bundled into wheels (or alternatively add a short note in your packaging documentation that the python -m security.threat_model --report workflow only works from a source checkout and not from a wheel), and ensure references to the extra name "security" and the module path "security.threat_model" are consistent with the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/explanation/security.rst`:
- Around line 417-421: Update the documentation text describing the
path-traversal control to match the actual implementation: replace the reference
to os.path.realpath with either the exact API used (pathlib.Path.resolve()) or a
generic phrase like “resolves paths” and mention check_no_path_traversal() from
dfetch/util/util.py so the doc accurately reflects the implementation.
- Around line 265-270: The current EA-01 "Remote VCS Servers" entry wrongly
implies non-TLS MITM is mitigated by SSH controls; update the EA-01 text to
explicitly state that "BatchMode=yes" and SSH host-key verification only affect
SSH remotes and do not protect plain http:// or svn:// transports, call out
residual MITM risk for non-TLS endpoints, and recommend using TLS-protected
transports (HTTPS, svn+https) or other TLS/SSH tunnels for those hosts instead;
keep the existing mention of GitHub/GitLab/Gitea but reword the mitigation
sentence to separate SSH-specific controls (BatchMode, host-key verification)
from general TLS requirements.
In `@security/threat_model.py`:
- Around line 409-428: The df03 and df04 Dataflow objects currently mark
encryption/integrity controls as unconditional even though they accept
unencrypted svn:// and http:// paths; update the model by removing or setting to
False the unconditional flags on df03.controls.isEncrypted and
df04.controls.providesIntegrity, and instead create separate Dataflow entries
(e.g., df03_https_ssh, df03_plain and df04_https_ssh, df04_plain) that
explicitly represent HTTPS/SSH (isEncrypted=True, providesIntegrity=True) versus
svn/http (isEncrypted=False, providesIntegrity=False) paths so the pytm report
reflects conditional guarantees; also apply the same split for df07-related
flows and ensure corresponding changes are mirrored in
doc/explanation/security.rst to keep docs and model aligned.
- Around line 312-325: The model declares gh_workflows = Datastore("SA-06:
GitHub Actions Workflows") but contradictorily sets gh_workflows.hasWriteAccess
= False while the description explicitly states PRs can modify
.github/workflows/*.yml; update the executable model by setting
gh_workflows.hasWriteAccess = True to reflect writable workflow definitions (and
re-evaluate gh_workflows.storesSensitiveData/classification if needed) and
ensure the change in security/threat_model.py (references: gh_workflows,
Datastore, boundary_github, hasWriteAccess) is kept in sync with
doc/explanation/security.rst.
---
Outside diff comments:
In `@pyproject.toml`:
- Around line 110-116: The security extra currently installs pytm but the
security package (security/threat_model.py) is not included by setuptools
discovery because the include list only has "dfetch" and "dfetch.*"; update the
package discovery in pyproject.toml to also include "security" and "security.*"
so the security module is bundled into wheels (or alternatively add a short note
in your packaging documentation that the python -m security.threat_model
--report workflow only works from a source checkout and not from a wheel), and
ensure references to the extra name "security" and the module path
"security.threat_model" are consistent with the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c4c0b47e-f52e-4a55-b262-8ef2a7e19fa1
📒 Files selected for processing (8)
.devcontainer/DockerfileAGENTS.mdCHANGELOG.rstdoc/explanation/security.rstdoc/index.rstpyproject.tomlsecurity/__init__.pysecurity/threat_model.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
doc/explanation/security.rst (1)
442-442: 🧹 Nitpick | 🔵 TrivialHard-coded line number (
line 277) will drift.This reference pins to a specific line in
dfetch/util/util.pythat will silently become incorrect on any refactor in that file. Prefer referencing the function by name only (e.g.,check_no_path_traversal()indfetch/util/util.py) so the docs stay correct without coupling to line numbers.Proposed fix
- ``dfetch/util/util.py`` — ``check_no_path_traversal`` at line 277 + ``check_no_path_traversal()`` in ``dfetch/util/util.py``🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@doc/explanation/security.rst` at line 442, Remove the hard-coded line number and reference the function by name and file only: replace the text "``dfetch/util/util.py`` — ``check_no_path_traversal`` at line 277" with something like "``check_no_path_traversal()`` in ``dfetch/util/util.py``"; ensure you reference the function symbol check_no_path_traversal() (no line numbers) so the documentation remains correct after refactors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pyproject.toml`:
- Line 116: The pyproject.toml currently includes a top-level "security" package
via the include entry (include = ["dfetch", "dfetch.*", "security",
"security.*"]), which exposes a global security module; either vendor/relocate
the code under your project namespace (move the package into dfetch/security —
e.g., dfetch/security/threat_model.py and update any module imports and python
-m entrypoints to use dfetch.security.threat_model) or remove "security" and
"security.*" from the include list so the package is not distributed in the
wheel; update the include array accordingly and fix any import paths referencing
the old top-level "security" symbol.
In `@security/threat_model.py`:
- Around line 94-108: The EA IDs and titles are out of order and EA-04's title
doesn't match the docs; update the declarations so EA-03 (gh_repository) is
declared before EA-04 and rename gh_actions_runner to use the exact title
"EA-04: GitHub Actions Infrastructure" instead of "GitHub Actions Runner";
ensure gh_repository and gh_actions_runner (the two ExternalEntity objects)
retain their current inBoundary and description content but appear in the
correct order and with the exact EA-03/EA-04 titles to match
doc/explanation/security.rst.
- Line 127: The Process instance dfetch_cli is labeled "SA-01: dfetch CLI" but
the RST uses "dfetch Process"; update the Process(...) call for dfetch_cli to
use the exact RST title ("SA-01: dfetch Process") so the threat ID text matches
across surfaces, and scan for any other occurrences of "SA-01" to ensure
consistent naming.
---
Duplicate comments:
In `@doc/explanation/security.rst`:
- Line 442: Remove the hard-coded line number and reference the function by name
and file only: replace the text "``dfetch/util/util.py`` —
``check_no_path_traversal`` at line 277" with something like
"``check_no_path_traversal()`` in ``dfetch/util/util.py``"; ensure you reference
the function symbol check_no_path_traversal() (no line numbers) so the
documentation remains correct after refactors.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4758c3d8-3d8e-4b75-bf40-ef9b25926bb8
📒 Files selected for processing (3)
doc/explanation/security.rstpyproject.tomlsecurity/threat_model.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
security/threat_model.py (1)
469-475:⚠️ Potential issue | 🟠 MajorDo not mark DF-07 integrity as unconditional.
Line 475 still reports integrity for every vendored-file write, but DF-07 can write content from Git/SVN, optional-hash archives, or patch application. This makes the generated STRIDE report overstate the guarantee; keep this
Falseunless the flow is split by integrity-verified vs unverified sources.🛠️ Proposed fix
df07.controls.sanitizesInput = True -df07.controls.providesIntegrity = True +df07.controls.providesIntegrity = False # not guaranteed for VCS, patch, or optional-hash archive pathsBased on learnings, after security-relevant changes, review and update
security/threat_model.pyanddoc/explanation/security.rstto keep them aligned.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@security/threat_model.py` around lines 469 - 475, The Dataflow df07 is incorrectly marked as always providing integrity; modify df07.controls.providesIntegrity to False (or only set True on a new, separate Dataflow representing integrity-verified sources) so DF-07 does not overstate guarantees; update the DF-07 description if needed and ensure any related documentation (e.g., security/threat_model.py references and doc/explanation/security.rst) is updated to reflect the conditional integrity status.doc/explanation/security.rst (1)
435-442:⚠️ Potential issue | 🟡 MinorAlign the path-traversal control description with the implementation.
This section now says
check_no_path_traversal()usesos.path.realpathand notpathlib.Path.resolve; please verify the function and keep the API reference exact. If the implementation still usesPath.resolve(), this doc should be reverted to that wording.🔎 Read-only verification
#!/bin/bash # Description: Inspect the actual path traversal helper implementation. # Expected: The documentation should name the same path-resolution API used here. rg -n -A30 -B5 'def\s+check_no_path_traversal\b|check_no_path_traversal\s*=' --type py📝 Proposed fix if the helper uses pathlib
- - ``check_no_path_traversal()`` resolves both the candidate path and the - destination root via ``os.path.realpath`` (symlink-aware, not - ``pathlib.Path.resolve``), then rejects any path whose resolved prefix - does not start with the resolved root. Applied to every file copy and - post-extraction symlink. + - ``check_no_path_traversal()`` resolves both the candidate path and the + destination root with ``pathlib.Path.resolve()``, then rejects any path + that escapes the resolved root. Applied to every file copy and + post-extraction symlink.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@doc/explanation/security.rst` around lines 435 - 442, The documentation claims check_no_path_traversal() uses os.path.realpath but the implementation may use pathlib.Path.resolve; inspect the actual implementation of check_no_path_traversal in dfetch/util/util.py and update this doc block to match: if the function uses pathlib.Path.resolve, revert the wording to mention pathlib.Path.resolve (including any resolve(strict=...) behavior), otherwise keep the os.path.realpath wording; ensure the doc references the exact API used and preserves mention that it compares resolved candidate path prefix against the resolved destination root.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/explanation/security.rst`:
- Around line 19-26: Update the security.rst snippet to clarify that the listed
python -m security.threat_model commands must be run from a repository source
checkout with the repository root on PYTHONPATH (or run from the repo root)
because the security/ package is intentionally excluded from built wheels; note
that pip install dfetch[security] will not install security/threat_model. Edit
the paragraph that references security/threat_model.py and the example commands
to add a short sentence stating this checkout/PYTHONPATH requirement and the
exclusion from packaged wheels so users don't expect pip-only availability.
In `@security/threat_model.py`:
- Around line 501-508: The DF-12 dataflow is currently modeled in the wrong
direction; update the Dataflow construction for df12 so it represents
gh_repository -> gh_actions_runner (repository to runner) instead of
gh_actions_runner to gh_repository, and ensure the descriptive text still
explains that GitHub Actions checks out source from the repository; also update
the DF-12 entry in doc/explanation/security.rst so the diagram/text reflects the
repository → actions runner direction. Locate the Dataflow instance named df12
and change its source/target ordering (gh_repository and gh_actions_runner) and
then edit the DF-12 row in security.rst to match the same direction and wording.
---
Duplicate comments:
In `@doc/explanation/security.rst`:
- Around line 435-442: The documentation claims check_no_path_traversal() uses
os.path.realpath but the implementation may use pathlib.Path.resolve; inspect
the actual implementation of check_no_path_traversal in dfetch/util/util.py and
update this doc block to match: if the function uses pathlib.Path.resolve,
revert the wording to mention pathlib.Path.resolve (including any
resolve(strict=...) behavior), otherwise keep the os.path.realpath wording;
ensure the doc references the exact API used and preserves mention that it
compares resolved candidate path prefix against the resolved destination root.
In `@security/threat_model.py`:
- Around line 469-475: The Dataflow df07 is incorrectly marked as always
providing integrity; modify df07.controls.providesIntegrity to False (or only
set True on a new, separate Dataflow representing integrity-verified sources) so
DF-07 does not overstate guarantees; update the DF-07 description if needed and
ensure any related documentation (e.g., security/threat_model.py references and
doc/explanation/security.rst) is updated to reflect the conditional integrity
status.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: df1ae2e4-0318-4df3-b7e9-26316ad11a9a
📒 Files selected for processing (3)
doc/explanation/security.rstpyproject.tomlsecurity/threat_model.py
5686b14 to
e34d3ff
Compare
…rity controls - security/threat_model.py: pytm-based threat model covering the full SDLC (runtime, GitHub Actions CI/CD, PyPI distribution). Defines 6 trust boundaries, 5 primary assets (PA-01..PA-05), 10 supporting assets (SA-01..SA-10), 8 environmental assets (EA-01..EA-08), and 15 annotated data flows with existing-control flags. - doc/explanation/security.rst: Sphinx RST page documenting the asset register, trust boundaries, data flows, implemented controls table, and known gaps/residual risks — aligned with CRA Article 13 and EN 18031. - doc/index.rst: wire security.rst into the Explanation toctree. - pyproject.toml: add pytm==1.3.1 as a new [security] optional dependency. - CHANGELOG.rst: record the new security documentation in 0.14.0 (unreleased). https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
security/threat_model.py: - Split DF-03/DF-04 into DF-03a/b and DF-04a/b (HTTPS+SSH vs svn://+http://) so pytm STRIDE analysis reflects conditional encryption guarantees rather than unconditionally marking unencrypted transports as encrypted. - Set gh_workflows.hasWriteAccess = True (was False but description stated PRs can modify .github/workflows/ definitions — now consistent). - Remove threatsFile=None (pytm rejects None; omitting uses built-in library). doc/explanation/security.rst: - EA-01: Separate SSH-specific controls (BatchMode=yes, host-key verification) from non-TLS transports (http://, svn://), which remain unmitigated; add explicit MITM residual risk and HTTPS/svn+https:// recommendation. - Data Flows table: replace single DF-03/DF-04 rows with DF-03a/b and DF-04a/b to match the split in the pytm model. - Path-traversal control: clarify that check_no_path_traversal() uses os.path.realpath (not pathlib.Path.resolve); fix line reference to 277. pyproject.toml: - Add "security" and "security.*" to setuptools package discovery so that `pip install .[security]` bundles the threat model module and `python -m security.threat_model` works from an installed wheel. https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
…security package pyproject.toml: - Remove "security" and "security.*" from setuptools include list; the threat model is a source-checkout-only compliance tool and must not be bundled as a top-level namespace package in the distributed wheel. security/threat_model.py: - Declare gh_repository (EA-03) before gh_actions_runner (EA-04) to match numeric order in the RST asset register. - Rename gh_actions_runner label from "EA-04: GitHub Actions Runner" to "EA-04: GitHub Actions Infrastructure" to match doc/explanation/security.rst. - Rename dfetch_cli label from "SA-01: dfetch CLI" to "SA-01: dfetch Process" to match the Supporting Assets table in security.rst. doc/explanation/security.rst: - Remove hard-coded line number from path-traversal control entry; reference the function symbol check_no_path_traversal() and file only so the doc stays correct after refactors. https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
Inserts the mandatory eight-section Product Security Context note before the Scope and Assumptions section in doc/explanation/security.rst. Covers: product/manufacturer identification, IPFRU, user roles (Developer / CI Runner / Security-operator), operating environment, architecture and connectivity, external dependencies, security assumptions, and support period / data handling. https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
b25e9da to
092f230
Compare
Implements a custom Sphinx extension (doc/_ext/pytm_directive.py) that renders sections of the pytm threat model directly at build time — no pre-generated intermediate files, no duplication between code and docs. Extension highlights - .. pytm:: directive with :assets:, :dataflows:, :controls:, :gaps:, and :threats: flags; any subset may be combined - Model path configured via pytm_model in conf.py (absolute path); directive also accepts an explicit relative-path argument - Incremental builds: env.note_dependency() marks the document for rebuild whenever security/threat_model.py changes - Single-build caching: model is loaded once in the builder-inited hook (single-threaded) and stored in app._pytm_cache; directive runs only read from cache, so parallel_read_safe = True is correct - Thread-safe lazy fallback: _load_lock guards the rare case of an explicit path not covered by the preload Threat model additions (security/threat_model.py) - CONTROLS list (17 entries): each control has name, assets, implementation text (RST markup supported), and reference path - GAPS list (8 entries): each gap has name and description (RST markup) - classification added to all ExternalEntity and Process elements so the asset table renders complete Classification column values - Data objects fetched from TM._data (separate from TM._elements in pytm) Documentation (doc/explanation/security.rst) - Four static list-table blocks replaced with four .. pytm:: directives (:assets:, :dataflows:, :controls:, :gaps:) — ~445 lines removed - Introductory paragraphs updated to note live generation from the model - Product Security Context note (added previously) unchanged Packaging - pytm==1.3.1 added to the docs extras group so pip install .[docs] installs the dependency for documentation builds https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
…plantweb silently overwrites it at the docutils level (which is exactly what its own source comment says it does to avoid generating warnings). The [app.add_directive] warning disappears, and .. graphviz:: continues to work for the DFD rendering.
24e5a92 to
642a1e1
Compare
- _generate_diagrams: pre-renders PlantUML sequence diagram to PNG via
plantweb Python API (confdir/_static/pytm/threat_model_seq.png) so
the LaTeX builder embeds it with ``.. image::`` rather than calling
the network-dependent ``.. uml::`` directive at compile time.
Falls back silently to ``.. only:: html`` wrapper when plantweb is
unavailable or the render server is unreachable.
- _render_dfd: removes the ``.. only:: html`` guard so sphinx.ext.graphviz
produces the DFD via local ``dot`` binary in both HTML and PDF builds.
Previously the guard excluded the diagram from PDF entirely.
- _list_table: prepends ``.. tabularcolumns::`` with proportional
``p{\linewidth}`` column specs so Sphinx uses the LaTeX ``longtable``
environment, allowing tall tables (controls, gaps, threats) to break
vertically across page boundaries.
https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
The previous commit emitted bare ``.. graphviz::`` without an ``.. only::
html`` guard. On CI runners where the ``dot`` binary is not installed
sphinx.ext.graphviz fails silently and emits ``\includegraphics{None}``,
which crashes pdflatex.
Fix: _generate_diagrams now calls ``dot -Tpng`` via subprocess to pre-render
the DFD to _static/pytm/threat_model_dfd.png (same pattern as the plantweb
pre-render for the sequence diagram). _render_dfd emits ``.. image::`` when
the PNG is available. If ``dot`` is absent the exception is suppressed and the
output falls back to ``.. only:: html`` (diagram visible in HTML, absent from
PDF but no crash).
Both diagrams therefore appear in PDF when the necessary tools are available
on the build host, and neither causes a build failure when they are not.
https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
…ols/gaps Each ID cell in the threats, controls, gaps, assets, and dataflows tables now carries a ``.. _label:`` anchor so Sphinx registers it as a named cross-reference target. Cross-reference columns (Controls column in the threats table; Asset(s) and Threat(s) columns in the controls and gaps tables) now emit ``:ref:`` links instead of plain text, allowing one-click navigation between related items in both the HTML and PDF renders. A new _ref_list() helper builds comma-separated ``:ref:`ID <id>``` strings and returns an em-dash for empty lists. _list_table() gains an anchor_first_col flag that injects ``.. _label:`` + blank line before each ID cell, which Sphinx picks up as a valid explicit hyperlink target. https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
The pre-render-to-PNG approach was unreliable: subprocess dot and plantweb
API calls failed silently, falling back to ``.. only:: html`` and hiding
both diagrams from the PDF.
Root cause: graphviz (dot) is not installed in the CI doc/PDF build jobs,
and sphinx.ext.graphviz unconditionally emits \sphinxincludegraphics{None}
when dot is absent — crashing pdflatex.
Fix: use ``.. uml::`` via plantweb.directive for both diagrams, matching
exactly how the existing architecture diagrams are rendered:
- Sequence diagram: pass pytm's @startuml/@enduml output directly to
``.. uml::`` — plantweb sends it to plantuml.com.
- Data-flow diagram: wrap pytm's GraphViz DOT output in PlantUML
@startdot/@enddot tags and pass it to ``.. uml::`` — plantuml.com
renders GraphViz natively, no local dot binary required.
plantuml.com is in the egress allowlist for both the HTML and PDF CI jobs
and is already used successfully for all other UML diagrams in the docs.
This also removes the complex pre-render infrastructure (_generate_diagrams
confdir parameter, seq_img_path/dfd_img_path in the data dict, subprocess
and plantweb-API pre-render blocks) which is no longer needed.
https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
…eX PDF
Root cause: plantweb saves rendered PNGs to builder.outdir (``_build/latex/
plantweb/hash.png`` for the LaTeX builder, because ``builder.imagedir = ''``).
Sphinx's copy_image_files copies images FROM srcdir TO outdir, so it copies
``srcdir/_build/latex/plantweb/hash.png → outdir/hash.png``. But srcdir is the
RST source tree and ``_build/`` is outside it, so the source path doesn't
resolve and the copy silently fails. pdflatex then crashes with:
! Unable to load picture or PDF file 'hash.png'
Fix: separate HTML and LaTeX rendering:
HTML (``.. only:: html``): ``.. uml::`` via plantweb.directive renders at Sphinx
read time, calling plantuml.com. This correctly generates images in
``builder.outdir/_images/plantweb/`` for the HTML builder.
LaTeX (``.. only:: latex``): ``_generate_diagrams`` pre-renders both diagrams
to PNG using the plantweb Python API during ``builder-inited``, saving PNGs to
``confdir/_static/pytm/`` (the SOURCE TREE). ``.. image:: /_static/pytm/...``
then references these files. Sphinx resolves the /-path relative to srcdir,
finds the PNG in the source tree, and copies it to ``_build/latex/`` where
pdflatex can load it with ``\includegraphics{threat_model_dfd.png}``.
If pre-rendering fails (plantweb unavailable, network blocked), the LaTeX block
is simply omitted; diagrams are HTML-only but the PDF build does not crash.
DFD: the GraphViz DOT output from pytm is wrapped in @startdot/@enddot so
plantweb renders it via PlantUML's bundled GraphViz — no local dot binary needed.
https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.devcontainer/Dockerfile:
- Line 56: The pip install extras list in the devcontainer Dockerfile includes a
non-existent "security" extra which will break the build; either remove
"security" from the pip install -e .[development,docs,test,casts,build,security]
invocation in the Dockerfile, or add a proper
[project.optional-dependencies].security section to pyproject.toml that lists
the intended packages (e.g., pytm) so the extra exists; update the Dockerfile's
pip extras or pyproject.toml's optional-dependencies to match the actual extras
(development, docs, test, casts, build, sbom, wheel).
In `@doc/_ext/pytm_directive.py`:
- Around line 134-138: The directive currently only marks the Python model file
as a dependency which leaves non-Python inputs like security/threats.json
untracked; update the dependency registration in run() (and where
_get_model_data() is called) to also call env.note_dependency() for the threats
JSON (and any other non-Python inputs used to render the threats/controls
section) so Sphinx will rebuild on JSON edits and avoid stale in-memory cache
content.
- Around line 369-371: The code currently uses inline "# type: ignore" on the
importlib spec and on a docutils container children usage; instead explicitly
handle nullable and type-mismatch cases: after calling
importlib.util.spec_from_file_location("_pytm_model_tmp", model_path) check that
the returned spec is not None and that spec.loader is present (raise a clear
ImportError or ValueError with context if absent) before calling
spec.loader.exec_module(mod), and call importlib.util.module_from_spec(spec)
only after that check; for the docutils container, replace direct uses of
container.children with an explicit list conversion (e.g., children =
list(container.children)) or appropriate typing cast so the code no longer needs
"# type: ignore" and types are explicit.
In `@doc/explanation/security.rst`:
- Around line 75-77: Update the text to narrow the claim: change the sentence to
state that dfetch validates the destination path via check_no_path_traversal()
before extraction (as performed in Update._check_destination()) and that archive
symlinks are validated post-extraction (see archive symlink checks in
dfetch/vcs/archive.py), rather than implying the check runs on every individual
file copy.
In `@security/threat_model.py`:
- Line 12: The file contains an inline pylint suppression comment "# pylint:
disable=too-many-lines" in security/threat_model.py which violates guidelines;
remove that suppression and resolve the root cause by refactoring the large
module: split the threat_model module into smaller modules (e.g., create
assets.py, dataflows.py, controls.py or move large model definitions into a
separate data/config file) and update imports/exports used by any functions or
classes referenced in threat_model (search for top-level symbols like the main
threat model classes or data structures) so the resulting files are under the
pylint line limits and the inline disable can be deleted.
- Around line 946-950: Update the C-021 description string in
security/threat_model.py to fix the grammar: replace "does generates" with
either "does generate" or simply "generates" in the description=(...) text so
the sentence reads correctly (e.g., "The release pipeline generates SLSA
provenance attestations..." or "The release pipeline does generate SLSA
provenance attestations...").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3fd610c8-b2a9-41e1-aa61-88d4165eab32
📒 Files selected for processing (13)
.devcontainer/DockerfileAGENTS.mdCHANGELOG.rstdoc/_ext/pytm_directive.pydoc/conf.pydoc/dfetch_preamble.incdoc/explanation/security.rstdoc/index.rstdoc/static/css/custom.csspyproject.tomlsecurity/__init__.pysecurity/threat_model.pysecurity/threats.json
|
|
||
| RUN pip install --no-cache-dir --root-user-action=ignore --upgrade pip==26.0.1 \ | ||
| && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build] \ | ||
| && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build,security] \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the security extra is not defined in pyproject.toml
grep -n "^\[project.optional-dependencies\]" pyproject.toml
grep -n "^security\s*=" pyproject.toml || echo "No 'security' extra found in pyproject.toml"Repository: dfetch-org/dfetch
Length of output: 140
🏁 Script executed:
# Get the exact content of line 56 in Dockerfile
sed -n '56p' .devcontainer/DockerfileRepository: dfetch-org/dfetch
Length of output: 172
🏁 Script executed:
# Get the full optional-dependencies section from pyproject.toml to see all defined extras
sed -n '/^\[project.optional-dependencies\]/,/^\[/p' pyproject.toml | head -30Repository: dfetch-org/dfetch
Length of output: 931
🏁 Script executed:
# Read the complete optional-dependencies section from pyproject.toml
sed -n '/^\[project.optional-dependencies\]/,/^$/p' pyproject.tomlRepository: dfetch-org/dfetch
Length of output: 1625
🏁 Script executed:
# Verify that pip would fail with this unknown extra by checking pip behavior
# Also search Dockerfile for any other references to 'security'
grep -n "security" .devcontainer/DockerfileRepository: dfetch-org/dfetch
Length of output: 175
The security extra does not exist — this will break the devcontainer build.
The pyproject.toml defines these extras: development, docs, test, casts, build, sbom, wheel. There is no security extra. The pytm dependency was added to the docs extra, not a separate security extra.
Either remove security from this line, or add a [project.optional-dependencies].security section to pyproject.toml.
🐛 Proposed fix (remove non-existent extra)
- && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build,security] \
+ && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build] \📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build,security] \ | |
| && pip install --no-cache-dir --root-user-action=ignore -e .[development,docs,test,casts,build] \ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.devcontainer/Dockerfile at line 56, The pip install extras list in the
devcontainer Dockerfile includes a non-existent "security" extra which will
break the build; either remove "security" from the pip install -e
.[development,docs,test,casts,build,security] invocation in the Dockerfile, or
add a proper [project.optional-dependencies].security section to pyproject.toml
that lists the intended packages (e.g., pytm) so the extra exists; update the
Dockerfile's pip extras or pyproject.toml's optional-dependencies to match the
actual extras (development, docs, test, casts, build, sbom, wheel).
| # Mark this document as depending on the model file so Sphinx | ||
| # rebuilds it whenever the model changes. | ||
| env.note_dependency(model_path) | ||
|
|
||
| data = _get_model_data(model_path) |
There was a problem hiding this comment.
Track non-Python threat-model inputs in the dependency key.
run() and _get_model_data() only invalidate on model_path, but the rendered threats section is also sourced from security/threats.json. A JSON-only edit will keep incremental Sphinx builds and the in-memory cache serving stale threat/control content until a clean build.
Also applies to: 183-193
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@doc/_ext/pytm_directive.py` around lines 134 - 138, The directive currently
only marks the Python model file as a dependency which leaves non-Python inputs
like security/threats.json untracked; update the dependency registration in
run() (and where _get_model_data() is called) to also call env.note_dependency()
for the threats JSON (and any other non-Python inputs used to render the
threats/controls section) so Sphinx will rebuild on JSON edits and avoid stale
in-memory cache content.
| spec = importlib.util.spec_from_file_location("_pytm_model_tmp", model_path) | ||
| mod = importlib.util.module_from_spec(spec) # type: ignore[arg-type] | ||
| spec.loader.exec_module(mod) # type: ignore[union-attr] |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Python stdlib docs: can importlib.util.spec_from_file_location()returnNone, and can ModuleSpec.loaderbeNonebeforeexec_module() is called?
💡 Result:
Yes, importlib.util.spec_from_file_location can return None. No, ModuleSpec.loader cannot be None before exec_module is called when using this function. spec_from_file_location(name, location, *, loader=None, submodule_search_locations=None) returns None if no suitable loader is found for the file location (i.e., the file suffix does not match any known loader suffixes like .py, .pyc, etc.). From the source code in Lib/importlib/_bootstrap_external.py: if loader is None: for loader_class, suffixes in _get_supported_file_loaders: if location.endswith(tuple(suffixes)): loader = loader_class(name, location) spec.loader = loader break else: return None If a loader is explicitly provided, or one is found automatically, it creates a ModuleSpec with that loader set, so spec.loader is not None. The official documentation describes it as a factory function that creates a ModuleSpec, with loader as an optional parameter (default None, which triggers auto-detection), and does not mention returning None except through the implementation for unknown suffixes. Secondary sources sometimes check "if spec is None" defensively, but the primary sources confirm it can happen for invalid file types. For ModuleSpec in general (from importlib.machinery.ModuleSpec), the docs state "The loader used to load the module. The finder should always set this attribute." When created by spec_from_file_location with a valid file, it follows this. Namespace packages have loader=None, but spec_from_file_location is for file-based modules, not namespaces. Before exec_module, the spec.loader is already set during spec creation, and exec_module uses it.
Citations:
- 1: https://docs.python.org/3/library/importlib.html
- 2: https://docs.python.org/3.11/library/importlib.html
- 3: https://github.com/python/cpython/blob/main/Lib/importlib/_bootstrap_external.py
- 4: Wrong signature of "importlib.util.spec_from_loader" and "spec_from_file_location" python/typeshed#5117
- 5: https://runebook.dev/en/docs/python/library/importlib/importlib.machinery.ModuleSpec
- 6: https://peps.python.org/pep-0451/
🏁 Script executed:
# First, check file size and read the specific sections
wc -l doc/_ext/pytm_directive.pyRepository: dfetch-org/dfetch
Length of output: 92
🏁 Script executed:
# Read lines around 369-371
sed -n '360,380p' doc/_ext/pytm_directive.pyRepository: dfetch-org/dfetch
Length of output: 862
🏁 Script executed:
# Read lines around 667-677
sed -n '660,685p' doc/_ext/pytm_directive.pyRepository: dfetch-org/dfetch
Length of output: 978
🏁 Script executed:
# Check the imports and function signatures at the top of the file
head -50 doc/_ext/pytm_directive.pyRepository: dfetch-org/dfetch
Length of output: 1503
🏁 Script executed:
# Look for importlib usage and type hints
rg "importlib|module_from_spec" doc/_ext/pytm_directive.py -A 2 -B 2Repository: dfetch-org/dfetch
Length of output: 410
🏁 Script executed:
# Check container/StringList imports and what nodes.section() returns
rg "container|StringList|nodes.section" doc/_ext/pytm_directive.py -B 2 -A 2 | head -60Repository: dfetch-org/dfetch
Length of output: 666
🏁 Script executed:
# Let me check the typing module imports and see if there are any type stubs hints
rg "from typing|import typing" doc/_ext/pytm_directive.pyRepository: dfetch-org/dfetch
Length of output: 99
🏁 Script executed:
# Check what container.children actually is - look at docutils source or type hints
# Also check if there's a py.typed marker or stub files
find . -name "docutils" -o -name "py.typed" 2>/dev/null | head -10Repository: dfetch-org/dfetch
Length of output: 43
🏁 Script executed:
# Let's look more broadly at the actual return/assignment patterns
# to understand what type container.children actually is
rg "children" doc/_ext/pytm_directive.py -B 2 -A 2Repository: dfetch-org/dfetch
Length of output: 231
Remove the inline # type: ignore suppressions by handling the nullable import path and type mismatches explicitly.
The suppressions avoid real nullable edges and type mismatches:
spec_from_file_location()can returnNonefor unsupported file types, requiring an explicit check before accessingspec.loadercontainer.childrenfrom docutils nodes is a list-like object that requires explicit conversion tolistfor type compatibility
Suggested fixes
spec = importlib.util.spec_from_file_location("_pytm_model_tmp", model_path)
- mod = importlib.util.module_from_spec(spec) # type: ignore[arg-type]
- spec.loader.exec_module(mod) # type: ignore[union-attr]
+ if spec is None or spec.loader is None:
+ raise RuntimeError(f"Unable to load threat model module from {model_path}")
+ mod = importlib.util.module_from_spec(spec)
+ spec.loader.exec_module(mod)- return container.children # type: ignore[return-value]
+ return list(container.children)Applies to lines 369–371 and 667–677.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| spec = importlib.util.spec_from_file_location("_pytm_model_tmp", model_path) | |
| mod = importlib.util.module_from_spec(spec) # type: ignore[arg-type] | |
| spec.loader.exec_module(mod) # type: ignore[union-attr] | |
| spec = importlib.util.spec_from_file_location("_pytm_model_tmp", model_path) | |
| if spec is None or spec.loader is None: | |
| raise RuntimeError(f"Unable to load threat model module from {model_path}") | |
| mod = importlib.util.module_from_spec(spec) | |
| spec.loader.exec_module(mod) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@doc/_ext/pytm_directive.py` around lines 369 - 371, The code currently uses
inline "# type: ignore" on the importlib spec and on a docutils container
children usage; instead explicitly handle nullable and type-mismatch cases:
after calling importlib.util.spec_from_file_location("_pytm_model_tmp",
model_path) check that the returned spec is not None and that spec.loader is
present (raise a clear ImportError or ValueError with context if absent) before
calling spec.loader.exec_module(mod), and call
importlib.util.module_from_spec(spec) only after that check; for the docutils
container, replace direct uses of container.children with an explicit list
conversion (e.g., children = list(container.children)) or appropriate typing
cast so the code no longer needs "# type: ignore" and types are explicit.
| - A manifest crafted with a malicious ``dst:`` path could attempt to write | ||
| files outside the project root (path-traversal). dfetch mitigates this with | ||
| ``check_no_path_traversal()`` applied to every file copy. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Definition and call sites for check_no_path_traversal():"
rg -n -C2 '\bdef\s+check_no_path_traversal\b|\bcheck_no_path_traversal\s*\(' --type=py
echo
echo "Potential destination-path validation paths:"
rg -n -C2 '\bdst\b|destination|path_traversal' --type=pyRepository: dfetch-org/dfetch
Length of output: 50373
🏁 Script executed:
# Check Manifest.validate_destination() implementation
rg -n -A10 'def validate_destination' --type=pyRepository: dfetch-org/dfetch
Length of output: 883
🏁 Script executed:
# Check how _check_destination is called in the update flow
rg -n -B5 -A15 '_check_destination\(' dfetch/commands/update.py --type=pyRepository: dfetch-org/dfetch
Length of output: 2013
🏁 Script executed:
# Check where manifest destinations are actually used during copy/extract
rg -n -A5 'def _fetch_impl|def update' dfetch/project/gitsubproject.py dfetch/project/archivesubproject.py --type=py | head -80Repository: dfetch-org/dfetch
Length of output: 1253
Narrow the claim about path-traversal validation scope.
The implementation shows check_no_path_traversal() is called once per manifest entry via Update._check_destination() (lines 89, 117–128 in dfetch/commands/update.py) to validate the destination path before extraction starts. This is a single pre-check, not a check "applied to every file copy." Additionally, archive symlinks are validated separately post-extraction in dfetch/vcs/archive.py:430.
The current phrasing overstates coverage. Rewrite lines 75–77 to reflect that the destination path itself is validated against traversal before extraction, and symlinks are validated after:
- A manifest crafted with a malicious ``dst:`` path could attempt to write
files outside the project root (path-traversal). dfetch mitigates this by
validating the destination path via ``check_no_path_traversal()`` before
extraction and post-extraction symlink walks.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@doc/explanation/security.rst` around lines 75 - 77, Update the text to narrow
the claim: change the sentence to state that dfetch validates the destination
path via check_no_path_traversal() before extraction (as performed in
Update._check_destination()) and that archive symlinks are validated
post-extraction (see archive symlink checks in dfetch/vcs/archive.py), rather
than implying the check runs on every individual file copy.
4d8dbbf to
42243b6
Compare
security/threat_model.py: pytm-based threat model covering the full SDLC
(runtime, GitHub Actions CI/CD, PyPI distribution). Defines 6 trust
boundaries, 5 primary assets (PA-01..PA-05), 10 supporting assets
(SA-01..SA-10), 8 environmental assets (EA-01..EA-08), and 15 annotated
data flows with existing-control flags.
doc/explanation/security.rst: Sphinx RST page documenting the asset
register, trust boundaries, data flows, implemented controls table, and
known gaps/residual risks — aligned with CRA Article 13 and EN 18031.
doc/index.rst: wire security.rst into the Explanation toctree.
pyproject.toml: add pytm==1.3.1 as a new [security] optional dependency.
CHANGELOG.rst: record the new security documentation in 0.14.0 (unreleased).
https://claude.ai/code/session_01Rc28JtpAPWhJtA3YvS5kcr
Summary by CodeRabbit
Documentation
Chores