Merged
6 changes: 3 additions & 3 deletions docs/dogfooding_findings_tracker.md
@@ -24,7 +24,7 @@ Status taxonomy:
| D5 | 2026-05-07 Session 5 dogfood (FINDING-1) | set operator partial-match semantics | partial-dict `expected` items canonicalised as different elements from full extractor records — false positive on `includes_*` / `subset_of` / `superset_of`, **false negative (CI bypass) on `excludes_all`** | **解決** | PR #65 (CSCI-35c) — Match Schema partial-record matching + flat-projection aliases + `evidence.matched`; schema_version v4→v5 |
| D6 | 2026-05-28 real-PR complexity dogfood (FINDING-F1) | vacuous PASS (extractor coverage gap) — **重複・関連 = sibling of D4** | nested function bodies are excluded from `ComplexityEntry` by `python_complexity_extractor` spec (`api_surface` parity); refactor that nests outer-function body into nested helpers reports large CC drop while real complexity is unchanged | **未解決** | Candidate paths: (a) `docs/target_yaml_guide.md` new Hazard 4 + `ADVISORY-D6` detector mirroring D4; (b) extractor spec change to emit nested-function entries (long-term, schema-impacting). Reproduction: langgraph PR #3700 (8/1 vacuous PASS in real-PR pass) |
| D7 | 2026-05-28 real-PR complexity dogfood (FINDING-F2) | authoring mismatch (operator / constraint pairing) | `extract-method` refactor is mathematically guaranteed to **micro-increase cyclomatic** (each extracted function adds base 1), even with `_` prefix discipline and api_surface preserved. Cognitive is the metric that drops. Authors declaring `complexity_delta.cyclomatic ≤ 0` for extract-method refactors hit a structural false-FAIL | **未解決** | Candidate paths: (a) authoring guide section "Choosing complexity metric per refactor pattern" recommending `cognitive_delta` for extract-method; (b) future `ADVISORY-D7` detector emitted when a `change.primary_kind=refactor` target uses `cyclomatic_delta ≤ 0` and the diff matches extract-method shape. Low priority: this is authoring advice, not a CI integrity hazard |
-| D8 | 2026-06-07 scale + security dogfood (SCA gap) | SCA sensor dependency-source discovery gap | SSP SCA auto-discovery (`_requirements_file` in `src/semantic_ci_code/cli/commands/ssp.py`) only finds `requirements.txt` at repo root; the `--locked` fallback only accepts `pylock.toml` / requirements lockfiles. PEP 621 pyproject-only projects (litellm) and `pdm.lock` projects (pdm) declare deps in unrecognised formats → `pip-audit --locked .` errors "no lockfiles found" → empty JSON → adapter degrades to `unknown` (exit 3). Correct graceful degradation (no silent false PASS, honours `unknown > fail > pass`) but a real usability gap that blocks SCA on most modern Python projects | **未解決** | Candidate fix: extend `_requirements_file` / the pip-audit adapter to recognise PEP 621 `pyproject.toml`, `poetry.lock`, and `pdm.lock` (e.g. `pip-audit` against the resolved env or a generated lock). Genuine fixable `semantic-ci` defect, not an inherent sensor limitation |
+| D8 | 2026-06-07 scale + security dogfood (SCA gap) | SCA sensor dependency-source discovery gap | SSP SCA auto-discovery (`_requirements_file` in `src/semantic_ci_code/cli/commands/ssp.py`) only found `requirements.txt` at repo root; the `--locked` fallback only accepted `pylock.toml` / requirements lockfiles. PEP 621 pyproject-only projects (litellm) and `pdm.lock` projects (pdm) declared deps in unrecognised formats → `pip-audit --locked .` errors "no lockfiles found" → empty JSON → adapter degraded to `unknown` (exit 3). Correct graceful degradation (no silent false PASS, honours `unknown > fail > pass`) but a real usability gap that blocked SCA on most modern Python projects | **解決** | CSCI-55: dependency source discovery now recognises `requirements.txt`, `pylock.toml` / `pylock.*.toml`, `uv.lock`, `pdm.lock`, `poetry.lock`, and static PEP 621 `[project].dependencies`; lock sources are converted deterministically to pinned temp requirements, optional/non-default-group/marker-inactive packages are filtered, and malformed recognized sources fail closed to SSP `unknown` |
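
The D7 arithmetic in the table above can be checked with a rough counter. This is a sketch only, not the project's `python_complexity_extractor`: it assumes a simplified McCabe-style count (1 per function plus 1 per branching node) built on the standard `ast` module, and the `handle` / `_dispatch` functions are hypothetical samples.

```python
import ast

# Simplified set of branching constructs; boolean operators are deliberately ignored.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_per_function(source: str) -> dict[str, int]:
    """Approximate cyclomatic complexity (1 + branching nodes) per top-level function."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            result[node.name] = 1 + branches
    return result


# Hypothetical function before an extract-method refactor.
BEFORE = '''
def handle(items):
    for item in items:
        if item.ok:
            if item.urgent:
                send(item)
'''

# Same logic after extracting the inner checks into a private helper.
AFTER = '''
def handle(items):
    for item in items:
        _dispatch(item)

def _dispatch(item):
    if item.ok:
        if item.urgent:
            send(item)
'''

before_total = sum(cyclomatic_per_function(BEFORE).values())
after_total = sum(cyclomatic_per_function(AFTER).values())
print(before_total, after_total)  # the extracted helper contributes its own base 1
```

Under these assumptions the branch count is unchanged and the total rises by exactly the extra base of the new helper, which is why a `cyclomatic ≤ 0` constraint fails structurally for this refactor shape.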

## Reading order

@@ -37,8 +37,8 @@ Status taxonomy:
## Classification at a glance

- **重複・関連 pairs**: D4 ↔ D6 (both are "vacuous PASS" via extractor coverage gap, distinct mechanism — D4 is "diff outside Python scope", D6 is "diff inside scope but inside nested function")
-- **解決 (5 of 8)**: D1, D2, D3, D4, D5
-- **未解決 (3 of 8)**: D6 (mitigation path open), D7 (authoring advice, low priority), D8 (SCA discovery gap, fixable defect)
+- **解決 (6 of 8)**: D1, D2, D3, D4, D5, D8
+- **未解決 (2 of 8)**: D6 (mitigation path open), D7 (authoring advice, low priority)
- **observation-only (not a D#)**: F6 (pattern-SAST logic-vuln blindspot) — **UNTESTED HYPOTHESIS, not a demonstrated observation in the 2026-06-07 pass**: the Semgrep registry rulesets returned HTTP 403, so Semgrep ran with 0 loaded rules over 0 paths and produced no valid SAST measurement. F6 records the *a-priori* expectation that deterministic SAST misses semantic / business-logic vulns, cross-linked to Phase H (`docs/llm_sensor_adapter_planning.md`) as **motivation** — it is **not** empirically validated by this pass. Recorded in `docs/dogfooding_scale_and_security.md` (which now carries a validity warning + repro note for redoing the SAST sub-pass under a network policy allowing `semgrep.dev`). Distinct from the demonstrated observations of the same pass: real vulns merged-then-fixed (git evidence) and SCA clean-on-litellm (pip-audit positive-controlled with `jinja2==2.11.2` → 5 CVEs)

## Source pass index
16 changes: 12 additions & 4 deletions docs/dogfooding_scale_and_security.md
@@ -197,6 +197,13 @@ found"* → empty JSON → the adapter degrades to `unknown`. This is
(`unknown > fail > pass`) — there is **no silent false PASS** — but it is
a real usability gap (registered as a D# below).
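
The `unknown > fail > pass` precedence can be illustrated with a small ordering. This is a minimal sketch under stated assumptions: `Verdict` and `combine` are hypothetical names, not identifiers from `semantic-ci`, and treating an empty input as `UNKNOWN` is this sketch's own choice.

```python
from enum import IntEnum


class Verdict(IntEnum):
    """Hypothetical verdicts ordered so that a higher value takes precedence."""
    PASS = 0
    FAIL = 1
    UNKNOWN = 2


def combine(verdicts):
    """Return the highest-precedence verdict; no verdicts at all counts as UNKNOWN."""
    return max(verdicts, default=Verdict.UNKNOWN)


print(combine([Verdict.PASS, Verdict.UNKNOWN, Verdict.FAIL]).name)
```

With this ordering a degraded sensor can never be masked by a passing one, which is the "no silent false PASS" property described above.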

+**Resolution.** CSCI-55 closes D8 by replacing the `requirements.txt`-only
+lookup with deterministic dependency-source discovery for `requirements.txt`,
+`pylock.toml`, `uv.lock`, `pdm.lock`, `poetry.lock`, and static PEP 621
+`[project].dependencies`. Lock sources are converted to pinned temporary
+requirements, optional/non-default-group/marker-inactive packages are filtered,
+and malformed recognized sources fail closed to SSP `unknown`.

## Headline / conclusion

**Scale & robustness.** 16 runs total (5 scale + 5 random + 3 litellm
@@ -372,10 +379,11 @@ Findings classification (hazard D# vs observation) for this pass and all
prior passes is consolidated in
**`docs/dogfooding_findings_tracker.md`**. This pass registered:

-- **D8** (SCA auto-discovery gap, 未解決): `_requirements_file` ignores
-PEP 621 pyproject / `poetry.lock` / `pdm.lock`, so modern dependency
-declarations degrade to `unknown`. A genuine, fixable `semantic-ci`
-defect.
+- **D8** (SCA auto-discovery gap, resolved by CSCI-55): SSP SCA now recognises
+PEP 621 pyproject / `uv.lock` / `poetry.lock` / `pdm.lock` dependency sources,
+translating lockfiles to deterministic pinned temporary requirements, filtering
+optional/non-default-group/marker-inactive packages, and keeping malformed
+recognized sources fail-closed as `unknown`.
- **F6** (SAST logic-vuln blindspot) — **UNTESTED HYPOTHESIS in this
pass**, not a demonstrated observation: the Semgrep registry rulesets
returned HTTP 403, so Semgrep never ran with real rules (0 rules / 0
25 changes: 21 additions & 4 deletions docs/ssp_usage_guide.md
@@ -50,10 +50,27 @@ semantic-ci ssp scan \
--candidate-dir /tmp/candidate
```

-If `requirements.txt` exists in a directory, it is passed to pip-audit
-via `--requirement`. Otherwise pip-audit audits the project directory
-directly (using `--locked` when supported, or the directory path as
-fallback).
+Dependency source discovery is deterministic and independent for the baseline
+and candidate directories:
+
+| Priority | Root file | Handling | pip-audit argv |
+|---:|---|---|---|
+| 1 | `requirements.txt` | Existing behavior. | `--requirement <path>` |
+| 2 | `pylock.toml` / `pylock.*.toml` | Locked project scan. | `--locked <dir>` when supported, otherwise `<dir>` |
+| 3 | `uv.lock` | Translate pinned packages to a temporary requirements file. | `--requirement <tmp>` + `--no-deps` |
+| 4 | `pdm.lock` | Translate pinned packages to a temporary requirements file. | `--requirement <tmp>` + `--no-deps` |
+| 5 | `poetry.lock` | Translate pinned packages to a temporary requirements file. | `--requirement <tmp>` + `--no-deps` |
+| 6 | `pyproject.toml` with static `[project].dependencies` | Copy dependency specifiers into a temporary requirements file. | `--requirement <tmp>` |
+| 7 | No recognized source | Preserve fallback behavior. | `--locked <dir>` when supported, otherwise `<dir>` |
+
+Malformed recognized dependency sources fail closed as a pip-audit sensor error,
+which produces SSP `unknown` rather than silently falling back to a lower
+priority source. Lockfile translation skips optional packages and packages whose
+environment markers do not apply to the current scan environment. It also keeps
+only default/main dependency groups when lock metadata exposes package groups, so
+docs/test/dev-only packages are not audited as production dependencies.
+Unsupported markers or malformed group metadata fail closed instead of being
+guessed.

### 3. Fixture mode (no scanner required)

10 changes: 3 additions & 7 deletions src/semantic_ci_code/cli/commands/ssp.py
@@ -15,6 +15,7 @@
)
from semantic_ci_code.cli.exit_codes import ENGINE_ERROR, FAIL, SUCCESS
from semantic_ci_code.cli.output.json_formatter import dump_json
+from semantic_ci_code.ssp.adapters.dependency_sources import discover_dependency_source
from semantic_ci_code.ssp.adapters.pip_audit import PipAuditAdapter
from semantic_ci_code.ssp.adapters.semgrep import SemgrepAdapter
from semantic_ci_code.ssp.delta import compute_delta
@@ -69,11 +70,11 @@ def _scan_envelope(args: Namespace) -> SSPEnvelope:
         )
     elif sensor == "pip-audit":
         baseline_result = PipAuditAdapter().scan(
-            requirements=_requirements_file(baseline_dir),
+            source=discover_dependency_source(baseline_dir),
             repo_root=baseline_dir,
         )
         candidate_result = PipAuditAdapter().scan(
-            requirements=_requirements_file(candidate_dir),
+            source=discover_dependency_source(candidate_dir),
             repo_root=candidate_dir,
         )
     else:
else:
@@ -169,8 +170,3 @@ def _existing_file(path: Path, *, label: str) -> Path:
 def _resolve_package_root(root: Path, package_root: Path) -> Path:
     candidate = package_root if package_root.is_absolute() else root / package_root
     return _existing_dir(candidate, label="--package-root")
-
-
-def _requirements_file(root: Path) -> Path | None:
-    path = root / "requirements.txt"
-    return path if path.exists() else None