Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
179 changes: 173 additions & 6 deletions src/agentready/assessors/stub_assessors.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,121 @@ def attribute(self) -> Attribute:
default_weight=0.10,
)

# Directories to exclude from recursive lock file searches
_EXCLUDED_DIRS = {
"vendor", "node_modules", ".venv", "venv",
"__pycache__", ".git", ".terraform",
}

# Package manifests that indicate a repo actually uses dependencies
_DEPENDENCY_MANIFESTS = [
"go.mod", "package.json", "pyproject.toml", "setup.py",
"setup.cfg", "Gemfile", "Cargo.toml", "requirements.txt",
"Pipfile", "pom.xml", "build.gradle", "build.gradle.kts",
"composer.json", "mix.exs", "pubspec.yaml",
"Dockerfile", "Containerfile",
]
Comment on lines +52 to +59
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don’t treat Dockerfiles as lock-file dependency manifests.

Line 58 makes Dockerfile-only/container-only repos enter the “has manifests but no lock files” failure path, but this assessor’s remediation has no Docker lock-file action and container_setup is explicitly separate at Line 980. This can reintroduce false positives for config/container repos.

🛠️ Proposed fix
     _DEPENDENCY_MANIFESTS = [
         "go.mod", "package.json", "pyproject.toml", "setup.py",
         "setup.cfg", "Gemfile", "Cargo.toml", "requirements.txt",
         "Pipfile", "pom.xml", "build.gradle", "build.gradle.kts",
-        "composer.json", "mix.exs", "pubspec.yaml",
-        "Dockerfile", "Containerfile",
+        "composer.json", "mix.exs", "pubspec.yaml",
     ]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Package manifests that indicate a repo actually uses dependencies
_DEPENDENCY_MANIFESTS = [
"go.mod", "package.json", "pyproject.toml", "setup.py",
"setup.cfg", "Gemfile", "Cargo.toml", "requirements.txt",
"Pipfile", "pom.xml", "build.gradle", "build.gradle.kts",
"composer.json", "mix.exs", "pubspec.yaml",
"Dockerfile", "Containerfile",
]
# Package manifests that indicate a repo actually uses dependencies
_DEPENDENCY_MANIFESTS = [
"go.mod", "package.json", "pyproject.toml", "setup.py",
"setup.cfg", "Gemfile", "Cargo.toml", "requirements.txt",
"Pipfile", "pom.xml", "build.gradle", "build.gradle.kts",
"composer.json", "mix.exs", "pubspec.yaml",
]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agentready/assessors/stub_assessors.py` around lines 52 - 59, The
_DEPENDENCY_MANIFESTS list should not include container image files because
Dockerfile/Containerfile are not package lock manifests and cause container-only
repos to hit the “has manifests but no lock files” failure path; remove
"Dockerfile" and "Containerfile" from the _DEPENDENCY_MANIFESTS constant (used
in stub_assessors.py) so that container repos are handled by the separate
container_setup flow (referenced as container_setup) instead of being treated
like package-manifest repos.


def _rglob_filtered(self, root: Path, filename: str) -> list[Path]:
"""Recursively search for filename, excluding common non-source dirs."""
matches = []
for match in root.rglob(filename):
if not any(part in self._EXCLUDED_DIRS for part in match.parts):
matches.append(match)
return matches
Comment on lines +61 to +67
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's read the actual file to verify the code matches the review snippet
cat -n src/agentready/assessors/stub_assessors.py | sed -n '55,75p'

Repository: ambient-code/agentready

Length of output: 1171


🏁 Script executed:

# Check the _EXCLUDED_DIRS definition
grep -n "_EXCLUDED_DIRS" src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 172


🏁 Script executed:

# Search for tests related to this method
fd -t f "test.*\.py" | xargs grep -l "_rglob_filtered\|stub_assessors" 2>/dev/null | head -5

Repository: ambient-code/agentready

Length of output: 100


🏁 Script executed:

# Check the broader context and imports in the file
head -30 src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 965


🏁 Script executed:

# Get the full _EXCLUDED_DIRS definition
sed -n '47,60p' src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 588


🏁 Script executed:

# Check the test file for this method
cat tests/unit/test_assessors_stub.py | grep -A 20 -B 5 "_rglob_filtered"

Repository: ambient-code/agentready

Length of output: 49


🏁 Script executed:

# Verify if os module is already used elsewhere in the file
grep -n "^import os\|^from os" src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 49


🏁 Script executed:

# Let's read the entire test file to understand the test coverage
wc -l tests/unit/test_assessors_stub.py

Repository: ambient-code/agentready

Length of output: 105


🏁 Script executed:

# Check if there are any tests that indirectly test _rglob_filtered through _has_any_dependency_manifest
grep -n "_has_any_dependency_manifest\|DependencyPinningAssessor" tests/unit/test_assessors_stub.py | head -20

Repository: ambient-code/agentready

Length of output: 1131


🏁 Script executed:

# Let's also check if there are any other usages of rglob in the codebase
rg "\.rglob\(" --type py

Repository: ambient-code/agentready

Length of output: 1783


🏁 Script executed:

# Check the full context around the method to understand its importance
sed -n '40,80p' src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 1815


Optimize directory traversal to avoid walking excluded directories.

The current implementation uses Path.rglob() to traverse all directories, then filters out matches from excluded paths after traversal completes. On large repositories with deep vendor/, node_modules/, .terraform/, and other excluded directory trees, this wastes significant time traversing directories that will be discarded anyway.

Use os.walk() with in-place dirnames modification to prune excluded directories before traversal:

Proposed traversal fix
     def _rglob_filtered(self, root: Path, filename: str) -> list[Path]:
         """Recursively search for filename, excluding common non-source dirs."""
+        import os
+
         matches = []
-        for match in root.rglob(filename):
-            if not any(part in self._EXCLUDED_DIRS for part in match.parts):
-                matches.append(match)
+        for dirpath, dirnames, filenames in os.walk(root):
+            dirnames[:] = [
+                dirname
+                for dirname in dirnames
+                if dirname not in self._EXCLUDED_DIRS
+            ]
+            if filename in filenames:
+                matches.append(Path(dirpath) / filename)
         return matches
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agentready/assessors/stub_assessors.py` around lines 61 - 67, The
_rglob_filtered function currently uses Path.rglob and then filters results,
causing wasted traversal; replace the implementation to use os.walk on the
provided root and prune excluded directories by modifying the dirnames list
in-place (using self._EXCLUDED_DIRS) so those trees are never descended into,
and collect matching Path objects for files whose name equals the filename
parameter (convert os.walk root+file to Path and append to matches). Ensure you
preserve the return type list[Path] and that the function still honors the same
exclusion set stored in self._EXCLUDED_DIRS.


def _has_any_dependency_manifest(self, repository: Repository) -> bool:
"""Check whether the repo uses any package manager at all."""
for manifest in self._DEPENDENCY_MANIFESTS:
if (repository.path / manifest).exists():
return True
if self._rglob_filtered(repository.path, manifest):
return True
return False

def _score_terraform_constraints(self, repository: Repository) -> Finding | None:
"""Score Terraform repos based on versions.tf provider constraints."""
import re

tf_version_files = self._rglob_filtered(repository.path, "versions.tf")
tf_lock_files = self._rglob_filtered(
repository.path, ".terraform.lock.hcl"
)

if not tf_version_files and not tf_lock_files:
return None

if tf_lock_files:
rel_paths = [
str(p.relative_to(repository.path)) for p in tf_lock_files
]
return Finding(
attribute=self.attribute,
status="pass",
score=100.0,
measured_value=", ".join(rel_paths),
threshold="lock file with pinned versions",
evidence=[
f"Found Terraform lock file(s): {', '.join(rel_paths)}"
],
remediation=None,
error_message=None,
)

pinned = 0
ranged = 0
for vf in tf_version_files:
try:
content = vf.read_text()
for m in re.finditer(
r'version\s*=\s*"([^"]+)"', content
):
ver = m.group(1)
if any(op in ver for op in [">", "<", "~", "!="]):
ranged += 1
else:
pinned += 1
Comment on lines +112 to +119
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Demonstrate that the current regex matches Terraform required_version.
# Expect: the first output includes both ">= 1.5.0" and "6.0.0"; the second includes only "6.0.0".
python - <<'PY'
import re

content = '''
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "6.0.0"
    }
  }
}
'''

print(re.findall(r'version\s*=\s*"([^"]+)"', content))
print(re.findall(r'\bversion\b\s*=\s*"([^"]+)"', content))
PY

Repository: ambient-code/agentready

Length of output: 98


🏁 Script executed:

cat -n src/agentready/assessors/stub_assessors.py | sed -n '100,130p'

Repository: ambient-code/agentready

Length of output: 1219


🏁 Script executed:

# Get more context around the issue
rg -B 20 -A 10 'for m in re.finditer' src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 1106


🏁 Script executed:

# Check how ranged/pinned are used after counting
rg -A 20 'ranged \+= 1' src/agentready/assessors/stub_assessors.py

Repository: ambient-code/agentready

Length of output: 708


Fix regex to match provider version key only, not required_version.

The current regex incorrectly captures required_version = ">= ..." alongside provider version pins. A Terraform configuration with required_version = ">= 1.5.0" and exact provider pins (e.g., version = "6.0.0") will be scored as mixed/ranged and fail (score < 75%) instead of pass (100%), causing false negatives on correctly pinned provider versions.

Use word boundaries to isolate the version key:

Regex fix
                 for m in re.finditer(
-                    r'version\s*=\s*"([^"]+)"', content
+                    r'\bversion\b\s*=\s*"([^"]+)"', content
                 ):
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for m in re.finditer(
r'version\s*=\s*"([^"]+)"', content
):
ver = m.group(1)
if any(op in ver for op in [">", "<", "~", "!="]):
ranged += 1
else:
pinned += 1
for m in re.finditer(
r'\bversion\b\s*=\s*"([^"]+)"', content
):
ver = m.group(1)
if any(op in ver for op in [">", "<", "~", "!="]):
ranged += 1
else:
pinned += 1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agentready/assessors/stub_assessors.py` around lines 112 - 119, The regex
in the re.finditer call currently matches any occurrence of version inside
content and unintentionally captures required_version; update that pattern to
only match the standalone key "version" (e.g. using a word-boundary or explicit
key match) so the loop that sets ver and increments ranged/pinned (variables
ver, ranged, pinned inside the re.finditer block) only processes provider
version pins; keep the rest of the logic (the any(op in ver...) check and
counters) unchanged.

except OSError:
pass

total = pinned + ranged
if total == 0:
return None

score = (pinned / total) * 100 if pinned > 0 else 35.0
score = max(score, 35.0)
status = "pass" if score >= 75 else "fail"

return Finding(
attribute=self.attribute,
status=status,
score=score,
measured_value=f"{len(tf_version_files)} versions.tf files",
threshold="lock file with pinned versions",
evidence=[
f"Found {len(tf_version_files)} versions.tf file(s) "
f"with provider constraints",
f"{pinned} exact pins, {ranged} range constraints",
],
remediation=Remediation(
summary="Pin Terraform providers to exact versions",
steps=[
"Run 'terraform providers lock' to generate "
".terraform.lock.hcl",
"Or pin exact versions in versions.tf "
'(e.g., version = "6.0.0" not ">= 6.0")',
],
tools=["terraform"],
commands=[
"terraform providers lock "
"-platform=linux_amd64",
],
examples=[],
citations=[],
) if status == "fail" else None,
error_message=None,
)

def assess(self, repository: Repository) -> Finding:
"""Check for dependency lock files and validate version pinning quality."""
# Language-specific lock files (auto-managed, always have exact versions)
Expand All @@ -56,15 +171,58 @@ def assess(self, repository: Repository) -> Finding:
"Cargo.lock", # Rust
"Gemfile.lock", # Ruby
"go.sum", # Go
".terraform.lock.hcl", # Terraform
]

# Manual lock files (need validation)
manual_lock_files = ["requirements.txt"] # Python pip

found_strict = [f for f in strict_lock_files if (repository.path / f).exists()]
found_manual = [f for f in manual_lock_files if (repository.path / f).exists()]
# 1. Check root-level lock files
found_strict = [
f for f in strict_lock_files
if (repository.path / f).exists()
]
found_manual = [
f for f in manual_lock_files
if (repository.path / f).exists()
]

# 2. Recursive fallback for multi-module repos (e.g. Go workspaces,
# monorepos with per-package lock files)
if not found_strict:
for f in strict_lock_files:
matches = self._rglob_filtered(repository.path, f)
if matches:
found_strict.extend(
str(m.relative_to(repository.path))
for m in matches
)

if not found_manual and not found_strict:
for f in manual_lock_files:
matches = self._rglob_filtered(repository.path, f)
if matches:
found_manual.extend(
str(m.relative_to(repository.path))
for m in matches
)

# 3. If nothing found, check for special ecosystems and edge cases
if not found_strict and not found_manual:
# 3a. Terraform repos: score based on versions.tf constraints
tf_finding = self._score_terraform_constraints(repository)
if tf_finding is not None:
return tf_finding

# 3b. Repos with no dependency manifests at all → not applicable
if not self._has_any_dependency_manifest(repository):
return Finding.not_applicable(
self.attribute,
reason="No dependency manifests found "
"(no package manager in use)",
)

# 3c. Has manifests but no lock files → fail
return Finding(
attribute=self.attribute,
status="fail",
Expand All @@ -75,15 +233,24 @@ def assess(self, repository: Repository) -> Finding:
remediation=Remediation(
summary="Add lock file for dependency reproducibility",
steps=[
"For npm: run 'npm install' (generates package-lock.json)",
"For Python: use 'pip freeze > requirements.txt' or poetry",
"For Ruby: run 'bundle install' (generates Gemfile.lock)",
"For Go: run 'go mod tidy' (generates go.sum)",
"For npm: run 'npm install' "
"(generates package-lock.json)",
"For Python: use 'pip freeze > requirements.txt'"
" or poetry",
"For Ruby: run 'bundle install' "
"(generates Gemfile.lock)",
"For Terraform: run 'terraform providers lock'"
" (generates .terraform.lock.hcl)",
],
tools=["npm", "pip", "poetry", "bundler"],
tools=["go", "npm", "pip", "poetry", "bundler",
"terraform"],
commands=[
"go mod tidy # Go",
"npm install # npm",
"pip freeze > requirements.txt # Python",
"poetry lock # Python with Poetry",
"terraform providers lock # Terraform",
],
examples=[],
citations=[],
Expand Down
Loading