Add repository package discovery script #89

Open
AntTheLimey wants to merge 2 commits into main from feat/repo-discovery-script

AntTheLimey (Member) commented May 12, 2026

Summary

  • Docker-based tool that queries pgEdge DNF (including noarch) and APT repositories
  • Diffs discovered packages against catalog.json and reports NEW, MATCH, MISSING, EXCLUDED
  • Interactive walkthrough for adding new packages to the catalog (category, name, description)
  • Shell glob exclusion patterns for non-user-facing packages (debuginfo, devel, llvmjit, etc.)
  • Self-test suite with 5 checks, no Docker required
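
The glob-based exclusion step can be pictured roughly as follows. This is a minimal sketch, not the script's actual code: `matches_any_glob` is a hypothetical helper, and the package names and patterns are illustrative placeholders rather than the contents of `scripts/excluded-packages.txt`.

```bash
# Hypothetical helper: succeeds if the package name ($1) matches any of the
# glob patterns passed as the remaining arguments.
matches_any_glob() {
    local pkg="$1"
    shift
    local pattern
    for pattern in "$@"; do
        # An unquoted variable in a case pattern is interpreted as a glob.
        case "$pkg" in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

# Placeholder names and patterns, for illustration only.
matches_any_glob "example-pkg-debuginfo" "*-debuginfo" "*-devel" && echo "EXCLUDED"
matches_any_glob "example-pkg" "*-debuginfo" "*-devel" || echo "kept"
```

Using `case` keeps the match inside the shell itself, so no external tool is needed per package.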

Usage

# Discover and diff (no interactive walkthrough)
./scripts/discover-repo-packages.sh --discover-only

# Full interactive mode
./scripts/discover-repo-packages.sh

# Run self-tests
./scripts/discover-repo-packages.sh --self-test

Test plan

  • Run ./scripts/discover-repo-packages.sh --self-test — all 5 checks pass
  • Run ./scripts/discover-repo-packages.sh --discover-only — exits 0, shows MATCH/EXCLUDED with no unexpected NEW
  • Run ./scripts/discover-repo-packages.sh --discover-only --verbose — excluded packages listed individually
  • Verify scripts/output/ is gitignored and not committed

Summary by CodeRabbit

  • Chores
    • Added automated package discovery script supporting EL and DEB repositories.
    • Introduced package exclusion configuration for filtering.
    • Updated gitignore to exclude script output directory.
    • Added test fixtures for package catalog validation and repository discovery.

Review Change Stack

Docker-based tool that queries pgEdge DNF and APT repositories,
diffs discovered packages against catalog.json, and interactively
walks through adding new packages to the catalog.

Features:
- Queries both pgedge and pgedge-noarch EL repos + DEB repo
- Shell glob exclusion patterns for non-user-facing packages
- PG-version detection (standalone vs versioned)
- Meta-package membership from DEB deps (authoritative source)
- Interactive walkthrough with category/name/description prompts
- Self-test suite (5 checks, no Docker required)
- Flags: --discover-only, --verbose, --el-only, --deb-only

Run: ./scripts/discover-repo-packages.sh --discover-only
Test: ./scripts/discover-repo-packages.sh --self-test
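
As a rough illustration of the diff step, the sketch below classifies names from two lists as NEW, MATCH, or MISSING. It assumes both inputs are newline-separated package names and leaves out EXCLUDED handling; `classify_packages` and the sample names are hypothetical and do not come from the script.

```bash
# Hypothetical helper: $1 = discovered names, $2 = catalog names,
# both newline-separated.
classify_packages() {
    local discovered="$1" catalog="$2" name
    # Names the repositories returned: MATCH if cataloged, otherwise NEW.
    printf '%s\n' "$discovered" | while IFS= read -r name; do
        [ -n "$name" ] || continue
        if printf '%s\n' "$catalog" | grep -Fxq -- "$name"; then
            echo "MATCH $name"
        else
            echo "NEW $name"
        fi
    done
    # Catalog names the repositories did not return are MISSING.
    printf '%s\n' "$catalog" | while IFS= read -r name; do
        [ -n "$name" ] || continue
        printf '%s\n' "$discovered" | grep -Fxq -- "$name" || echo "MISSING $name"
    done
}

# Placeholder data: prints "NEW alpha", "MATCH beta", "MISSING gamma".
classify_packages "$(printf 'alpha\nbeta\n')" "$(printf 'beta\ngamma\n')"
```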

coderabbitai Bot commented May 12, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fbe02978-a42f-48aa-8249-80d2dbbf34af

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

codacy-production Bot commented May 12, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying pgedge-docs with Cloudflare Pages

Latest commit: 3c542e9
Status: ✅  Deploy successful!
Preview URL: https://d746cd2a.pgedge-docs.pages.dev
Branch Preview URL: https://feat-repo-discovery-script.pgedge-docs.pages.dev

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
scripts/discover-repo-packages.sh (3)

1027-1027: Stale TODO marker — confirm intent before merging.

`# Additional tests added in later tasks` reads like a hand-off note from PR drafting. If no follow-up tests are planned for this PR, remove it; otherwise capture the work in a tracking issue so it doesn't linger.

Would you like me to open a follow-up issue describing the remaining self-test coverage (e.g., catalog version-expansion, exclusion-glob edge cases) so this comment can be removed?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/discover-repo-packages.sh` at line 1027, The inline TODO comment "#
Additional tests added in later tasks" is stale—either remove it or replace it
with a tracked work item; confirm whether more tests will be added for self-test
coverage (catalog version-expansion, exclusion-glob edge cases) and if not
delete the line from scripts/discover-repo-packages.sh, otherwise create a
follow-up issue and replace the comment with the issue number or URL so the
intent is captured and won't linger; look for the exact string "# Additional
tests added in later tasks" to locate and update the comment.

668-669: 💤 Low value

SC2188: use `: > file` instead of a bare redirection.

`> "$file"` with no command works in bash but is flagged by ShellCheck and behaves differently in some other shells (zsh, for example, runs its `NULLCMD` instead of simply truncating). Replace it with the standard `: > "$file"` idiom in both discover_el and discover_deb.

♻️ Proposed fix (apply at both locations)
-    > "$output_dir/rocky9-packages.txt"
-    > "$output_dir/rocky9-metapkg-deps.txt"
+    : > "$output_dir/rocky9-packages.txt"
+    : > "$output_dir/rocky9-metapkg-deps.txt"
-    > "$output_dir/debian12-packages.txt"
-    > "$output_dir/debian12-metapkg-deps.txt"
+    : > "$output_dir/debian12-packages.txt"
+    : > "$output_dir/debian12-metapkg-deps.txt"

Also applies to: 717-718
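
A small, self-contained demonstration of the `: > file` idiom, for anyone who wants to check the behavior before applying the fix. `reset_file` is a hypothetical wrapper used only for this sketch; the script itself would use the redirection inline.

```bash
# Hypothetical helper: create the file if absent, empty it if present.
reset_file() {
    # ':' is the shell's built-in no-op; the redirection does the work.
    : > "$1"
}

tmp_file="$(mktemp)"
printf 'stale line\n' > "$tmp_file"
reset_file "$tmp_file"

# -s is true only for non-empty files, so this prints "empty".
[ -s "$tmp_file" ] || echo "empty"
rm -f "$tmp_file"
```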

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/discover-repo-packages.sh` around lines 668 - 669, The script uses
bare redirections like > "$output_dir/rocky9-packages.txt" (and similarly for
"$output_dir/rocky9-metapkg-deps.txt") inside the discover_el and discover_deb
logic, which ShellCheck SC2188 flags as non-portable; replace each bare
redirection with the portable no-op truncation form : >
"$output_dir/rocky9-packages.txt" and : > "$output_dir/rocky9-metapkg-deps.txt"
(and the analogous two occurrences around lines noted for discover_deb) so files
are created/truncated portably.

458-458: ⚡ Quick win

Hardcoded 18 will silently regress when a new PG major ships.

detect_pg_versions already enumerates ("16" "17" "18") (lines 89, 160), the metapkg repoquery/apt-cache depends calls target …_18 / …-18 directly, and the {ver} expansion in interactive_walkthrough uses 18 as the canonical resolution. The moment PG 19 lands, all three places will need to be updated in lockstep or membership detection will quietly miss the new variants.

Centralize this — for example a single DEFAULT_PG_VERSION constant near the top (sourced from default_pg_version in catalog.json if available), and reference it in discover_el, discover_deb, and interactive_walkthrough.

Also applies to: 660-662, 709-711
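
One way the centralization could look is sketched below. It assumes the default is simply the highest supported major; reading `default_pg_version` from catalog.json is omitted to keep the example dependency-free. `latest_pg_version`, `resolve_ver`, and the `example-ext_{ver}` name are hypothetical.

```bash
# Hypothetical helper: print the highest PG major among its arguments.
latest_pg_version() {
    printf '%s\n' "$@" | sort -n | tail -n 1
}

# Assumed version list, mirroring the ("16" "17" "18") enumeration noted above.
DEFAULT_PG_VERSION="$(latest_pg_version 16 17 18)"

# Hypothetical helper: resolve every {ver} placeholder to the default major.
resolve_ver() {
    printf '%s\n' "$1" | sed "s/{ver}/${DEFAULT_PG_VERSION}/g"
}

resolve_ver "example-ext_{ver}"   # prints example-ext_18
```

With this shape, adding a new major means touching only the version list; every `{ver}` resolution follows automatically.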

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/discover-repo-packages.sh` at line 458, Replace the hardcoded "18" by
introducing a single DEFAULT_PG_VERSION variable near the top (set it from
catalog.json's default_pg_version if present, otherwise derive the latest entry
from detect_pg_versions), then update usage sites: change the rpm_name expansion
that sets check_name (where rpm_name//\{ver\}/18), the metapkg
repoquery/apt-cache depends constructions in discover_el and discover_deb, and
the {ver} resolution in interactive_walkthrough to use ${DEFAULT_PG_VERSION};
keep detect_pg_versions unchanged as the authoritative list but reference
DEFAULT_PG_VERSION for canonical resolution so new PG majors don’t require
multiple edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/discover-repo-packages.sh`:
- Around line 202-211: The pkg_in_metapkg function (and the same loose check
used in interactive_walkthrough for in_full/in_minimal classification) currently
treats any dependency containing $pkg as a match, causing false positives;
change the matching logic to only accept exact tokens or clearly delimited
variants (e.g., match "$dep" == "$pkg" or "$dep" == "${pkg}_*" or split dep
tokens and compare each token exactly) so versioned or suffixed names like
pkg_... are matched intentionally and not by arbitrary substring containment;
update pkg_in_metapkg and the corresponding checks in interactive_walkthrough to
use this stricter comparison.
- Around line 53-58: The load_exclusions loop currently uses line="${line// /}"
which removes all internal spaces and ignores tabs, breaking patterns with
leading tabs; change the trimming logic in load_exclusions so you only strip
leading and trailing whitespace (spaces and tabs) and preserve internal spaces
when populating EXCLUSION_PATTERNS. Replace the single global-space removal with
a two-step trim using POSIX parameter expansion (remove leading whitespace, then
trailing whitespace) before the empty-check and appending to EXCLUSION_PATTERNS;
keep the existing comment-stripping step (line="${line%%#*}") and the rest of
the loop unchanged so is_excluded can match correctly.
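
The trimming described for load_exclusions could be sketched like this, using only parameter expansion. `trim_pattern` is a hypothetical stand-in for the loop body, and the sample line is illustrative.

```bash
# Hypothetical helper: strip a trailing comment, then leading and trailing
# whitespace (spaces and tabs), leaving internal spaces untouched.
trim_pattern() {
    local line="$1"
    line="${line%%#*}"                       # drop everything from the first '#'
    line="${line#"${line%%[![:space:]]*}"}"  # remove leading whitespace
    line="${line%"${line##*[![:space:]]}"}"  # remove trailing whitespace
    printf '%s' "$line"
}

# Prints [*-debuginfo]; the brackets make any stray whitespace visible.
printf '[%s]\n' "$(trim_pattern "   *-debuginfo   # debug symbols")"
```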

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: aac0f51a-6f47-41ac-9b44-7a05e09693fa

📥 Commits

Reviewing files that changed from the base of the PR and between 63d9be1 and df6ba61.

📒 Files selected for processing (10)
  • .gitignore
  • scripts/discover-repo-packages.sh
  • scripts/excluded-packages.txt
  • scripts/test-fixtures/catalog-test.json
  • scripts/test-fixtures/deb-metapkg-deps.txt
  • scripts/test-fixtures/deb-packages.txt
  • scripts/test-fixtures/el-metapkg-deps.txt
  • scripts/test-fixtures/el-packages.txt
  • scripts/test-fixtures/excluded-test.txt
  • scripts/test-fixtures/expected-output.json

Comment thread scripts/discover-repo-packages.sh
Comment thread scripts/discover-repo-packages.sh
- load_exclusions: replace global space removal with POSIX trim that
  handles leading/trailing spaces and tabs without stripping internal
  spaces from patterns
- pkg_in_metapkg: remove substring fallback that caused false
  positives (e.g. pgedge-pg matching pgedge-pgadmin4), use exact
  match only
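
The difference between substring and exact matching can be reproduced with a short sketch. It assumes the dependency list is available as newline-separated names; `dep_list_contains` is hypothetical and `example-dep` is a placeholder.

```bash
# Hypothetical helper: succeed only if $1 equals one dependency exactly.
# $2 is a newline-separated dependency list (an assumed format).
dep_list_contains() {
    # -F: literal string, -x: whole-line match, -q: exit status only.
    printf '%s\n' "$2" | grep -Fxq -- "$1"
}

deps="$(printf 'pgedge-pgadmin4\nexample-dep\n')"

# A substring test would match here; the exact test does not.
dep_list_contains "pgedge-pg" "$deps" || echo "no match for pgedge-pg"
dep_list_contains "pgedge-pgadmin4" "$deps" && echo "exact match for pgedge-pgadmin4"
```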
@AntTheLimey AntTheLimey marked this pull request as ready for review May 13, 2026 14:42
@AntTheLimey AntTheLimey requested review from dpage and susan-pgedge May 13, 2026 14:43