Fix production retrieval correctness by mohanagy · Pull Request #146 · mohanagy/graphify-ts

mohanagy · 2026-05-11T20:35:17Z

Summary

share default hard-ignore rules across legacy detect and SPI and add source-domain classification
make retrieval exclusions operational, fix negated test intent parsing, and prefer exact method anchors with truthful slice reasons
deepen slice-v1 runtime traversal, add bad-pack diagnostics, and lock behavior with production-correctness regressions

Verification

npm run typecheck
npm run build
npm run test:run

Summary by CodeRabbit

Changed
- Improved file indexing: VCS/build/cache directories are now excluded by default, while tests and fixtures remain available for context
- Smarter retrieval: Understands exclusion phrases like "exclude tests" and filters results accordingly
- Better code anchoring for method-level prompts and framework patterns
- Enhanced diagnostics with better visibility into context quality and domain coverage
- .graphifyignore patterns now support negation for fine-grained control

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai · 2026-05-11T20:35:31Z

📝 Walkthrough

Walkthrough

This PR introduces source-domain classification and intent-aware retrieval filtering. It adds a shared utility for classifying file paths into domains and detecting hard-ignored paths, extends contracts with domain metadata and diagnostic kinds, refactors discovery to use shared helpers, implements prompt exclusion detection, enriches retrieval with symbol-reference parsing and domain-aware filtering, and expands diagnostics to flag pollution and incorrect pack shapes.

Changes

Source-domain filtering and discovery

Layer / File(s)	Summary
Source Discovery Foundation `src/shared/source-discovery.ts`	New module classifying paths into `SourceDomain` (test/benchmark/fixture/generated/docs/config/build_artifact/production/unknown) and detecting hard-ignored VCS/cache/build directories with `.graphifyignore` pattern support.
Contract Updates `src/contracts/context-pack.ts`, `src/contracts/retrieval-gate.ts`, `src/contracts/context-pack-diagnostics.ts`	`ContextPackNode` adds optional `source_domain`; new `RetrievalExcludedDomain` type; `ContextPackDiagnosticKind` extends with domain/pollution/structural diagnostics; `ContextPackQualitySignals` adds `domain_distribution`, `excluded_domains`, `polluted_source_path_count`.
Discovery Pipeline Refactoring `src/pipeline/detect.ts`, `src/pipeline/spi/build.ts`	Delegates ignore-pattern loading and path filtering to shared helpers; removes local glob/regex matching and hardcoded `SKIP_DIRS`; stops scanning `graphify-out/memory`.
Retrieval Gate Exclusions `src/runtime/retrieval-gate.ts`	Implements `extractPromptExclusions()` to detect and map exclusion terms to domains/path hints; enriches `classifyRetrievalLevel` to emit structured `excluded_domains`, `excluded_terms`, `excluded_path_hints` signals; enhances `detectSymbols` to extract non-backtick references.
Symbol Reference Parsing `src/runtime/retrieve.ts` (symbol parsing section)	Introduces `SymbolReference` parsing, `symbolReferenceMatchScore()` matching, and domain-exclusion helpers to drive seed filtering with symbol-aware anchoring.
Retrieval Seed & Expansion Scoring `src/runtime/retrieve.ts` (scoring sections)	Extends seed/expansion structures with `sourceDomain`, `exactLabelMatch`, `literalPathMatch`; wires domain-based penalties and symbol-aware scoring through seed enumeration, expansion, unscored-node addition, and async retrieval using `classifySourceDomain` and exclusion signals.
Retrieval Slicing Policy `src/runtime/retrieve/slicing.ts`	Adds `effectivePolicy()` adjusting traversal based on method anchors and runtime prompts; updates `shouldSuppressNode()` to filter by domain/pollution/exclusions; threads `SliceOptions` through traversal; updates `sliceCandidatesForRetrieve()` signature to accept `options`.
Context Pack Candidate Scoring `src/runtime/context-pack.ts`	Classifies `source_domain` for each candidate; introduces `sourceDomainPenalty()` helper; adds domain-based score boost and penalty during value computation; replaces test-detection regex with domain classification.
Context Pack Diagnostics `src/runtime/context-pack-diagnostics.ts`	Extends `computeContextPackDiagnostics` with domain/pollution rule weights; computes `domain_distribution` and `polluted_source_path_count` in signals; emits warnings for excluded-domain selection, polluted paths, test-dominated packs, controller-only pipelines, missing anchors/pipelines, and missing structural evidence.
Documentation `CHANGELOG.md`, `README.md`	Documents stricter default discovery, soft-penalty behavior for tests/benchmarks, intent-aware exclusions, improved anchoring for `Class.method` prompts, and enhanced diagnostics.

Test coverage

Layer / File(s)	Summary
Discovery Tests `tests/unit/detect.test.ts`, `tests/unit/spi-build.test.ts`	Validates hard-ignore of worktrees/graphify-out/build outputs while indexing tests/benchmarks/fixtures; confirms test config files (`*.test.ts`, `jest.config.js`, etc.) are now included as code.
Retrieval Gate Tests `tests/unit/retrieval-gate.test.ts`	Adds tests for explicit `Class.method` symbol detection; validates exclusion detection mapping test/benchmark/fixture/reporter domains; confirms excluded terms suppress test/benchmark intent.
Context Pack Diagnostics Tests `tests/unit/context-pack-diagnostics.test.ts`	Validates excluded-domain-selected, polluted-source-path-selected, controller-only pipeline pack, missing method anchor, missing runtime pipeline, test-dominated pack, and missing structural evidence warnings.
Production Retrieval Correctness `tests/unit/retrieve-production-correctness.test.ts`	New end-to-end test file validating domain/term suppression, `Class.method` anchor promotion, anchor-reason distinctions, and absence of incorrect diagnostic warnings for healthy production slices.
Benchmark & Generate Tests `tests/unit/benchmark-quality.test.ts`, `tests/unit/generate.test.ts`	Benchmark uses isolated OS temp directories; generate.test.ts verifies saved memory notes under `graphify-out/memory` are not re-indexed.

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

mohanagy/graphify-ts#103: Extends retrieval-gate integration with new exclusion signals and deeper propagation through same retrieval plumbing.
mohanagy/graphify-ts#121: Both modify context-pack diagnostics and runtime areas that this PR builds upon for enriched signals and quality checks.
mohanagy/graphify-ts#144: Modifies same retrieval slicing and retrieve.ts plumbing for foundational slice recovery.

🐰 Domains dance through the pipeline's spine,
Exclusions whisper what to leave behind,
Test and prod paths now correctly aligned,
Context packs freed from pollution's bind! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 1.39% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title directly describes the main objective: fixing production retrieval correctness, which aligns with the primary focus on making retrieval exclusions operational, fixing intent parsing, and improving slice anchoring.
Description check	✅ Passed	The PR description covers all required template sections with specific details: summary explains the three main changes, verification lists the testing commands performed, and checklist items are implicitly addressed through the comprehensive testing and no secrets disclosure.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch retrieval-correctness

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (6)

src/runtime/context-pack-diagnostics.ts (2)
340-355: 💤 Low value

missingRuntimePipeline may produce false positives.

The function checks for both pipelineNodeCount === 0 and structuralRelations === 0. However, it already calls pipelinePrompt(pack) at line 354, which means it's checking a pipeline prompt. If the pack has pipeline nodes but no structural relations (or vice versa), it will still trigger. Consider whether this OR condition is intentional or if it should be AND.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/context-pack-diagnostics.ts` around lines 340 - 355, The
missingRuntimePipeline function currently uses an OR between pipelineNodeCount
=== 0 and structuralRelations === 0 which causes false positives when only one
of those is missing; update the return to require both to be missing by changing
the condition to pipelinePrompt(pack) && (pipelineNodeCount === 0 &&
structuralRelations === 0) so that missingRuntimePipeline only reports when
there are no pipeline nodes AND no structural relations; ensure you modify the
expression that references pipelineNodeCount and structuralRelations in
missingRuntimePipeline accordingly.
371-382: 💤 Low value

Fixed denominator may allow negative scores before clamping.

The fixed denominator of 10 means that if many new warnings trigger (total weight > 10), the raw score would go negative before the Math.max(0, ...) clamp. This is handled correctly, but consider adding a comment explaining the design choice and the expected behavior when total triggered weight exceeds the denominator.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/context-pack-diagnostics.ts` around lines 371 - 382, The
computeQualityScore function uses a fixed denominator totalWeight = 10 which
allows the raw score to go negative when triggeredWeight (accumulated from
RULE_WEIGHTS via the warnings loop) exceeds that denominator before the clamp;
add a concise comment above totalWeight (or at the top of computeQualityScore)
explaining that the denominator is intentionally fixed to preserve historical
comparability, that excess triggeredWeight will produce negative raw scores
which are then clamped to 0 by Math.max, and document the expected semantics
(new warnings reduce the numerator but don’t change the historical baseline),
referencing totalWeight, triggeredWeight, and RULE_WEIGHTS so future readers
understand the design choice.
src/runtime/retrieval-gate.ts (1)
75-76: 💤 Low value

Regex may miss some valid symbol patterns.

The SYMBOL_EXPLICIT_RE pattern requires the class name to start with uppercase ([A-Z]), which won't match lowercase class names or module-style references like utils.parseDate. Consider whether this is intentional or if the pattern should be more permissive.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/retrieval-gate.ts` around lines 75 - 76, The SYMBOL_EXPLICIT_RE
currently enforces an uppercase first character for the left-hand identifier
which will miss valid symbols like lowercase class names or module.function
references; update the regex used in SYMBOL_EXPLICIT_RE to allow an initial
lowercase letter and to accept dot-separated module paths (e.g., allow
[A-Za-z_$][A-Za-z0-9_$]* at the front and permit repeated
(?:\.[A-Za-z_$][A-Za-z0-9_$]*)? segments), while preserving existing support for
separators (., #, ::) and trailing parentheses; locate and update the
SYMBOL_EXPLICIT_RE declaration to be more permissive (no change needed to
STACK_TRACE_RE).
src/runtime/context-pack.ts (1)
863-866: 💤 Low value

Clarify the intent of domain boost for non-production sources.

The +0.25 boost for domains that are neither production nor unknown seems intentional—it rewards explicitly-tagged domains (test, benchmark, config, etc.) when they pass the penalty gate (i.e., when the task contract requests them). However, this logic is subtle and could benefit from a brief inline comment explaining the rationale.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/context-pack.ts` around lines 863 - 866, Add a brief inline
comment above the block that adjusts score based on view.source_domain (the if
checking view.source_domain !== 'production' && view.source_domain !==
'unknown') explaining the intent: that non-production explicit domains (e.g.,
test, benchmark, config) receive a +0.25 boost to reward explicitly-tagged
non-prod sources when the task contract permits them (after any penalty gate),
so future readers understand this subtle scoring choice; reference the variables
score, reasons, and the pushUnique call in the comment for clarity.
src/runtime/retrieve.ts (1)
741-781: 💤 Low value

Symbol matching score thresholds could benefit from documentation.

The symbolReferenceMatchScore function returns scores in the range 0-4 with specific thresholds (4, 3.5, 3, 2.5) for different match qualities. Consider adding a brief comment or JSDoc explaining the score semantics, as these values are used as gates elsewhere (e.g., symbolMatch >= 3 at line 1673, symbolMatch >= 3 at line 1982).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/retrieve.ts` around lines 741 - 781, The function
symbolReferenceMatchScore uses numeric thresholds (0, 2.5, 3, 3.5, 4) to
represent match quality but has no documentation; add a concise JSDoc above
symbolReferenceMatchScore that documents the meaning of each score (e.g., 4 =
exact match, 3.5 = method match with class in source, 3 = method match on
reference with class context, 2.5 = bare name match, 0 = no match) and note
where callers gate on values (e.g., consumers that compare symbolMatch >= 3
expect strong matches). Ensure the comment mentions the score range (0–4) and
intended semantic thresholds so future readers and maintainers understand why
callers check for >=3.
src/runtime/retrieve/slicing.ts (1)
203-209: 💤 Low value

Anchor reason cascade has a potential logic gap.

The reason assignment at lines 203-209 uses a cascade: exactLabelMatch → literalPathMatch → sourcePathMatch → default. However, sourcePathMatch could be true when both exact label AND source path match. The current order prioritizes exactLabelMatch which seems correct, but the intermediate literalPathMatch check (line 205) could mask cases where sourcePathMatch is more descriptive. Consider whether this is the intended precedence.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/runtime/retrieve/slicing.ts` around lines 203 - 209, The cascade
assigning reason can mis-prioritize source path matches because
node.literalPathMatch is checked before node.sourcePathMatch; decide the
intended precedence and update the ternary chain accordingly (e.g.,
exactLabelMatch → sourcePathMatch → literalPathMatch → default if source path
should be more descriptive), or explicitly document the chosen precedence with a
comment next to the reason assignment; update the expression that computes
reason (referencing the variable reason and properties node.exactLabelMatch,
node.literalPathMatch, node.sourcePathMatch) and add/adjust unit tests to cover
combinations where multiple flags are true to prevent regressions.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/runtime/retrieve.ts`:
- Line 1563: The code sets rootPath using graph.graph.root_path with
process.cwd() as a silent fallback (const rootPath = typeof
graph.graph.root_path === 'string' ? graph.graph.root_path : process.cwd()),
which can cause surprising behavior in non-standard working directories; update
the logic to detect when graph.graph.root_path is missing and emit a clear
warning log (e.g., via the existing logger used in this module) that indicates
the fallback to process.cwd() and recommends configuring root_path, and apply
the same change to the other occurrence noted (the line around 2118) so both
places log the fallback rather than silently using process.cwd().

In `@src/shared/source-discovery.ts`:
- Around line 157-174: The isDiscoveryPathIgnored function currently lets a
negated pattern flip a hard-ignore; change the loop logic so hard ignores from
isHardIgnoredPath(relativePath) cannot be overridden by negated patterns:
compute ignored = isHardIgnoredPath(relativePath) as now, but inside the
for-loop (over patterns) if matchesWorkspacePattern(relativePath, path, pattern)
and the matched pattern is a negation (pattern.startsWith('!')), skip applying
the negation when ignored is true due to isHardIgnoredPath; only allow
non-negated patterns to set ignored = true and allow negated patterns to clear
ignored when and only when the current ignored state was set by a non-hard
ignore. Ensure references to isDiscoveryPathIgnored, isHardIgnoredPath,
matchesWorkspacePattern and the pattern/negated logic are updated accordingly.

---

Nitpick comments:
In `@src/runtime/context-pack-diagnostics.ts`:
- Around line 340-355: The missingRuntimePipeline function currently uses an OR
between pipelineNodeCount === 0 and structuralRelations === 0 which causes false
positives when only one of those is missing; update the return to require both
to be missing by changing the condition to pipelinePrompt(pack) &&
(pipelineNodeCount === 0 && structuralRelations === 0) so that
missingRuntimePipeline only reports when there are no pipeline nodes AND no
structural relations; ensure you modify the expression that references
pipelineNodeCount and structuralRelations in missingRuntimePipeline accordingly.
- Around line 371-382: The computeQualityScore function uses a fixed denominator
totalWeight = 10 which allows the raw score to go negative when triggeredWeight
(accumulated from RULE_WEIGHTS via the warnings loop) exceeds that denominator
before the clamp; add a concise comment above totalWeight (or at the top of
computeQualityScore) explaining that the denominator is intentionally fixed to
preserve historical comparability, that excess triggeredWeight will produce
negative raw scores which are then clamped to 0 by Math.max, and document the
expected semantics (new warnings reduce the numerator but don’t change the
historical baseline), referencing totalWeight, triggeredWeight, and RULE_WEIGHTS
so future readers understand the design choice.

In `@src/runtime/context-pack.ts`:
- Around line 863-866: Add a brief inline comment above the block that adjusts
score based on view.source_domain (the if checking view.source_domain !==
'production' && view.source_domain !== 'unknown') explaining the intent: that
non-production explicit domains (e.g., test, benchmark, config) receive a +0.25
boost to reward explicitly-tagged non-prod sources when the task contract
permits them (after any penalty gate), so future readers understand this subtle
scoring choice; reference the variables score, reasons, and the pushUnique call
in the comment for clarity.

In `@src/runtime/retrieval-gate.ts`:
- Around line 75-76: The SYMBOL_EXPLICIT_RE currently enforces an uppercase
first character for the left-hand identifier which will miss valid symbols like
lowercase class names or module.function references; update the regex used in
SYMBOL_EXPLICIT_RE to allow an initial lowercase letter and to accept
dot-separated module paths (e.g., allow [A-Za-z_$][A-Za-z0-9_$]* at the front
and permit repeated (?:\.[A-Za-z_$][A-Za-z0-9_$]*)? segments), while preserving
existing support for separators (., #, ::) and trailing parentheses; locate and
update the SYMBOL_EXPLICIT_RE declaration to be more permissive (no change
needed to STACK_TRACE_RE).

In `@src/runtime/retrieve.ts`:
- Around line 741-781: The function symbolReferenceMatchScore uses numeric
thresholds (0, 2.5, 3, 3.5, 4) to represent match quality but has no
documentation; add a concise JSDoc above symbolReferenceMatchScore that
documents the meaning of each score (e.g., 4 = exact match, 3.5 = method match
with class in source, 3 = method match on reference with class context, 2.5 =
bare name match, 0 = no match) and note where callers gate on values (e.g.,
consumers that compare symbolMatch >= 3 expect strong matches). Ensure the
comment mentions the score range (0–4) and intended semantic thresholds so
future readers and maintainers understand why callers check for >=3.

In `@src/runtime/retrieve/slicing.ts`:
- Around line 203-209: The cascade assigning reason can mis-prioritize source
path matches because node.literalPathMatch is checked before
node.sourcePathMatch; decide the intended precedence and update the ternary
chain accordingly (e.g., exactLabelMatch → sourcePathMatch → literalPathMatch →
default if source path should be more descriptive), or explicitly document the
chosen precedence with a comment next to the reason assignment; update the
expression that computes reason (referencing the variable reason and properties
node.exactLabelMatch, node.literalPathMatch, node.sourcePathMatch) and
add/adjust unit tests to cover combinations where multiple flags are true to
prevent regressions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6683a487-0d60-4861-9a0f-ffbf42ec8331

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4e486 and 64b1ee2.

📒 Files selected for processing (20)

CHANGELOG.md
README.md
src/contracts/context-pack-diagnostics.ts
src/contracts/context-pack.ts
src/contracts/retrieval-gate.ts
src/pipeline/detect.ts
src/pipeline/spi/build.ts
src/runtime/context-pack-diagnostics.ts
src/runtime/context-pack.ts
src/runtime/retrieval-gate.ts
src/runtime/retrieve.ts
src/runtime/retrieve/slicing.ts
src/shared/source-discovery.ts
tests/unit/benchmark-quality.test.ts
tests/unit/context-pack-diagnostics.test.ts
tests/unit/detect.test.ts
tests/unit/generate.test.ts
tests/unit/retrieval-gate.test.ts
tests/unit/retrieve-production-correctness.test.ts
tests/unit/spi-build.test.ts

coderabbitai · 2026-05-11T20:41:10Z

  const { question, budget } = options
  const questionTokens = tokenizeQuestion(question)
-  const rootPath = typeof graph.graph.root_path === 'string' ? graph.graph.root_path : undefined
+  const rootPath = typeof graph.graph.root_path === 'string' ? graph.graph.root_path : process.cwd()


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

process.cwd() fallback for rootPath may behave unexpectedly in some environments.

Using process.cwd() as a fallback when graph.graph.root_path is unset could lead to inconsistent behavior if the process working directory differs from the repository root (e.g., when run from a subdirectory or in CI environments with non-standard working directories). Consider documenting this behavior or adding a warning log when the fallback is used.

Also applies to: 2118-2118

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/runtime/retrieve.ts` at line 1563, The code sets rootPath using graph.graph.root_path with process.cwd() as a silent fallback (const rootPath = typeof graph.graph.root_path === 'string' ? graph.graph.root_path : process.cwd()), which can cause surprising behavior in non-standard working directories; update the logic to detect when graph.graph.root_path is missing and emit a clear warning log (e.g., via the existing logger used in this module) that indicates the fallback to process.cwd() and recommends configuring root_path, and apply the same change to the other occurrence noted (the line around 2118) so both places log the fallback rather than silently using process.cwd().

coderabbitai · 2026-05-11T20:41:10Z

+function globToRegExp(pattern: string): RegExp {
+  const wildcardCount = [...pattern].filter((character) => character === '*').length
+  if (pattern.length > 512 || wildcardCount > 32) {
+    return /^$/
+  }
+
+  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&')
+  const wildcarded = escaped.replace(/\*/g, '.*').replace(/\?/g, '.')
+  return new RegExp(`^${wildcarded}$`)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve glob segment semantics in .graphifyignore matching.

globToRegExp() currently maps both * and ** to .*. That means patterns like src/*.ts also match src/nested/file.ts, while **/*.ts misses index.ts at the workspace root. The new ignore support will silently include/exclude the wrong files for common gitignore-style patterns.

Possible fix direction

function globToRegExp(pattern: string): RegExp { const wildcardCount = [...pattern].filter((character) => character === '*').length if (pattern.length > 512 || wildcardCount > 32) { return /^$/ } const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&') - const wildcarded = escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + const wildcarded = escaped + .replace(/\*\*\//g, '(?:.*/)?') + .replace(/\*\*/g, '.*') + .replace(/\*/g, '[^/]*') + .replace(/\?/g, '[^/]') return new RegExp(`^${wildcarded}$`) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

function globToRegExp(pattern: string): RegExp {

const wildcardCount = [...pattern].filter((character) => character === '*').length

if (pattern.length > 512 || wildcardCount > 32) {

return /^$/

}

const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&')

const wildcarded = escaped.replace(/\*/g, '.*').replace(/\?/g, '.')

return new RegExp(`^${wildcarded}$`)

function globToRegExp(pattern: string): RegExp {

const wildcardCount = [...pattern].filter((character) => character === '*').length

if (pattern.length > 512 || wildcardCount > 32) {

return /^$/

}

const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&')

const wildcarded = escaped

.replace(/\*\*\//g, '(?:.*/)?')

.replace(/\*\*/g, '.*')

.replace(/\*/g, '[^/]*')

.replace(/\?/g, '[^/]')

return new RegExp(`^${wildcarded}$`)

}

🧰 Tools

🪛 ast-grep (0.42.2)

[warning] 98-98: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(^${wildcarded}$)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

coderabbitai · 2026-05-11T20:41:10Z

+export function isDiscoveryPathIgnored(path: string, root: string, patterns: readonly string[]): boolean {
+  const relativePath = relativeWorkspacePath(path, root)
+  if (!relativePath) {
+    return false
+  }
+
+  let ignored = isHardIgnoredPath(relativePath)
+  for (const rawPattern of patterns) {
+    const pattern = rawPattern.trim()
+    if (!pattern) {
+      continue
+    }
+    const negated = pattern.startsWith('!')
+    if (matchesWorkspacePattern(relativePath, path, pattern)) {
+      ignored = !negated
+    }
+  }
+  return ignored


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep hard ignores non-overridable.

A negated .graphifyignore entry can currently flip ignored back to false for graphify-out, node_modules, and other hard-ignored paths. That defeats the new hard-ignore contract and reopens generated artifacts to indexing.

Suggested change

export function isDiscoveryPathIgnored(path: string, root: string, patterns: readonly string[]): boolean { const relativePath = relativeWorkspacePath(path, root) if (!relativePath) { return false } - let ignored = isHardIgnoredPath(relativePath) + if (isHardIgnoredPath(relativePath)) { + return true + } + + let ignored = false for (const rawPattern of patterns) { const pattern = rawPattern.trim() if (!pattern) { continue }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/shared/source-discovery.ts` around lines 157 - 174, The isDiscoveryPathIgnored function currently lets a negated pattern flip a hard-ignore; change the loop logic so hard ignores from isHardIgnoredPath(relativePath) cannot be overridden by negated patterns: compute ignored = isHardIgnoredPath(relativePath) as now, but inside the for-loop (over patterns) if matchesWorkspacePattern(relativePath, path, pattern) and the matched pattern is a negation (pattern.startsWith('!')), skip applying the negation when ignored is true due to isHardIgnoredPath; only allow non-negated patterns to set ignored = true and allow negated patterns to clear ignored when and only when the current ignored state was set by a non-hard ignore. Ensure references to isDiscoveryPathIgnored, isHardIgnoredPath, matchesWorkspacePattern and the pattern/negated logic are updated accordingly.

Fix retrieval correctness for production prompts

64b1ee2

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix production retrieval correctness#146

Fix production retrieval correctness#146
mohanagy wants to merge 1 commit into
mainfrom
retrieval-correctness

mohanagy commented May 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 11, 2026 •

edited

Loading

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 11, 2026

Uh oh!

coderabbitai Bot May 11, 2026

Uh oh!

coderabbitai Bot May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mohanagy commented May 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated Code Review Effort

Possibly Related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mohanagy commented May 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 11, 2026 •

edited

Loading