fix(specialists): add metadata-bearing VFS enumeration#61
Conversation
Add an optional librarian VFS enumerate hook for providers that can return property-bearing entries, split librarian source diagnostics by VFS path, and retry apiFallback when structured filters remove all VFS results. Co-Authored-By: Codex <noreply@openai.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d5f0f633d5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (hasFilters && options.vfs.enumerate) { | ||
| try { | ||
| const enumerated = await options.vfs.enumerate({ | ||
| roots: adapter.listRoots(types, filters), |
There was a problem hiding this comment.
Use default entity types when enumerate has no type filter
When a filtered query has no explicit/inferred type, requestedTypes is empty, but this call now passes that empty array to adapter.listRoots. In current adapters this can collapse the search space to nothing (e.g., Linear’s listRoots(types) is types.map(...)), so enumerate runs with roots: [] and valid queries like state:open team:ENG return no results even when matching entries exist. The prior list/search path used adapter.entityTypes in this case, so this is a behavior regression.
Useful? React with 👍 / 👎.
| entries = enumerated.map((entry) => ({ | ||
| entry, | ||
| enumerationType: adapter.inferEntityType(entry), | ||
| })); |
There was a problem hiding this comment.
Reapply requested-type filtering to enumerate results
Enumerated entries are now accepted directly without passing through toEnumerationEntry, so requested type constraints are no longer enforced by the engine. This matters for adapters where type is not in filterKeys (e.g., Linear), because matchesRequestedFilters won’t filter by type there; if an enumerate backend returns mixed entity kinds, type-scoped requests can include wrong entities. The list/search path previously enforced this via toEnumerationEntry.
Useful? React with 👍 / 👎.
…leakage, duplicate apiFallback
Three bugs surfaced by automated review (codex P1, codex P2, devin P1):
1. Empty `requestedTypes` collapsed `enumerate` roots to []
When the query supplied no `type:` filter (e.g. `state:open team:ENG`),
`requestedTypes` returned []. The new enumerate branch passed that
empty array straight into `adapter.listRoots(types, filters)`. Every
current adapter implements `listRoots` as `types.map(...)`, so the
result was `roots: []` and the property-bearing backend was queried
with no roots — guaranteed empty result for any non-type-scoped
filtered query. Mirrors the existing list-path's
`adapter.entityTypes` default by computing
`effectiveTypes = types.length > 0 ? types : adapter.entityTypes`
once and using it for both the roots arg and the type-constraint
enforcement below.
2. Type-constraint enforcement bypassed for enumerate path
Enumerated entries were accepted directly via `entries.map(entry =>
({entry, enumerationType: inferEntityType(entry)}))`, skipping
`toEnumerationEntry`. For adapters where `type` is NOT in
`filterKeys` (Linear), `matchesRequestedFilters` doesn't filter by
type — so an `enumerate` backend returning mixed entity kinds could
leak the wrong type past a `type:`-scoped query. Route enumerated
entries through `toEnumerationEntry(adapter, entry, effectiveTypes)`
like the list path already does. Same fix applied to the
apiFallback paths for consistency.
3. Duplicate apiFallback call when first attempt's entries fail filter
The post-filter-empty safety net at the bottom of the handler called
`loadFallbackEntries` with identical `instruction`, `text`,
`filters`, `types` even when the zero-entry gate above had already
tried apiFallback and gotten back entries that all failed the
post-filter. Track `apiFallbackAttempted` and short-circuit the
safety net when the fallback was already invoked for this turn.
Three regression tests added — one per issue. Coverage now 10 cases on
`librarian-engine.test.ts`. Full `@agent-assistant/specialists` suite
27/27 passing; typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
LibrarianVfs.enumeratefor providers with indexed, property-bearing enumeration backends.enumeratewhen available, while keeping existing list/search behavior for providers that do not implement it.Production failure
In production, sage's
github_specialist({ capability: 'enumerate', query: 'open PRs in AgentWorkforce/cloud', filters: { state: 'open' } })could return empty findings and eventually surface as a generic Slack failure. The root cause is that the cloud VFS provider enumerates through RelayFile'streeendpoint.TreeEntryincludespropertyCountbut not actualproperties, while the librarian engine filters entries by structured fields such asentry.properties.state. That meant list returned entries, every entry failed thestate:openpost-filter, the normal zero-entry apiFallback gate did not fire, and the specialist exhausted its internal harness.Design
This implements the senior-reviewed design by adding an optional metadata-bearing
enumeratemethod toLibrarianVfs. Providers with an indexed/property-aware backend can implement it and return entries withpropertiespopulated. The engine still defensively post-filters all returned entries, so implementations must not silently drop unsupported filters.The engine now prefers
vfs.enumerateonly when filters are active. Unfiltered requests continue through the existing list/search path, and providers withoutenumerateretain the previous list/search behavior. The apiFallback interface is unchanged.Breaking change
LibrarianSourceno longer has the broad'vfs'variant. It now reports more specific VFS sources:'vfs-list','vfs-enumerate', and'vfs-search', plus the existing'apiFallback'and'mixed'. Runtime JSON consumers still receive a string, but TypeScript consumers that exhaustively handle the union need to update their cases.Cloud follow-up
The cloud-side
relayfile-vfs-provider.tsshould plug into this by implementingenumerate({ roots, filters, limit })with a property-bearing backend, likely via RelayFile's filequeryendpoint instead of thetreeendpoint. It should preserve the documented filter semantics: OR within each filter key, AND across keys, and return entries that the specialists engine can re-filter defensively.Validation
npm installnpm test --workspace @agent-assistant/specialistsnpx tsc -p packages/specialists/tsconfig.json --noEmit