Skip to content

feat(specialists): add NotionLibrarian (mirrors LinearLibrarian)#62

Merged
khaliqgant merged 2 commits intomainfrom
feat/notion-librarian
Apr 26, 2026
Merged

feat(specialists): add NotionLibrarian (mirrors LinearLibrarian)#62
khaliqgant merged 2 commits intomainfrom
feat/notion-librarian

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

@khaliqgant khaliqgant commented Apr 26, 2026

Summary

Adds NotionLibrarian to @agent-assistant/specialists, mirroring the existing LinearLibrarian shape.

  • New packages/specialists/src/notion/types.ts
  • New packages/specialists/src/notion/librarian.ts with NotionLibrarianAdapter + createNotionLibrarian factory
  • New packages/specialists/src/notion/index.ts
  • Re-exported from root packages/specialists/src/index.ts
  • Tests covering filter behaviour, apiFallback wiring, and the post-filter-empty safety net

Why

Production: Slack DM asking sage about Notion content returns "I could not complete that request right now." The agent card lists only github/linear/pr_investigation skills. Cloud has no Notion specialist to wire because the upstream librarian/adapter doesn't exist.

Follow-up (separate PR)

After this lands and @agent-assistant/specialists republishes, the cloud-side workflow fix-notion-specialist-apifallback.ts builds notion-api-client, notion-api-fallback, /api/v1/notion/query proxy, and the notion-specialist-agentic wiring.

Validation

  • specialists workspace tests green (33/33)
  • tsc clean
  • mirrors LinearLibrarian (1:1 structural symmetry)

🤖 Generated with Claude Code


Open in Devin Review

Production trigger: cloud has no Notion specialist at all (the agent
card advertises only pr_investigation, github.enumerate, linear.enumerate).
Slack DM "do you see any info about our investors in notion?" returned
the canned "could not complete" reply because there's no specialist
for sage to delegate to.

This is the upstream half. Cloud-side wiring (notion-api-client +
notion-api-fallback + /api/v1/notion/query proxy + notion-specialist-agentic)
lands in fix-notion-specialist-apifallback workflow against the cloud repo
after this PR is merged and republished.

Mirrors LinearLibrarian shape: NotionLibrarianAdapter + createNotionLibrarian
factory + types + tests. No GitHubLibrarian-style investigator (Notion
doesn't have a clear investigation analog yet — add later if needed).

Tests + typecheck green (33/33 specialists tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df92261785

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

const filters = params.filters ?? {};

for (const key of NOTION_FILTER_KEYS) {
for (const value of filters[key] ?? []) parts.push(`${key}:${value}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve spaces when serializing Notion filter values

enumerateNotion() builds an instruction string by appending raw key:value tokens, but values are not quoted/escaped. Any multi-word filter (for example database:Product Roadmap or title:Launch Plan) is split by whitespace during parsing, so only the first word is retained as the filter and the rest is treated as free text. This causes common Notion filters to miss valid entries and also sends incorrect filters to apiFallback.

Useful? React with 👍 / 👎.

…seQuery roundtrip

Codex P1 review on PR #62: enumerateNotion built `key:value` tokens by
direct concatenation, but the shared parseQuery splits the resulting
instruction on /\s+/. Any whitespace-bearing value (e.g.
`database:Product Roadmap`) parsed as filter `database:Product` (truncated)
plus stray text `Roadmap` — silently mangling the filter so it neither
matched the intended Notion content nor reached the apiFallback in the
expected shape.

Fix: in buildEnumerationInstruction, append whitespace-bearing values as
bare text tokens instead of as `key:value`. The engine's
inferEnumerationFilters can then re-extract them from text via its
existing pattern matchers, OR the apiFallback can use the natural-language
text as additional context. Single-word values continue to use `key:value`.

Single regression test asserting enumerateNotion completes cleanly with
multi-word `database` and `title` filters and dispatches a VFS call —
proving the mangled-token codepath is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.

Open in Devin Review

Comment on lines +138 to +139
if (/\s/.test(value)) {
parts.push(value);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Multi-word filter values silently dropped by buildEnumerationInstruction

When enumerateNotion receives filter values containing whitespace (e.g., { database: ['Product Roadmap'], title: ['Launch Plan'] }), buildEnumerationInstruction pushes them as bare text without a key prefix (line 139: parts.push(value)). The instruction becomes "investor info Product Roadmap Launch Plan". Downstream, inferEnumerationFilters can only recover filter values via either key:value tokens (handled by inferExplicitFilters) or natural-language cue words like database "..." or title "...". Since the bare text has no cue word prefix, none of the cue patterns at librarian.ts:161-178 fire, and the filters are silently lost — the query degrades to an unfiltered text search.

The comment on lines 134–137 claims inferEnumerationFilters can "pattern-match them downstream," but the cue-based regex patterns require the key name to precede the value (e.g., /\bdatabase\s+.../), which bare text lacks.

Suggested approach

Instead of pushing bare text, emit the filter value with its key as a cue-word prefix and quote the value, e.g., database "Product Roadmap". The existing cue patterns in inferEnumerationFilters already support quoted values: /\bdatabase\s+(?:"([^"]+)"|...)/i would capture Product Roadmap correctly from database "Product Roadmap".

Suggested change
if (/\s/.test(value)) {
parts.push(value);
if (/\s/.test(value)) {
parts.push(`${key} "${value}"`);
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +337 to +342
const match = /\/notion\/(pages|databases|blocks|comments)\/[^/]+(?:\.json)?$/i.exec(path);
if (match?.[1] === 'pages') return 'page';
if (match?.[1] === 'databases') return 'database';
if (match?.[1] === 'blocks') return 'block';
if (match?.[1] === 'comments') return 'comment';
return undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Case-insensitive regex with case-sensitive comparison in collectionItemTypeFromPath

The regex at librarian.ts:337 uses the i flag (/\/notion\/(pages|databases|blocks|comments)\/[^/]+(?:\.json)?$/i), making it match paths with non-lowercase collection segments (e.g., /notion/Pages/foo.json). However, the captured group match?.[1] preserves the original case from the input, and the subsequent comparisons use strict lowercase equality (match?.[1] === 'pages'). If a path like /notion/Pages/foo.json is encountered, the regex matches but the comparison fails, returning undefined instead of 'page'. The Linear (librarian.ts:294) and GitHub (librarian.ts:197) adapters correctly omit the i flag, making the Notion adapter inconsistent.

Suggested change
const match = /\/notion\/(pages|databases|blocks|comments)\/[^/]+(?:\.json)?$/i.exec(path);
if (match?.[1] === 'pages') return 'page';
if (match?.[1] === 'databases') return 'database';
if (match?.[1] === 'blocks') return 'block';
if (match?.[1] === 'comments') return 'comment';
return undefined;
const match = /\/notion\/(pages|databases|blocks|comments)\/[^/]+(?:\.json)?$/i.exec(path);
const collection = match?.[1]?.toLowerCase();
if (collection === 'pages') return 'page';
if (collection === 'databases') return 'database';
if (collection === 'blocks') return 'block';
if (collection === 'comments') return 'comment';
return undefined;
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@khaliqgant khaliqgant merged commit 180ddd0 into main Apr 26, 2026
1 check passed
@khaliqgant khaliqgant deleted the feat/notion-librarian branch April 26, 2026 07:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant