fix: trust structured mention uids without cache validation#218
fix: trust structured mention uids without cache validation#218Jerry-Xin merged 3 commits intodmwork-org:developfrom
Conversation
|
Good. I have everything I need. Let me compile the review. 🔥 Code Review: PR #218 —
|
lml2468
left a comment
There was a problem hiding this comment.
Reviewed the diff. Core change is clean and well-reasoned.
Core change: remove validUids filtering ✅
Traced all three call sites (actions.ts:192, channel.ts:427, inbound.ts:1528) — the simplification is consistent. The rationale in mention-utils.ts:129-133 is correct: server-side rejection of unknown uids is a better failure mode than client-side cache-miss dropping mentions entirely. Tests updated accordingly and the new test case (actions.test.ts:333) correctly validates the intent.
Inbound text fallback — agreeing with @Jerry-Xin's analysis
Both issues identified are valid:
-
Dead code path (problem 1): confirmed via code trace.
uidToNameMapis populated bygetGroupMembersFromCache()+refreshGroupMemberCache()+ a few narrow paths (reply sender, DM user). There's no code path that writescredentials.robot_id → botNameinto the map. Unless the bot happens to appear in the group member list response,uidToNameMap.get(botUid)returnsundefinedand the entire fallback block is skipped. Suggest either: (a) seed the map on startup with the bot's own name from afetchUserInfocall, or (b) document explicitly that this fallback only fires when bot is in the member list. -
Regex CJK boundary (problem 2): the new tests in
mention-utils.test.tsshow@BotName你好correctly doesn't match (lookahead handles it), but the lookbehind[^\\w]doesn't exclude CJK —你好@BotNamewould match. Jerry-Xin's suggested fix (add CJK ranges to lookbehind) is the right approach. Worth applying even if the scenario is rare in practice.
Suggestion: consider splitting the fallback into a follow-up PR once the botUid → botName seeding issue is resolved. The core validUids removal is ready to merge; the fallback adds complexity that isn't functional yet.
lml2468
left a comment
There was a problem hiding this comment.
Code Review — dmwork-adapters#218
Step 1 — Code Quality: PASS
+113/−39 across 7 files. Clean simplification of mention handling.
Core change (validUids removal): Excellent
- Consistent removal across all 3 call sites (
actions.ts:192,channel.ts:427,inbound.ts:1528). - The rationale is correct: server-side rejection of unknown uids is a strictly better failure mode than client-side cache-miss silently dropping mentions. This fixes a real production issue (bot-to-bot mentions failing on cold cache).
- Tests updated to verify all structured mentions now generate entities.
Inbound text fallback: Functional but has known issues
Agreeing with @lml2468's analysis:
- The
uidToNameMap.get(botUid)dead-code path — bot uid is rarely present in the group member cache map, so this fallback rarely activates in practice. - CJK lookbehind gap —
[^\\w]doesn't exclude CJK, so你好@BotNamewould be a false positive. The lookahead correctly handles CJK-after (@BotName你好→ no match), but the lookbehind needs CJK ranges added for symmetry.
@lml2468's suggestion to split the fallback into a follow-up is sound — the core validUids removal is the high-value fix and is independently correct. The fallback can be hardened (seed botUid→botName on init, fix lookbehind) in a focused PR.
Approving based on the core change quality. Fallback issues are non-blocking for merge but should be tracked.
Step 2 — Language Compliance: PASS
PR title, description, commit messages all in English.
|
Addressed review feedback: Issue 1 (fallback dead code) → Tracked separately as #220. The core fix (removing validUids filter) is correct and ships in this PR. The fallback at best is a no-op; fixing it requires populating uidToNameMap with the bot's own uid→name during registration, which is a follow-up. Issue 2 (CJK false positive regex) → Fixed in commit 76b1005. The lookbehind now excludes CJK and extended Latin ranges ( |
|
Now I have full context. Here is my review. Code Review: PR #218 — trust structured mention uids without cache validation总体评价:
|
| 级别 | 数量 | 关键项 |
|---|---|---|
| 🔴 Critical | 1 | Test Block 1 正则与生产代码不一致 |
| 🟡 Warning | 3 | 注释不准确、CJK 漏检 trade-off 未文档化、空名防御 |
| 🔵 Suggestion | 3 | 重复测试、正则缓存、hallucinated UID 风险说明 |
建议: 修复 Critical 项后可以合并。Warning 项可以在后续 PR 中处理。
✅ Code Review: PR #218 —
|
- Remove validUids filtering from convertStructuredMentions() - Add left-boundary lookbehind to inbound text fallback regex - Add 10 regex boundary unit tests - Clean up stale validUids from all call sites and tests - 610/610 tests passing
- Lookbehind now excludes CJK and extended Latin ranges in addition to \w, preventing false positive matches like '你好@botName' - Aligns boundary logic with MENTION_PATTERN in mention-utils.ts - Add 7 unit tests covering CJK boundary, underscore, suffix word chars Refs dmwork-org#220
- Remove old test block with stale regex (lookbehind lacked CJK exclusion) - Consolidate into single test block with 13 cases matching production regex - Fix comment: 'more conservative than MENTION_PATTERN' (not 'consistent with') - Add trade-off comment: CJK-adjacent false negative is intentional - Guard botName with .trim() to prevent weak regex from whitespace-only names - Remove duplicate test case 'should generate entities for known uid' - Add hallucinated UID risk note in mention-utils.ts comment Refs dmwork-org#220
d11b606 to
d9fe732
Compare
Jerry-Xin
left a comment
There was a problem hiding this comment.
Well-scoped fix for a real cache-miss race condition in bot-to-bot mentions, with solid test coverage.
✅ Highlights
-
Root cause is correct: The
validUidsfilter usinguidToNameMap.keys()was a cache-dependent gate. On first message beforerefreshGroupMemberCachepopulates the map, structured mentions with valid UIDs would be silently dropped. Removing this filter is the right fix — structured mentions are already LLM-generated from the system prompt's member list, so the uid is almost certainly valid. -
Defense-in-depth for inbound: The text fallback at
inbound.ts:1225-1241is a good safety net for bot-to-bot messages wherepayload.mentionmay be absent. The conservative regex (excluding CJK lookbehind) correctly favors false negatives over false positives for bot activation. -
Test coverage is thorough: The fallback regex tests (
mention-utils.test.ts:686-751) cover both positive and negative boundary cases well — email-like patterns, CJK adjacency, domain-like suffixes, and word-char lookahead. -
Consistent across all three call sites:
actions.ts:192,channel.ts:427, andinbound.ts:1565all have thevalidUidsparameter removed consistently.
💬 Non-blocking
-
🔵 Suggestion — Duplicated regex between
inbound.tsand test (inbound.ts:1234,mention-utils.test.ts:690-694)The fallback regex is constructed identically in both
inbound.tsandbuildFallbackRegex()in the test file. If either copy drifts, the tests become meaningless. Consider extracting abuildMentionFallbackRegex(botName: string): RegExpfunction intomention-utils.tsand importing it from both places. The test comment "Must mirror the regex in inbound.ts" acknowledges the risk but doesn't prevent it. -
🔵 Suggestion —
new RegExpper inbound message (inbound.ts:1234)The fallback regex is compiled on every inbound group message where
!isMentioned. For high-traffic groups this is a per-message allocation. It's not a performance problem today (the regex is simple, the string is short), but if bot names are stable per session, a cached regex keyed onbotUidwould be trivial. Low priority — just noting it. -
🔵 Suggestion — Comment length in
convertStructuredMentions(mention-utils.ts:129-136)The 7-line comment explaining why validation was removed is useful context for this PR, but is long relative to the surrounding code. The first sentence ("Always generate entity for structured mentions") plus the key insight ("filtering here causes cache-miss false negatives") would suffice. The rest reads more like PR description than code comment.
-
🟡 Warning —
botName?.trim()empty-string check (inbound.ts:1227)uidToNameMap.get(botUid)could return a whitespace-only string if the member cache is populated with bad data, which.trim()catches. Good. However, ifbotNamecontains regex-special characters (e.g., a display name likeC++Botor(Test)), theescapeRegExpcall at line 1228 handles it correctly. No issue — just confirming I checked.
Summary
The change is clean, well-reasoned, and correctly addresses the cache-miss race condition described in #217. The tradeoff (trusting LLM-generated UIDs instead of validating against cache) is explicitly documented and the risk is genuinely low given the 32-char hex hash uid space. The inbound text fallback is a sensible addition with appropriate conservatism. Ship it.
Summary
validUidsfiltering from structured mention conversionProblem
Bot-to-bot
@[uid:Name]mentions in groups could fail on the first attempt when member cache was not warmed. The message text was converted to@name, butpayload.mentionwas omitted, so the receiving bot was not triggered.Closes #217
Testing