Skip to content

Comments

fix(message-parser): parse underscore decimals as italic (fixes #31920)#38907

Open
smirk-dev wants to merge 1 commit intoRocketChat:developfrom
smirk-dev:fix/message-parser-decimal-italic
Open

fix(message-parser): parse underscore decimals as italic (fixes #31920)#38907
smirk-dev wants to merge 1 commit intoRocketChat:developfrom
smirk-dev:fix/message-parser-decimal-italic

Conversation

@smirk-dev
Copy link
Contributor

@smirk-dev smirk-dev commented Feb 22, 2026

Summary

Fixes decimal and dot-containing underscore emphasis parsing (e.g. _1.5_) being treated as plain text instead of italic.

Root cause

AutoLinkURL (schemeless branch) was matching underscore-prefixed text first, preventing Emphasis from parsing it as italic in PEG ordered choice.

What changed

  • Updates AutoLinkURL in src/grammar.pegjs to avoid matching when input starts with _
  • Adds regression test cases in tests/emphasis.test.ts:
    • _1.5_
    • _a.b_
    • _hello.world_
    • _1.5 text_

Testing

  • Full packages/message-parser test suite passes

Fixes #31920

Summary by CodeRabbit

  • Bug Fixes

    • Refined URL autolinking behavior to properly handle underscores in hostnames.
  • Tests

    • Expanded test coverage for emphasis parsing, including scenarios with decimal numbers and dots within underscored text.

…es (RocketChat#31920)

The PEG parser was greedily matching patterns like '_1.5_' as schemeless
domain names via AutolinkedURL before the Emphasis rule could try italic
parsing. Since PEG ordered choice doesn't backtrack from a successful
match, autoLink() returning plain() still prevented the italic parse.

Add a negative lookahead '!"_"' to AutoLinkURL's schemeless domain
alternative so strings starting with underscore are not consumed as
URLs. This allows the Emphasis rule to correctly parse '_1.5_' as italic
while preserving URL parsing for schemed URLs like 'http://te_st.com'
and schemeless URLs like 'te_st.com'.

Fixes RocketChat#31920
@changeset-bot
Copy link

changeset-bot bot commented Feb 22, 2026

⚠️ No Changeset found

Latest commit: c73f863

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@dionisio-bot
Copy link
Contributor

dionisio-bot bot commented Feb 22, 2026

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label
  • This PR is missing the required milestone or project

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 22, 2026

Walkthrough

This PR fixes a parsing bug where decimal numbers and dotted text enclosed in underscores (e.g., _1.5_) couldn't be italicized. It modifies the AutoLinkURL grammar rule to add a negative lookahead preventing underscore-prefixed matches, and adds test cases validating the fix works correctly.

Changes

Cohort / File(s) Summary
PEG Grammar Rule Update
packages/message-parser/src/grammar.pegjs
Modified AutoLinkURL rule to add negative lookahead !"_" and change capture operator from $ to @$, preventing autolink matching when underscore prefixes the URL authority host. This allows underscores around decimal numbers and dotted text to be correctly interpreted as italic markers.
Emphasis Test Coverage
packages/message-parser/tests/emphasis.test.ts
Added 5 new test cases covering italic formatting for content with decimal numbers and periods: _1.5_, _a.b_, _hello.world_, and _1.5 text_.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A tale of decimals and dots so fine,
Once trapped by underscores, now they shine!
With lookahead magic, the parser sees clear—
Italic numbers are finally here!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main fix: parsing underscore-wrapped decimals as italic instead of plain text, with explicit issue reference.
Linked Issues check ✅ Passed Code changes fully address issue #31920 requirements: updated AutoLinkURL rule prevents underscore-prefixed text from matching as URLs, enabling Emphasis rule to parse underscores as italic formatting.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing issue #31920: the grammar.pegjs modification and emphasis.test.ts additions are scoped to the underscore-decimal italic parsing fix.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
packages/message-parser/tests/emphasis.test.ts (2)

129-129: Remove inline comment — violates the no-comments guideline.

The // Decimal numbers and dots inside italic (#31920) comment is unnecessary; the issue reference belongs in the commit message rather than test source.

🧹 Proposed change
-	// Decimal numbers and dots inside italic (`#31920`)
 	['_1.5_', [paragraph([italic([plain('1.5')])])]],

As per coding guidelines, **/*.{ts,tsx,js}: "Avoid code comments in the implementation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/emphasis.test.ts` at line 129, Remove the
inline comment "// Decimal numbers and dots inside italic (`#31920`)" from the
test in emphasis.test.ts; locate the test block that covers decimal/italic
behavior (the test near the comment) and delete that comment line so the test
source contains no code comments, keeping any issue references in the commit
message instead.

130-133: Add a regression test documenting the schemeless URL behavior change for patterns starting with underscore.

The grammar fix for issue #31920 (adding tests for dots in italic content) also silently changes how schemeless URLs starting with _ are handled. Patterns like _domain.com previously auto-linked but now render as plain text or broken emphasis. None of the new tests cover this side-effect, creating a regression risk if future refactors inadvertently revert to URL-first matching without test coverage to catch it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/message-parser/tests/emphasis.test.ts` around lines 130 - 133, Add a
regression test that covers schemeless URLs that start with an underscore by
extending the same test vectors array used in emphasis.test.ts (the list
containing entries like '_1.5_' and functions/constructors paragraph, italic,
plain) with cases such as '_domain.com_' and '_domain.com/path_' and assert they
produce an autolink/link node instead of an italic/paragraph node; place these
new entries alongside the existing patterns so the parser test verifies
URL-first matching for leading-underscore schemeless URLs and prevents the
regression introduced by the dots-in-italic grammar change.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3145c41 and c73f863.

📒 Files selected for processing (2)
  • packages/message-parser/src/grammar.pegjs
  • packages/message-parser/tests/emphasis.test.ts
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (.cursor/rules/playwright.mdc)

**/*.{ts,tsx,js}: Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests
Avoid code comments in the implementation

Files:

  • packages/message-parser/tests/emphasis.test.ts
🧠 Learnings (2)
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Utilize Playwright fixtures (`test`, `page`, `expect`) for consistency in test files

Applied to files:

  • packages/message-parser/tests/emphasis.test.ts
📚 Learning: 2025-11-24T17:08:17.065Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-11-24T17:08:17.065Z
Learning: Applies to **/*.spec.ts : Use descriptive test names that clearly communicate expected behavior in Playwright tests

Applied to files:

  • packages/message-parser/tests/emphasis.test.ts
🧬 Code graph analysis (1)
packages/message-parser/tests/emphasis.test.ts (1)
packages/message-parser/src/utils.ts (3)
  • paragraph (27-27)
  • italic (62-62)
  • plain (65-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: 📦 Build Packages
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: CodeQL-Build
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/message-parser/tests/emphasis.test.ts`:
- Line 129: Remove the inline comment "// Decimal numbers and dots inside italic
(`#31920`)" from the test in emphasis.test.ts; locate the test block that covers
decimal/italic behavior (the test near the comment) and delete that comment line
so the test source contains no code comments, keeping any issue references in
the commit message instead.
- Around line 130-133: Add a regression test that covers schemeless URLs that
start with an underscore by extending the same test vectors array used in
emphasis.test.ts (the list containing entries like '_1.5_' and
functions/constructors paragraph, italic, plain) with cases such as
'_domain.com_' and '_domain.com/path_' and assert they produce an autolink/link
node instead of an italic/paragraph node; place these new entries alongside the
existing patterns so the parser test verifies URL-first matching for
leading-underscore schemeless URLs and prevents the regression introduced by the
dots-in-italic grammar change.

Comment on lines 432 to 435
AutoLinkURL
= $(URLScheme URLAuthority AutoLinkURLBody*)
/ $(URLAuthorityHost AutoLinkURLBody*)
/ !"_" @$(URLAuthorityHost AutoLinkURLBody*)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find any test asserting autoLink / URL node for inputs starting with `_`
rg -n "autoLink\|AutoLink\|autolink\|_[a-zA-Z0-9]" packages/message-parser/tests/ -A2 | grep -A2 "'_"

Repository: RocketChat/Rocket.Chat

Length of output: 48


🏁 Script executed:

#!/bin/bash
# List test files
echo "=== Test files in packages/message-parser/tests/ ==="
find packages/message-parser/tests/ -type f -name "*.ts" -o -name "*.js" | head -20

echo ""
echo "=== Searching for AutoLink/autoLink test cases ==="
rg -i "autolink" packages/message-parser/tests/ -l

echo ""
echo "=== Searching for schemeless URL tests ==="
rg "schemeless\|domain\\.com" packages/message-parser/tests/ -n -B2 -A2

Repository: RocketChat/Rocket.Chat

Length of output: 1198


🏁 Script executed:

#!/bin/bash
# Check size and read url.test.ts
wc -l packages/message-parser/tests/url.test.ts
echo ""
echo "=== File contents ==="
cat -n packages/message-parser/tests/url.test.ts

Repository: RocketChat/Rocket.Chat

Length of output: 11320


🏁 Script executed:

#!/bin/bash
# Check emphasis tests for underscore patterns
echo "=== Checking emphasis.test.ts for underscore patterns ==="
rg "_[0-9]|_1\.5" packages/message-parser/tests/emphasis.test.ts -n -B1 -A1

echo ""
echo "=== Checking grammar.pegjs around lines 432-435 ==="
sed -n '420,445p' packages/message-parser/src/grammar.pegjs

Repository: RocketChat/Rocket.Chat

Length of output: 1049


The change correctly fixes the emphasis bug (_1.5_ now italicizes properly), but schemeless URLs starting with _ will no longer auto-link.

The !"_" negation on the schemeless alternative blocks all such URLs — not just edge cases with emphasis markers. No existing tests cover schemeless URLs starting with _ (like _domain.com or _dmarc.example.com), so none will fail, but this is a real side effect to accept as part of the fix.

@codecov
Copy link

codecov bot commented Feb 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.55%. Comparing base (3145c41) to head (c73f863).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop   #38907      +/-   ##
===========================================
- Coverage    70.55%   70.55%   -0.01%     
===========================================
  Files         3189     3189              
  Lines       112703   112702       -1     
  Branches     20429    20429              
===========================================
- Hits         79519    79518       -1     
  Misses       31123    31123              
  Partials      2061     2061              
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]:Decimal number can't be converted into italic format.

1 participant