test(message-parser): add benchmark suite for parser performance baseline #38905
smirk-dev wants to merge 1 commit into RocketChat:develop from
Conversation
Add a comprehensive performance benchmark suite for @rocket.chat/message-parser using tinybench. This establishes baseline performance metrics across 11 categories of message parsing, which can be used to measure improvements from the High-Performance Message Parser Rewrite GSoC project.

Categories benchmarked:
- Plain text (short/medium/long)
- Emphasis/formatting (bold, italic, strike, nested, multiple)
- URLs and auto-linking
- Emoji (shortcode, unicode, BigEmoji)
- Mentions (user, channel)
- Code (inline, block)
- Structured blocks (lists, tasks, blockquotes, headings, spoilers)
- KaTeX math expressions
- Adversarial / stress test inputs
- Real-world mixed messages
- Timestamps

Also adds a bench:size script to report bundle size metrics.

Run with: yarn bench (in packages/message-parser/)
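For readers skimming the PR, the sketch below shows the general shape of one benchmark group in the suite. The fixture strings, import path, and timing options are illustrative only; the actual file covers every category listed above and runs inside a Jest `it(...)` block.

```ts
import { Bench } from 'tinybench';

import { parse } from '../src';

// Illustrative fixtures; the real suite defines many more per category.
const fixtures = {
  plainShort: 'hello world',
  boldAndItalic: '**bold** and _italic_ text',
};

async function runEmphasisGroup() {
  const emphasisBench = new Bench({ time: 1000, warmupTime: 200 });

  emphasisBench
    .add('plain: short', () => parse(fixtures.plainShort))
    .add('emphasis: bold + italic', () => parse(fixtures.boldAndItalic));

  await emphasisBench.run();

  // hz is iterations per second as measured by tinybench.
  console.table(
    emphasisBench.tasks.map((task) => ({
      Task: task.name,
      'ops/sec': Math.round(task.result?.hz ?? 0),
    })),
  );
}
```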
Looks like this PR is not ready to merge, because of the following issues:
Please fix the issues and try again. If you have any trouble, please check the PR guidelines.
Walkthrough
A comprehensive performance benchmarking suite is added to the message-parser package. The suite includes a new Jest-based benchmark test file using TinyBench to measure parsing performance across 14+ categories, along with new npm scripts and dependencies to execute and analyze the results.
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/message-parser/benchmarks/parser.bench.test.ts (1)
38-38: `bench` script should use `--runInBand` to avoid process-level interference with benchmark timing.
When Jest spawns worker processes (the default), the OS scheduler and inter-process communication can introduce jitter that inflates variance in the timing measurements. For a baseline benchmark, a single-process run produces more consistent results.
♻️ Proposed change
- "bench": "jest --testPathPatterns='parser.bench' --no-coverage --verbose", + "bench": "jest --testPathPatterns='parser.bench' --no-coverage --verbose --runInBand",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/message-parser/benchmarks/parser.bench.test.ts` at line 38, Update the project's Jest benchmark invocation to run in a single process by adding the --runInBand flag to the "bench" script so Jest does not spawn worker processes; locate the npm script named "bench" (in the package.json scripts section) and append --runInBand to the jest command invocation to reduce OS scheduler/inter-process jitter during parser benchmarks.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/message-parser/benchmarks/parser.bench.test.ts
- packages/message-parser/package.json
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (.cursor/rules/playwright.mdc)
**/*.{ts,tsx,js}: Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests
Avoid code comments in the implementation
Files:
packages/message-parser/benchmarks/parser.bench.test.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: gabriellsh
Repo: RocketChat/Rocket.Chat PR: 37419
File: packages/i18n/src/locales/en.i18n.json:918-921
Timestamp: 2025-11-19T18:20:07.720Z
Learning: Repo: RocketChat/Rocket.Chat — i18n/formatting
Learning: This repository uses a custom message formatting parser in UI blocks/messages; do not assume standard Markdown rules. For keys like Call_ended_bold, Call_not_answered_bold, Call_failed_bold, and Call_transferred_bold in packages/i18n/src/locales/en.i18n.json, retain the existing single-asterisk emphasis unless maintainers request otherwise.
📚 Learning: 2025-12-10T21:00:54.909Z
Learnt from: KevLehman
Repo: RocketChat/Rocket.Chat PR: 37091
File: ee/packages/abac/jest.config.ts:4-7
Timestamp: 2025-12-10T21:00:54.909Z
Learning: Rocket.Chat monorepo: Jest testMatch pattern '<rootDir>/src/**/*.spec.(ts|js|mjs)' is valid in this repo and used across multiple packages (e.g., packages/tools, ee/packages/omnichannel-services). Do not flag it as invalid in future reviews.
Applied to files:
packages/message-parser/benchmarks/parser.bench.test.ts
🧬 Code graph analysis (1)
packages/message-parser/benchmarks/parser.bench.test.ts (1)
packages/message-parser/src/utils.ts (1)
task(53-57)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: 📦 Build Packages
- GitHub Check: cubic · AI code reviewer
- GitHub Check: CodeQL-Build
🔇 Additional comments (2)
packages/message-parser/package.json (2)
38-38: `--testPathPatterns` (plural) is correct for Jest 30 — LGTM.
The `--testPathPattern` flag was renamed to `--testPathPatterns` in Jest 30, so the bench script is correctly using the updated flag.
64-64: No changes needed: `ts-node` is actually required.
Jest 30.2.0 requires `ts-node` to load TypeScript configuration files (jest.config.ts) by default. When the `bench` script runs `jest`, Jest must read and parse the `jest.config.ts` file, which depends on `ts-node` since no alternate loader (like `esbuild-register`) is specified via a docblock pragma in the config. This is a transitive dependency, not a direct script invocation, which is why it's necessary to keep.
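For reference, a minimal sketch of the alternative the comment alludes to. It assumes Jest's `@jest-config-loader` docblock pragma is supported by the installed Jest version and that `esbuild-register` would be added as a devDependency; neither change is part of this PR, and the `testMatch` pattern is only the one already used elsewhere in the monorepo.

```ts
/** @jest-config-loader esbuild-register */
// Hypothetical packages/message-parser/jest.config.ts that avoids ts-node;
// the config in this PR is unchanged and keeps the default ts-node loader.
import type { Config } from 'jest';

const config: Config = {
  testMatch: ['<rootDir>/src/**/*.spec.(ts|js|mjs)'],
};

export default config;
```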
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/message-parser/benchmarks/parser.bench.test.ts`:
- Around line 99-101: The fixture named spoilerWithFormatting is defined but
never used; either remove the unused fixture from the fixtures object or add it
to the benchmark group blockBench so it is exercised. To fix, locate the
fixtures definition (symbol: spoilerWithFormatting) and either delete that
entry, or add a corresponding .add(...) call to blockBench (the benchmark group
symbol: blockBench) using fixtures.spoilerWithFormatting so the fixture is
included alongside fixtures.spoiler.
- Around line 237-244: Enable sample retention by passing retainSamples: true to
every Bench constructor (the Bench(...) calls) so task.result?.samples will be
populated; replace unsafe non-null assertions like task.result! with optional
chaining and sensible fallbacks in the reporting code (e.g., use task.result?.hz
?? 0, task.result?.mean ?? 0, and task.result?.samples.length ?? 0) when
building the table and summary so an errored task or missing result cannot crash
the report; update the summary sorting/slicing usages to rely on these safe
fallbacks as well.
---
Nitpick comments:
In `@packages/message-parser/benchmarks/parser.bench.test.ts`:
- Line 38: Update the project's Jest benchmark invocation to run in a single
process by adding the --runInBand flag to the "bench" script so Jest does not
spawn worker processes; locate the npm script named "bench" (in the package.json
scripts section) and append --runInBand to the jest command invocation to reduce
OS scheduler/inter-process jitter during parser benchmarks.
// Category 13: Spoiler
spoiler: '||This is a spoiler||',
spoilerWithFormatting: '||**bold** and _italic_ spoiler||',
spoilerWithFormatting fixture is defined but never added to any benchmark group.
The fixture at line 101 is never passed to any .add() call — blockBench only uses fixtures.spoiler. This is dead code.
🛠️ Options
Either remove the fixture, or add it to blockBench:
blockBench
.add('block: ordered list', () => parse(fixtures.orderedList))
.add('block: unordered list', () => parse(fixtures.unorderedList))
.add('block: task list', () => parse(fixtures.taskList))
.add('block: blockquote', () => parse(fixtures.blockquote))
.add('block: heading', () => parse(fixtures.heading))
.add('block: heading multi-level', () => parse(fixtures.headingMultiLevel))
- .add('block: spoiler', () => parse(fixtures.spoiler));
+ .add('block: spoiler', () => parse(fixtures.spoiler))
+ .add('block: spoiler with formatting', () => parse(fixtures.spoilerWithFormatting));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/message-parser/benchmarks/parser.bench.test.ts` around lines 99 -
101, The fixture named spoilerWithFormatting is defined but never used; either
remove the unused fixture from the fixtures object or add it to the benchmark
group blockBench so it is exercised. To fix, locate the fixtures definition
(symbol: spoilerWithFormatting) and either delete that entry, or add a
corresponding .add(...) call to blockBench (the benchmark group symbol:
blockBench) using fixtures.spoilerWithFormatting so the fixture is included
alongside fixtures.spoiler.
'ops/sec': Math.round(task.result!.hz).toLocaleString(),
'Avg (ms)': (task.result!.mean * 1000).toFixed(4),
'Min (ms)': (task.result!.min * 1000).toFixed(4),
'Max (ms)': (task.result!.max * 1000).toFixed(4),
'P99 (ms)': (task.result!.p99 * 1000).toFixed(4),
'Samples': task.result!.samples.length,
})),
);
task.result!.samples.length will always report 0 — retainSamples must be enabled.
By default, tinybench does not keep the samples for latency and throughput to minimize memory usage. To enable sample retention you need to set the retainSamples option to true when creating a Bench instance. Without it, task.result!.samples is always an empty array, so the Samples column is permanently 0 across every group, making that column meaningless.
Additionally, the bare ! non-null assertion on task.result is unsafe: if a task errors during bench.run() (which the adversarial inputs could trigger against the PEG parser), result may be undefined, causing a crash at the reporting step. Prefer optional chaining with a fallback.
🔧 Proposed fix
- const plainTextBench = new Bench({ time: 1000, warmupTime: 200 });
+ const plainTextBench = new Bench({ time: 1000, warmupTime: 200, retainSamples: true });
Apply retainSamples: true to all Bench constructors, and replace non-null assertions with optional chaining in the table map:
bench.tasks.map((task) => ({
- 'Task': task.name,
- 'ops/sec': Math.round(task.result!.hz).toLocaleString(),
- 'Avg (ms)': (task.result!.mean * 1000).toFixed(4),
- 'Min (ms)': (task.result!.min * 1000).toFixed(4),
- 'Max (ms)': (task.result!.max * 1000).toFixed(4),
- 'P99 (ms)': (task.result!.p99 * 1000).toFixed(4),
- 'Samples': task.result!.samples.length,
+ 'Task': task.name,
+ 'ops/sec': Math.round(task.result?.hz ?? 0).toLocaleString(),
+ 'Avg (ms)': ((task.result?.mean ?? 0) * 1000).toFixed(4),
+ 'Min (ms)': ((task.result?.min ?? 0) * 1000).toFixed(4),
+ 'Max (ms)': ((task.result?.max ?? 0) * 1000).toFixed(4),
+ 'P99 (ms)': ((task.result?.p99 ?? 0) * 1000).toFixed(4),
+ 'Samples': task.result?.samples.length ?? 0,
})),
And in the summary sort/slice:
- const sorted = [...allTasks].sort((a, b) => a.result!.hz - b.result!.hz);
+ const sorted = [...allTasks].sort((a, b) => (a.result?.hz ?? 0) - (b.result?.hz ?? 0));
- console.log(` ${Math.round(task.result!.hz).toLocaleString().padStart(12)} ops/sec │ ${task.name}`);
+ console.log(`  ${Math.round(task.result?.hz ?? 0).toLocaleString().padStart(12)} ops/sec │ ${task.name}`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/message-parser/benchmarks/parser.bench.test.ts` around lines 237 -
244, Enable sample retention by passing retainSamples: true to every Bench
constructor (the Bench(...) calls) so task.result?.samples will be populated;
replace unsafe non-null assertions like task.result! with optional chaining and
sensible fallbacks in the reporting code (e.g., use task.result?.hz ?? 0,
task.result?.mean ?? 0, and task.result?.samples.length ?? 0) when building the
table and summary so an errored task or missing result cannot crash the report;
update the summary sorting/slicing usages to rely on these safe fallbacks as
well.
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #38905 +/- ##
========================================
Coverage 70.55% 70.55%
========================================
Files 3189 3189
Lines 112703 112703
Branches 20429 20429
========================================
Hits 79519 79519
Misses 31123 31123
Partials 2061 2061
1 issue found across 2 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/message-parser/benchmarks/parser.bench.test.ts">
<violation number="1" location="packages/message-parser/benchmarks/parser.bench.test.ts:270">
P2: Benchmark file with `.test.ts` extension will be executed during standard test runs, unnecessarily slowing down CI. Consider excluding benchmark files from default test discovery via `testPathIgnorePatterns` or renaming to not match test patterns.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
@@ -0,0 +1,276 @@
/**
P2: Benchmark file with .test.ts extension will be executed during standard test runs, unnecessarily slowing down CI. Consider excluding benchmark files from default test discovery via testPathIgnorePatterns or renaming to not match test patterns.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/message-parser/benchmarks/parser.bench.test.ts, line 270:
<comment>Benchmark file with `.test.ts` extension will be executed during standard test runs, unnecessarily slowing down CI. Consider excluding benchmark files from default test discovery via `testPathIgnorePatterns` or renaming to not match test patterns.</comment>
<file context>
@@ -0,0 +1,276 @@
+ console.log();
+}
+
+it(
+ 'benchmark: message-parser performance',
+ async () => {
</file context>
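One way to address this finding is sketched below. The `testPathIgnorePatterns` option is standard Jest, but the surrounding config shape is assumed rather than taken from this PR.

```ts
// Hypothetical packages/message-parser/jest.config.ts tweak: keep the
// benchmark file out of the default `jest` run so CI is not slowed down.
import type { Config } from 'jest';

const config: Config = {
  testPathIgnorePatterns: ['/node_modules/', '<rootDir>/benchmarks/'],
};

export default config;
```

With such a change the dedicated `bench` script would need to opt the file back in; alternatively, renaming the file so it no longer matches the default test pattern, as the reviewer also suggests, avoids any config change.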
Summary
Adds a dedicated benchmark suite for `@rocket.chat/message-parser` to measure parser throughput across representative message patterns.
What changed
- benchmarks/parser.bench.test.ts benchmark suite using tinybench
- bench and bench:size scripts in packages/message-parser/package.json
Why
This establishes a measurable baseline for the High-Performance Message Parser Rewrite project and enables before/after performance comparisons.
Testing
Summary by CodeRabbit
Tests
Chores
`bench` and `bench:size` npm scripts for running performance benchmarks and checking bundle size