EPMRPP-113100 || Update the Schema.org markup for the documentation #1103

Open
maria-hambardzumian wants to merge 3 commits into develop from feature/EPMRPP-113100-Update-the-Schema-markup-for-the-documentation

maria-hambardzumian (Contributor) commented Apr 23, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Automatic Schema.org JSON-LD structured data generation for all documentation pages to improve search engine visibility and overall SEO performance
    • Support for multiple content types including articles, FAQs, how-to guides, and software applications with automatic content-based type detection
  • Chores

    • Added automated validation to ensure all structured data is properly formatted and meets quality standards during the build process

coderabbitai Bot commented Apr 23, 2026

Warning

Rate limit exceeded

@maria-hambardzumian has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 15 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 53 minutes and 15 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6416ef14-91da-4d1b-8807-6b77ef718128

📥 Commits

Reviewing files that changed from the base of the PR and between 6e15827 and 5233be0.

📒 Files selected for processing (1)
  • src/theme/DocItem/Layout/index.js

Walkthrough

A new Docusaurus plugin framework adds Schema.org JSON-LD generation to documentation pages by scanning markdown files, extracting structured metadata, building JSON-LD objects, injecting them into page headers, and validating the output during CI with a dedicated validation script. Configuration and dependencies updated to support the feature.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI/CD Integration: .github/workflows/ci.yml | Added npm run validate:schema step to CI pipeline for JSON-LD validation. |
| Build Configuration: docusaurus.config.js | Registered new ./plugins/plugin-schema-org plugin in Docusaurus configuration. |
| Dependencies & Scripts: package.json | Added validate:schema script and dev dependencies for markdown/AST parsing, remark utilities, cheerio, and fast-glob. |
| Schema Plugin: plugins/plugin-schema-org/index.js, plugins/plugin-schema-org/mapping.js, plugins/plugin-schema-org/parser.js | New plugin scans markdown files, extracts frontmatter, maps content to Schema.org types (via resolveSchemaTypes), parses FAQ/HowTo/SoftwareApplication structures (via parseDoc, extractFaqs, extractHowTo), and stores results in Docusaurus global data. |
| Schema Generation & Validation: src/lib/schemaOrg.js, scripts/validate-schema.js | Library module builds JSON-LD objects (Article, TechArticle, HowTo, FAQPage, SoftwareApplication); validation script parses built HTML, extracts JSON-LD blocks, and enforces per-type field requirements and structure constraints. |
| Theme Enhancement: src/theme/DocItem/Layout/index.js | Custom layout wrapper injects generated JSON-LD scripts into page <Head> for each documentation item. |
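
For orientation, the plugin row above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `collectEntries`, the `options.docs` input, the `schema_type` frontmatter key, and the `TechArticle` default are all hypothetical stand-ins for the real scanning in parser.js and mapping.js. Only the `loadContent` / `contentLoaded` / `actions.setGlobalData` lifecycle comes from the Docusaurus plugin API.

```javascript
// Hypothetical stand-in for the real markdown scanning and type mapping.
function collectEntries(docs) {
  const entries = {};
  for (const doc of docs) {
    // Drafts and unlisted pages get no entry at all.
    if (doc.frontMatter.draft === true || doc.frontMatter.unlisted === true) continue;
    entries[doc.permalink] = {
      // `schema_type` is an assumed frontmatter key, used here for illustration only.
      types: doc.frontMatter.schema_type ? [doc.frontMatter.schema_type] : ['TechArticle'],
    };
  }
  return entries;
}

function pluginSchemaOrg(context, options = {}) {
  return {
    name: 'plugin-schema-org',
    // Runs once per build; the return value is handed to contentLoaded as `content`.
    async loadContent() {
      return collectEntries(options.docs || []);
    },
    // setGlobalData exposes the entries to theme components at render time.
    async contentLoaded({ content, actions }) {
      actions.setGlobalData({ entries: content });
    },
  };
}
```

Keying entries by permalink is one possible choice; it lets the layout look up its own entry without knowing the source file path.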

Sequence Diagram

sequenceDiagram
    participant Build as Docusaurus Build
    participant Plugin as Schema Plugin
    participant Parser as Parser Module
    participant Layout as DocItem Layout
    participant Output as HTML Output

    Build->>Plugin: Load plugin
    Plugin->>Plugin: Scan markdown/mdx files
    Plugin->>Plugin: Extract frontmatter & paths
    Plugin->>Parser: Resolve schema types (mapping.js)
    Parser-->>Plugin: Type mappings
    Plugin->>Parser: Parse FAQPage/HowTo/SoftwareApp (parser.js)
    Parser-->>Plugin: Extracted structures (faq, howto, etc.)
    Plugin->>Build: Store schema entries in global data
    
    Build->>Layout: Render doc page
    Layout->>Layout: Read doc metadata & frontmatter
    Layout->>Layout: Load schema entry from plugin data
    Layout->>Layout: buildJsonLd() generates JSON-LD array
    Layout->>Output: Inject <script type="application/ld+json">
    Layout-->>Output: Render DocItem with JSON-LD
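
The last two steps of the diagram (building a JSON-LD block and injecting it as a script tag) might look like the sketch below. The function names `buildFaqPage` and `toScriptTag` are hypothetical and the real builders in src/lib/schemaOrg.js may be structured differently; the object shape follows the Schema.org FAQPage type.

```javascript
// Builds a FAQPage JSON-LD object from extracted question/answer pairs.
function buildFaqPage(faqs) {
  return {
    '@context': 'https://schema.org',
    '@type': 'FAQPage',
    mainEntity: faqs.map(({ question, answer }) => ({
      '@type': 'Question',
      name: question,
      acceptedAnswer: { '@type': 'Answer', text: answer },
    })),
  };
}

// Serializes a block for embedding in <script type="application/ld+json">.
// Escaping "<" keeps a literal "</script>" in the content from closing the tag early;
// JSON parsers decode \u003c back to "<".
function toScriptTag(block) {
  const json = JSON.stringify(block).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}
```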

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes


🐰 Through markdown fields with structure so neat,
Schema types and JSON-LD we meet,
Validation hops to check every block,
Documentation answers like a well-built flock,
The web now understands our organized talk! 📚✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective of the pull request: implementing Schema.org markup updates for documentation pages. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
scripts/validate-schema.js (2)

65-82: Reduce duplicated errors when a mainEntity entry is not a valid object.

If a Question entry is null or not an object, checkFaqPage still runs all three sub-checks and emits three errors for the same malformed item. Short-circuiting once shape is invalid keeps the report signal-to-noise higher.

♻️ Proposed refactor
   questions.forEach((question, index) => {
-    if (!question || question['@type'] !== 'Question') {
+    if (!question || typeof question !== 'object') {
+      errors.report(file, `FAQPage mainEntity[${index}] is not an object`);
+      return;
+    }
+    if (question['@type'] !== 'Question') {
       errors.report(file, `FAQPage mainEntity[${index}] @type != Question`);
     }
-    if (!question || !question.name) {
+    if (!question.name) {
       errors.report(file, `FAQPage mainEntity[${index}] missing name`);
     }
-    const answer = question && question.acceptedAnswer;
+    const answer = question.acceptedAnswer;
     if (!answer || answer['@type'] !== 'Answer' || !answer.text) {
       errors.report(file, `FAQPage mainEntity[${index}].acceptedAnswer invalid`);
     }
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/validate-schema.js` around lines 65 - 82, The checkFaqPage function
currently emits multiple errors for the same malformed mainEntity entry; modify
the loop in checkFaqPage so that after converting node.mainEntity to questions
you first validate whether question is a non-null object and has '@type' ===
'Question' (using the existing question and questions identifiers and
errors.report) and if that validation fails emit a single descriptive error
(e.g., about invalid or non-object mainEntity[${index}]) then continue to the
next iteration; only run the subsequent name and acceptedAnswer checks (using
question.name and the answer variable/acceptedAnswer/@type/text) when that
initial shape check passes to avoid duplicate reports.
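
As a standalone illustration of the short-circuit, here is a runnable sketch of the proposed loop. The `createErrors` collector and the inline array normalization are assumptions made so the example runs on its own; the real `errors` object and `toArray` helper in scripts/validate-schema.js may differ.

```javascript
// Minimal error collector (assumed shape, for illustration only).
function createErrors() {
  const items = [];
  return { items, report: (file, message) => items.push(`${file}: ${message}`) };
}

function checkFaqPage(node, errors, file) {
  const questions = Array.isArray(node.mainEntity) ? node.mainEntity : [node.mainEntity];
  questions.forEach((question, index) => {
    // One error for a malformed entry, then skip the remaining checks for it.
    if (!question || typeof question !== 'object') {
      errors.report(file, `FAQPage mainEntity[${index}] is not an object`);
      return;
    }
    if (question['@type'] !== 'Question') {
      errors.report(file, `FAQPage mainEntity[${index}] @type != Question`);
    }
    if (!question.name) {
      errors.report(file, `FAQPage mainEntity[${index}] missing name`);
    }
    const answer = question.acceptedAnswer;
    if (!answer || answer['@type'] !== 'Answer' || !answer.text) {
      errors.report(file, `FAQPage mainEntity[${index}].acceptedAnswer invalid`);
    }
  });
}
```

With this shape, a `null` entry produces a single report, while an empty object still produces the three field-level reports, since those are distinct problems.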

153-159: Type stats under-report @graph-wrapped payloads.

The countsByType key is derived from the outer node's @type, but when the JSON-LD block is a @graph container the wrapper itself has no @type, so every such page is bucketed under (no @type) even though its children carry real types (e.g., Article, BreadcrumbList). Counting the children (which are already validated on line 155) would make the summary accurate.

♻️ Proposed refactor
     toArray(parsed).forEach((node) => {
       const graphChildren = node && node['@graph'] ? toArray(node['@graph']) : [node];
-      graphChildren.forEach((child) => validateNode(child, errors, file));
-
-      const typeKey = toArray(node && node['@type']).join(',') || '(no @type)';
-      stats.countsByType[typeKey] = (stats.countsByType[typeKey] || 0) + 1;
+      graphChildren.forEach((child) => {
+        validateNode(child, errors, file);
+        const typeKey = toArray(child && child['@type']).join(',') || '(no @type)';
+        stats.countsByType[typeKey] = (stats.countsByType[typeKey] || 0) + 1;
+      });
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/validate-schema.js` around lines 153 - 159, The current stats logic
computes typeKey from the outer node (variable node/@type), which causes
@graph-wrapped documents to be counted as "(no @type)"; change the logic in the
block that builds graphChildren (where validateNode is called) to, when
node['@graph'] exists, iterate graphChildren and derive each child's typeKey
from toArray(child['@type']).join(',') || '(no @type)' and increment
stats.countsByType for each child, otherwise keep the existing single-node
counting; ensure you still call validateNode(child, errors, file) as before and
update stats.countsByType in the same scope so graph-wrapped child types are
correctly tallied.
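
The per-child counting can be tried in isolation with the sketch below. The `toArray` definition here is an assumption (the script's own helper is not shown in this review), and `countTypes` is a hypothetical wrapper around just the tallying part of the loop, with validation left out.

```javascript
// Assumed helper: wraps a single value in an array and maps null/undefined to [].
const toArray = (value) => (value == null ? [] : Array.isArray(value) ? value : [value]);

function countTypes(parsed, stats) {
  toArray(parsed).forEach((node) => {
    // A @graph container contributes its children; any other node counts as itself.
    const children = node && node['@graph'] ? toArray(node['@graph']) : [node];
    children.forEach((child) => {
      const typeKey = toArray(child && child['@type']).join(',') || '(no @type)';
      stats.countsByType[typeKey] = (stats.countsByType[typeKey] || 0) + 1;
    });
  });
  return stats;
}
```

A `@graph` payload holding an Article and a BreadcrumbList is then tallied under those two types rather than under "(no @type)".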
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@package.json`:
- Around line 55-62: The package.json currently allows "cheerio": "^1.0.0" which
resolves to cheerio@1.2.0 requiring Node >=20.18.1 while your engines.node is
">=18.0"; update package.json to either set "engines.node" to ">=20.18.1" (to
match cheerio) or change the cheerio dependency to a version compatible with
Node 18 (e.g., pin "cheerio" to a specific older release) and run install to
regenerate lockfile; ensure you reference the "cheerio" entry in package.json
and the "engines.node" field when making the change.

In `@src/lib/schemaOrg.js`:
- Around line 198-206: buildJsonLd can emit a fallback schema for pages that
were intentionally skipped because it resolves types even when entry is missing;
change buildJsonLd to early-return an empty array when entry is falsy so no
schema is emitted for skipped docs. Specifically, in buildJsonLd, add a guard
(e.g., if (!entry) return []) before calling resolveTypes and using
BUILDERS/hasEnoughContent so that resolveTypes(entry, frontMatter) and
subsequent BUILDERS[type] mapping never run for absent/skipped entries.

In `@src/theme/DocItem/Layout/index.js`:
- Around line 27-34: The new Layout file uses JSX but is named index.js and also
triggers lint rules for array keys and prop spreading; rename
src/theme/DocItem/Layout/index.js to index.jsx (or adjust the project lint rule
to allow JSX in .js) and in the component ensure stable keys and avoid broad
prop spreading: keep the jsonLdBlocks map using a stable unique key (e.g., use
an explicit id field from each block instead of `index` in the Head key) and
replace the `...props` spread when rendering OriginalLayout with explicit prop
names passed through (or selectively spread only known safe props) to satisfy
the prop-spread lint rule; locate symbols jsonLdBlocks, Head, OriginalLayout,
and props in the file to apply these changes.
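
One way to get a stable key without adding an id field is to derive it from each block's `@type`. This is a sketch under the assumption that `buildJsonLd` emits at most one block per type for a page, which this review does not confirm; `keyForBlock` is a hypothetical helper name.

```javascript
// Derives a React key from the block's @type values instead of its array index.
// Assumes no two blocks on the same page share the same @type.
function keyForBlock(block) {
  const types = Array.isArray(block['@type']) ? block['@type'] : [block['@type']];
  return `jsonld-${types.join('-')}`;
}
```

If that uniqueness assumption does not hold, an explicit id set by the builder, as suggested above, is the safer option.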

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0a74417d-d650-4700-918e-4b880aa4f8b4

📥 Commits

Reviewing files that changed from the base of the PR and between 3e3ff72 and 6e15827.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (9)
  • .github/workflows/ci.yml
  • docusaurus.config.js
  • package.json
  • plugins/plugin-schema-org/index.js
  • plugins/plugin-schema-org/mapping.js
  • plugins/plugin-schema-org/parser.js
  • scripts/validate-schema.js
  • src/lib/schemaOrg.js
  • src/theme/DocItem/Layout/index.js

Comment thread package.json
Comment on lines +55 to +62
"cheerio": "^1.0.0",
"fast-glob": "^3.3.2",
"gray-matter": "^4.0.3",
"mdast-util-to-string": "^4.0.0",
"remark-mdx": "^3.0.0",
"remark-parse": "^11.0.0",
"unified": "^11.0.4",
"unist-util-visit": "^5.0.0"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Declared package Node engine:"
jq -r '.engines.node' package.json

echo
echo "Resolved newly added dependency versions from package-lock.json, if present:"
for dep in cheerio fast-glob gray-matter mdast-util-to-string remark-mdx remark-parse unified unist-util-visit; do
  resolved=""
  if [ -f package-lock.json ]; then
    resolved="$(jq -r --arg dep "$dep" '.packages["node_modules/" + $dep].version // empty' package-lock.json)"
  fi

  if [ -n "$resolved" ]; then
    echo "$dep@$resolved"
    npm view "$dep@$resolved" engines --json || true
  else
    range="$(jq -r --arg dep "$dep" '.devDependencies[$dep] // empty' package.json)"
    echo "$dep range: $range"
    npm view "$dep@$range" version engines --json || true
  fi
  echo
done

Repository: reportportal/docs

Length of output: 405


Update engines.node to match resolved cheerio dependency or constrain the cheerio range.

The resolved version of cheerio@1.2.0 (from the caret range ^1.0.0) requires node >=20.18.1, but the project declares node >=18.0. Either raise engines.node to >=20.18.1 or pin cheerio to a version compatible with Node 18.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@package.json` around lines 55 - 62, The package.json currently allows
"cheerio": "^1.0.0" which resolves to cheerio@1.2.0 requiring Node >=20.18.1
while your engines.node is ">=18.0"; update package.json to either set
"engines.node" to ">=20.18.1" (to match cheerio) or change the cheerio
dependency to a version compatible with Node 18 (e.g., pin "cheerio" to a
specific older release) and run install to regenerate lockfile; ensure you
reference the "cheerio" entry in package.json and the "engines.node" field when
making the change.
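
The mismatch can be checked mechanically. The sketch below compares the lower bounds of two simple `>=x.y.z` ranges; it is an illustration only, handles no other range syntax, and the function names are hypothetical. A real check would more likely rely on the `semver` package.

```javascript
// Parses "18.0" or "20.18.1" into [major, minor, patch], padding missing parts with 0.
function parseVersion(text) {
  const parts = text.split('.').map(Number);
  while (parts.length < 3) parts.push(0);
  return parts;
}

// Extracts the lower bound from a plain ">=x.y.z" range; returns null for other forms.
function lowerBound(range) {
  const match = /^>=\s*(\d+(?:\.\d+){0,2})$/.exec(range.trim());
  return match ? parseVersion(match[1]) : null;
}

// True when every Node version allowed by the project also satisfies the dependency.
function engineCovers(projectRange, dependencyRange) {
  const project = lowerBound(projectRange);
  const dependency = lowerBound(dependencyRange);
  if (!project || !dependency) return false;
  for (let i = 0; i < 3; i += 1) {
    if (project[i] !== dependency[i]) return project[i] > dependency[i];
  }
  return true;
}
```

For the values reported above, `engineCovers('>=18.0', '>=20.18.1')` is false, which is the gap this comment flags.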

Comment thread src/lib/schemaOrg.js
Comment on lines +198 to +206
export function buildJsonLd({ siteConfig, metadata, frontMatter, entry }) {
  if (!metadata || !metadata.permalink) return [];

  const context = { siteConfig, metadata, frontMatter, entry };
  const types = resolveTypes(entry, frontMatter).map((t) => String(t).trim());

  return types
    .filter((type) => BUILDERS[type] && hasEnoughContent(type, entry))
    .map((type) => BUILDERS[type](context));

⚠️ Potential issue | 🟠 Major

Do not emit fallback schema for skipped docs.

The plugin intentionally skips draft/unlisted docs, but buildJsonLd still falls back to TechArticle when entry is missing. That can expose structured data for pages the plugin excluded and can hide lookup mismatches.

Proposed guard
 export function buildJsonLd({ siteConfig, metadata, frontMatter, entry }) {
   if (!metadata || !metadata.permalink) return [];
+  if (frontMatter && (frontMatter.draft === true || frontMatter.unlisted === true)) return [];
+  if (!entry) return [];

   const context = { siteConfig, metadata, frontMatter, entry };
   const types = resolveTypes(entry, frontMatter).map((t) => String(t).trim());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/schemaOrg.js` around lines 198 - 206, buildJsonLd can emit a fallback
schema for pages that were intentionally skipped because it resolves types even
when entry is missing; change buildJsonLd to early-return an empty array when
entry is falsy so no schema is emitted for skipped docs. Specifically, in
buildJsonLd, add a guard (e.g., if (!entry) return []) before calling
resolveTypes and using BUILDERS/hasEnoughContent so that resolveTypes(entry,
frontMatter) and subsequent BUILDERS[type] mapping never run for absent/skipped
entries.
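
To see the guard's effect end to end, here is a runnable sketch with stubbed helpers. `BUILDERS`, `resolveTypes`, and `hasEnoughContent` below are simplified stand-ins and not the implementations in src/lib/schemaOrg.js; only the two added early returns mirror the proposed guard.

```javascript
// Simplified stand-ins for the real helpers; their actual behavior may differ.
const BUILDERS = {
  TechArticle: ({ metadata }) => ({ '@type': 'TechArticle', url: metadata.permalink }),
};
const resolveTypes = (entry, frontMatter) => (entry && entry.types) || ['TechArticle'];
const hasEnoughContent = () => true;

function buildJsonLd({ siteConfig, metadata, frontMatter, entry }) {
  if (!metadata || !metadata.permalink) return [];
  // Skipped docs (draft/unlisted) and pages with no plugin entry emit nothing.
  if (frontMatter && (frontMatter.draft === true || frontMatter.unlisted === true)) return [];
  if (!entry) return [];

  const context = { siteConfig, metadata, frontMatter, entry };
  const types = resolveTypes(entry, frontMatter).map((t) => String(t).trim());

  return types
    .filter((type) => BUILDERS[type] && hasEnoughContent(type, entry))
    .map((type) => BUILDERS[type](context));
}
```

Without the `!entry` guard, the stubbed `resolveTypes` would fall back to `TechArticle` for a page the plugin never indexed, which is the behavior this comment warns about.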

Comment thread src/theme/DocItem/Layout/index.js