EPMRPP-114936 || Implement llms.txt #1115

Open
maria-hambardzumian wants to merge 5 commits into develop from feature/EPMRPP-114936-Implement-llms.txt

@maria-hambardzumian maria-hambardzumian commented May 18, 2026

https://jiraeu.epam.com/browse/EPMRPP-114936

Summary by CodeRabbit

  • New Features

    • Added automated documentation index and AI-friendly sitemap generation for improved content discoverability
  • Chores

    • Set up GitHub Actions workflow for continuous documentation regeneration

coderabbitai Bot commented May 18, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c789ccb4-f921-4dab-acf9-f7efab9c5055

📥 Commits

Reviewing files that changed from the base of the PR and between f4b6b51 and f806a1f.

📒 Files selected for processing (3)
  • .github/workflows/gen-llms.yml
  • static/ai-sitemap.json
  • static/llms.txt
✅ Files skipped from review due to trivial changes (1)
  • static/llms.txt
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/gen-llms.yml

Walkthrough

Introduces an LLM documentation generator that scans the docs/ directory, extracts Markdown/MDX metadata and front-matter, groups pages into sections, and generates two outputs: a human-readable sitemap (static/llms.txt) and a JSON index (static/ai-sitemap.json). Automation is handled via a new npm script and GitHub Actions workflow that regenerates outputs on PR or manual trigger.
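
The scan-and-parse step described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/gen-llms.js`: the helper names `collectDocs`, `parseFrontMatter`, and `toPage` are hypothetical, the front-matter parser only handles flat `key: value` lines, and treating the top-level folder as the section is an assumption (the `sectionDir` field name is borrowed from the review comments in this thread).

```javascript
// Hypothetical sketch of the discovery step; not the PR's actual implementation.
const fs = require('node:fs');
const path = require('node:path');

// Recursively collect .md/.mdx files under rootDir, sorted for stable output.
function collectDocs(rootDir) {
  const found = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const fullPath = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      found.push(...collectDocs(fullPath));
    } else if (/\.mdx?$/.test(entry.name)) {
      found.push(fullPath);
    }
  }
  return found.sort();
}

// Parse a leading '---' front-matter block made of simple "key: value" lines.
// Deliberately minimal: no nested YAML, lists, or multi-line values.
function parseFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  const data = {};
  if (!match) return data;
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const key = line.slice(0, sep).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(sep + 1).trim().replace(/^(['"])(.*)\1$/, '$2');
    if (key) data[key] = value;
  }
  return data;
}

// Build a page record. Assumption: the first folder under docsDir is the section,
// and files directly under docsDir get an empty sectionDir.
function toPage(docsDir, filePath) {
  const relPath = path.relative(docsDir, filePath).split(path.sep).join('/');
  const meta = parseFrontMatter(fs.readFileSync(filePath, 'utf8'));
  const parts = relPath.split('/');
  return {
    relPath,
    sectionDir: parts.length > 1 ? parts[0] : '',
    title: meta.title || path.basename(relPath).replace(/\.mdx?$/, ''),
    description: meta.description || '',
  };
}
```

A real generator would more likely use a YAML parser for front-matter and would also read `_category_.json`, which this sketch leaves out.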

Changes

LLM Docs Generator

| Layer | File(s) | Summary |
| --- | --- | --- |
| Generator script: data model and parsing utilities | `scripts/gen-llms.js` (constants, parsing, page/section models) | Site constants and paths, recursive filesystem traversal collecting `.md`/`.mdx` docs, front-matter and `_category_.json` parsing, URL and title derivation with slug handling, and sorted page/section model construction from discovered docs. |
| Generator script: rendering and file output | `scripts/gen-llms.js` (rendering functions, main), `static/llms.txt` | Markdown formatting helpers for section/page listing, JSON structure generation with site metadata, file system I/O, and entry-point logic that produces the generated `llms.txt` sitemap and `ai-sitemap.json` index with counts and summary logging. |
| Automation: npm script and GitHub Actions workflow | `package.json`, `.github/workflows/gen-llms.yml` | New `gen-llms` npm script, workflow triggers on PR/manual dispatch with path filters, conditional execution based on repository and configuration, ref resolution (`develop` for manual, PR head for PR runs), checkout/Node setup, generator execution, staged file detection, and conditional commit/push to the resolved target branch. |
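
To make the rendering layer more concrete, here is a small sketch of how a section/page model might be turned into the two outputs. Everything named here is an assumption for illustration: `SITE_URL`, `renderLlmsTxt`, `buildAiSitemap`, and the JSON field names are hypothetical, and the actual layout of `static/llms.txt` and `static/ai-sitemap.json` in this PR may differ.

```javascript
// Hypothetical sketch of the rendering step; field names and URL are placeholders.
const SITE_URL = 'https://docs.example.com';

// Render a Markdown listing: one H1 for the site, one H2 per section,
// and one bullet link per page with an optional description.
function renderLlmsTxt(siteTitle, sections) {
  const lines = [`# ${siteTitle}`, ''];
  for (const section of sections) {
    lines.push(`## ${section.title}`, '');
    for (const page of section.pages) {
      const suffix = page.description ? `: ${page.description}` : '';
      lines.push(`- [${page.title}](${SITE_URL}${page.url})${suffix}`);
    }
    lines.push('');
  }
  return lines.join('\n');
}

// Build a plain object suitable for JSON.stringify, including summary counts.
function buildAiSitemap(siteTitle, sections) {
  return {
    site: { title: siteTitle, url: SITE_URL },
    sectionCount: sections.length,
    pageCount: sections.reduce((total, s) => total + s.pages.length, 0),
    sections: sections.map((s) => ({
      title: s.title,
      pages: s.pages.map((p) => ({
        title: p.title,
        url: `${SITE_URL}${p.url}`,
        description: p.description,
      })),
    })),
  };
}
```

Keeping both renderers as pure functions over the same model means the file writes can stay in the entry point, and the outputs cannot drift apart in page count.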

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • pressayuliya
  • AmsterGet

Poem

🐰 A rabbit's nibble through the docs so neat,
Scanning paths where markdown and metadata meet,
Sections grouped, sitemaps signed and sealed,
GitHub workflows automate what's revealed—
LLM gardens now thrive in the light! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly identifies the main change: implementing llms.txt functionality, which aligns with adding the generation script, workflow, and output files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/FAQ/test-llms.md (1)

1-10: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Please remove this temporary smoke-test page before merge.

The content explicitly marks itself as temporary; shipping it would add non-product FAQ noise.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/FAQ/test-llms.md` around lines 1 - 10, The file docs/FAQ/test-llms.md
contains a temporary smoke-test page titled "llms.txt regeneration smoke test"
that must be removed before merging; delete this file (or remove its temporary
content and metadata) so the temporary FAQ entry and its frontmatter (title,
description, sidebar_position) are not shipped in the product.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/gen-llms.yml:
- Around line 39-44: The run block that echoes the PR branch embeds `${{
github.event.pull_request.head.ref }}` directly into the shell, risking command
injection; change the step to assign `${{ github.event.pull_request.head.ref }}`
to an environment variable (e.g., PR_HEAD_REF) via the step's env: mapping and
then use that safe variable in the run script when writing to GITHUB_OUTPUT
(replace direct `${{ ... }}` usage with "$PR_HEAD_REF" in the echo to preserve
quoting and prevent shell metacharacter interpretation).

In `@scripts/gen-llms.js`:
- Around line 121-125: The current loadSections function skips pages with empty
page.sectionDir, and main only treats rootPage specially, which causes other
root-level docs to be omitted from llms.txt and ai-sitemap.json; update
loadSections and main so root-level pages (where page.sectionDir is falsy) are
collected (e.g., into a dedicated "root" section or appended to the top-level
list) except for the designated overview/rootPage, and ensure those collected
root entries are emitted into the llms.txt and ai-sitemap.json output generation
paths so all top-level docs are included. Use the loadSections function and the
main logic that references rootPage and page.sectionDir to locate where to add
this handling.

---

Outside diff comments:
In `@docs/FAQ/test-llms.md`:
- Around line 1-10: The file docs/FAQ/test-llms.md contains a temporary
smoke-test page titled "llms.txt regeneration smoke test" that must be removed
before merging; delete this file (or remove its temporary content and metadata)
so the temporary FAQ entry and its frontmatter (title, description,
sidebar_position) are not shipped in the product.
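
One possible shape for the `scripts/gen-llms.js` finding above (root-level pages being dropped) is to bucket pages with an empty `sectionDir` into a dedicated section instead of skipping them. The sketch below is an assumption about how that could look: `groupPages`, `ROOT_SECTION`, and the `relPath` field are hypothetical names, and the real `loadSections`/`main` functions are not reproduced here.

```javascript
// Hypothetical sketch; the PR's loadSections/main may be structured differently.
const ROOT_SECTION = '(root)';

// Group pages by section directory. Pages with a falsy sectionDir go into a
// dedicated root bucket; only the designated overview page is excluded, on the
// assumption that the caller emits it separately.
function groupPages(pages, rootPageRelPath) {
  const bySection = new Map();
  for (const page of pages) {
    if (page.relPath === rootPageRelPath) continue;
    const key = page.sectionDir || ROOT_SECTION;
    if (!bySection.has(key)) bySection.set(key, []);
    bySection.get(key).push(page);
  }
  // Sort by code-unit order so the result does not depend on the locale.
  return [...bySection.entries()]
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([dir, sectionPages]) => ({ dir, pages: sectionPages }));
}
```

With this grouping, a top-level file such as a hypothetical `glossary.md` ends up in the root bucket and is passed to the same renderers as any other section, so it reaches both output files.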

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 858a5cad-cafd-4025-9a3a-dc1dc5fb68a6

📥 Commits

Reviewing files that changed from the base of the PR and between b0efb70 and dd678bd.

📒 Files selected for processing (4)
  • .github/workflows/gen-llms.yml
  • docs/FAQ/test-llms.md
  • package.json
  • scripts/gen-llms.js

Comment thread .github/workflows/gen-llms.yml
Comment thread scripts/gen-llms.js

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/gen-llms.yml:
- Around line 59-69: The workflow currently interpolates
steps.target.outputs.ref directly into the shell's git push invocation (git push
origin HEAD:${{ steps.target.outputs.ref }}), which allows a malicious ref to
inject shell commands; fix it by exporting the target ref into an environment
variable (e.g. TARGET_REF) from the job/step and then use a quoted refspec in
the push (git push origin "HEAD:${TARGET_REF}") so the value is not evaluated by
the shell; update the step that performs the push to reference the env var
(TARGET_REF or similar) and ensure it's quoted, and adjust any related uses of
steps.target.outputs.ref in this script to use the env variable instead.
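
The principle behind both injection findings on `.github/workflows/gen-llms.yml` is that an untrusted ref should reach the shell as data (an environment variable, referenced in quotes) rather than being pasted into the script text. The sketch below illustrates that idea in JavaScript instead of workflow YAML, so it is an analogy rather than the actual fix: `buildRefspecSafely` is a hypothetical helper, and it assumes a POSIX `sh` is available on the machine running it.

```javascript
// Hypothetical illustration of "pass via env, reference in quotes".
// Assumes a POSIX `sh` on PATH; this is not the workflow change itself.
const { execFileSync } = require('node:child_process');

// The script text is a constant. The untrusted value is only reachable through
// the quoted "${TARGET_REF}" expansion, so the shell never parses it as code.
const SAFE_SCRIPT = 'printf "%s" "HEAD:${TARGET_REF}"';

function buildRefspecSafely(untrustedRef) {
  return execFileSync('sh', ['-c', SAFE_SCRIPT], {
    env: { ...process.env, TARGET_REF: untrustedRef },
    encoding: 'utf8',
  });
}

// By contrast, building the script with string interpolation, e.g.
//   `printf "%s" HEAD:${untrustedRef}`
// would let a ref such as "x; some-command" run some-command in the shell.
```

In the workflow, the equivalent move is the one the review describes: map the expression to a variable in the step's `env:` block and use the quoted variable in the `run` script.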

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9abc6ae9-fe82-4dc5-97e5-4313cda697a0

📥 Commits

Reviewing files that changed from the base of the PR and between dd678bd and f4b6b51.

📒 Files selected for processing (4)
  • .github/workflows/gen-llms.yml
  • docs/FAQ/test-llms-second.md
  • static/ai-sitemap.json
  • static/llms.txt
✅ Files skipped from review due to trivial changes (2)
  • docs/FAQ/test-llms-second.md
  • static/llms.txt

Comment thread .github/workflows/gen-llms.yml Outdated