Replace flat sitemap llms.txt with curated, AI-optimized index #954
nortonandreev wants to merge 3 commits into master
Deploying documentation with […]
Latest commit: 85b8a9d
Status: ✅ Deploy successful!
Preview URL: https://148a2bd3.documentation-21k.pages.dev
Branch Preview URL: https://improve-llm-config.documentation-21k.pages.dev
As a follow-up, the pages that are included in the LLMs, should be added a |
This is the new export: Few comments:
cc: @Lougarou |
I agree with this, but this would require a more substantial update of the documentation. Not against it – let's decide on which pages we want to include in the file, once approved, I can update them to include the |
@nortonandreev what worked for me before is to ask Claude to validate and implement Qodo's suggestions. Would you be willing to try that? |
I looked into the Qodo suggestions when I pushed the PR – most of them would actually break the logic. The main one that makes sense is the curation logic, which requires updating the documentation files. So I wanted to keep this for after we have decided on the set of links we want to include (at least the first set to start with; after that the logic should pick things up automatically). Let me double-check with Stephen whether he has had a chance to review the list; if not, I will just go with what I have included for now.
|
@nortonandreev sorry it took me a while to respond. While the new export is an improvement over the previous one, I wonder whether it should be structured hierarchically, the same way as the docs, e.g. top level: getting started, Kurrent Cloud, Kurrent server, etc. Right now it is not entirely clear whether some of the sections belong to the server or to another product.
User description
Replace flat sitemap llms.txt with curated, AI-optimized index
Summary
This PR replaces the auto-generated flat sitemap llms.txt with a curated, categorized index optimized for LLM consumption. Both versions are generated via @vuepress/plugin-llms; the difference is that we now supply custom template getters (getLlmsPluginOptions()) to filter, deduplicate, organize, and describe pages instead of dumping everything into a single flat list.
The exhaustive dump is preserved as llms-full.txt and cross-referenced from the new file's header. Per-page markdown (llms-page.txt) continues to generate as before.
Why this matters
When an LLM (ChatGPT, Claude, Copilot, etc.) ingests llms.txt, whether via an agent fetch, RAG pipeline, or IDE integration, the file's quality directly determines whether the model can find the right content, for the right version, without wasting context window tokens.
The old file made this nearly impossible. The new file makes it straightforward.
What changed
Implementation
A new getLlmsPluginOptions() function provides custom template getters for 15 sections of the curated llms.txt. The code is organized around four reusable factory helpers:
createSlugOrderSection
createPrefixSection
createFilterSection
createFilterSlugOrderSection
Additional helpers:
matchIndexSlug: reusable predicate for matching index/overview pages (e.g., /quick-start/ or /quick-start/index)
getPageDescription: extracts description from frontmatter or auto-excerpt
normalizeIndexUrl: fixes broken /.md URLs to /index.md
Key design decisions:
versioning.latest determines the server prefix; the Kubernetes operator version is resolved from versioning.versions. No hardcoded version strings.
createFilterSection can pass a sortBy comparator for deterministic output regardless of VuePress page order.
[…] http-api paths, so this page appears only under APIs.
The plugin is configured with all three output modes:
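As a rough sketch only, enabling the three outputs could look like the following. The flag names llmsTxt, llmsFullTxt, and llmsPageTxt are assumed to match the plugin's options and should be checked against the @vuepress/plugin-llms documentation; getLlmsPluginOptionsSketch, templateGetters, and buildLlmsConfig are hypothetical stand-ins, not the code in this PR.

```typescript
// Hypothetical stand-in for the three output flags; the real option type is
// exported by @vuepress/plugin-llms and has more fields (assumption).
interface LlmsOutputModes {
  llmsTxt: boolean;     // curated index: llms.txt
  llmsFullTxt: boolean; // exhaustive dump: llms-full.txt
  llmsPageTxt: boolean; // per-page markdown
}

// Placeholder for getLlmsPluginOptions(); the real function returns the
// custom template getters described in this PR. The field name is made up.
function getLlmsPluginOptionsSketch() {
  return { templateGetters: {} };
}

// Merge the custom options with all three output modes switched on.
function buildLlmsConfig<T extends object>(custom: T): T & LlmsOutputModes {
  return { ...custom, llmsTxt: true, llmsFullTxt: true, llmsPageTxt: true };
}

const llmsConfig = buildLlmsConfig(getLlmsPluginOptionsSketch());
```

In a real config the resulting object would be passed to llmsPlugin(); that call is left out here so the sketch stays self-contained.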
Output comparison at a glance
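[Comparison table of the old and new llms.txt; cell contents are not recoverable apart from these fragments: ## Table of Contents, versioning.latest, /server/v26.0/..., https://docs.kurrent.io/..., llms-full.txt cross-reference, sortBy]
Section order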
Sections are ordered by importance for LLM comprehension — understanding what the product is, then how to use it, then how to operate it:
Detailed comparison
Before: flat dump, no version awareness
Problems:
After: curated, categorized, described
How the generation works
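One step of the generation, the description fallback (frontmatter description, then auto-excerpt, then omitted), might be sketched like this. The SketchPage shape, the function names, and the "- [title](url): description" entry format are assumptions for illustration and may differ from the actual getPageDescription in this PR.

```typescript
// Hypothetical page shape; the real VuePress page object has more fields.
interface SketchPage {
  title: string;
  path: string;
  frontmatter: { description?: string };
  excerpt?: string;
}

// Fallback chain: frontmatter description -> auto-excerpt -> undefined (omitted).
function getPageDescriptionSketch(page: SketchPage): string | undefined {
  const fromFrontmatter = page.frontmatter.description?.trim();
  if (fromFrontmatter) return fromFrontmatter;
  // Collapse whitespace so a multi-line excerpt fits on one index line.
  const fromExcerpt = page.excerpt?.replace(/\s+/g, " ").trim();
  if (fromExcerpt) return fromExcerpt;
  return undefined;
}

// Render one index entry; the description suffix is dropped when none exists.
function formatEntry(page: SketchPage, baseUrl: string): string {
  const link = `- [${page.title}](${baseUrl}${page.path})`;
  const description = getPageDescriptionSketch(page);
  return description ? `${link}: ${description}` : link;
}
```

Because frontmatter wins over the excerpt, adding a description to a page changes its entry without any change to the generator.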
Description priority: frontmatter description → auto-excerpt → omitted. Descriptions improve automatically as authors add description to page frontmatter.
Scenario walkthrough
"How do I append events using the Python client?"
[…] Python – Appending events with description
"How do I get started with Kurrent Cloud?"
"How do I deploy KurrentDB on Kubernetes?"
Limitations and known issues
1. Descriptions are auto-generated excerpts, not hand-written summaries
Most descriptions come from VuePress's auto-excerpt (~first 200 characters). This causes:
Auto-Scavenge: Auto-Scavenge, Redaction: Redaction
...using thi...
Backup and restore: Backup and restore Backing up...
Kafka Sink, Elasticsearch Sink (no description)
Mitigation: The generator prioritizes frontmatter description. Adding it to any page immediately improves the output:
2. Community section is hardcoded
Community links are static strings rather than VuePress-derived data. Fine for rarely-changing URLs, but requires a code change if any URL changes.
Recommended follow-ups
[…] description to ~25 pages with empty/echo descriptions
PR Type
Enhancement
Description
Replace flat sitemap with curated, AI-optimized llms.txt index
Organize documentation into 15 categorized sections by importance
Reduce entry count from 500+ to ~100 links with descriptions
Use dynamic version resolution and custom template getters
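The dynamic version resolution can be illustrated with a minimal sketch. The shape of the versioning object is not shown in this PR, so VersioningSketch, VersionGroupSketch, and both functions are hypothetical; the only grounded points are that versioning.latest drives the server prefix (as in /server/v26.0/...) and that the Kubernetes operator version comes from versioning.versions.

```typescript
// Hypothetical shapes; the real versioning config may look different.
interface VersionGroupSketch {
  id: string;         // e.g. a product identifier (assumed)
  basePath: string;   // URL segment for the product (assumed)
  versions: string[]; // assumed to be ordered newest first
}

interface VersioningSketch {
  latest: string; // assumed to be a version label such as "v26.0"
  versions: VersionGroupSketch[];
}

// Server prefix derived from versioning.latest, never a hardcoded string.
function serverPrefix(versioning: VersioningSketch): string {
  return `/server/${versioning.latest}/`;
}

// Newest version prefix for another product, looked up by group id.
function latestPrefixFor(
  versioning: VersioningSketch,
  groupId: string,
): string | undefined {
  const group = versioning.versions.find((g) => g.id === groupId);
  const newest = group?.versions[0];
  return group && newest ? `/${group.basePath}/${newest}/` : undefined;
}
```

Returning undefined for an unknown group lets a section getter skip the section rather than emit links with a broken prefix.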
Diagram Walkthrough
File Walkthrough
config.ts
Wire up custom LLMs plugin configuration
docs/.vuepress/config.ts
[…] getLlmsPluginOptions function from llms config
[…] llmsPlugin() instead of empty config
llms.ts
Implement curated LLMs plugin configuration with section getters
docs/.vuepress/configs/llms.ts
[…] configuration, etc.)
[…] createPrefixSection, createSlugOrderSection, createFilterSection, createFilterSlugOrderSection
[…] object
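To make the helper names concrete, here is a minimal sketch of how two of the factories and two of the small helpers might work. The signatures are assumptions (the actual ones in docs/.vuepress/configs/llms.ts are not shown here), and every identifier carries a Sketch suffix to mark it as hypothetical.

```typescript
// Hypothetical minimal page shape used only by this sketch.
interface PageSketch {
  path: string;
}

type SectionGetter = (pages: PageSketch[]) => PageSketch[];

// Rewrites a broken ".../.md" URL to ".../index.md"; other URLs pass through.
function normalizeIndexUrlSketch(url: string): string {
  return url.endsWith("/.md") ? url.slice(0, -"/.md".length) + "/index.md" : url;
}

// Predicate factory: matches ".../<slug>/" and ".../<slug>/index".
function matchIndexSlugSketch(slug: string): (path: string) => boolean {
  return (path) => {
    const bare = path.replace(/\.(md|html)$/, "");
    return bare.endsWith(`/${slug}/`) || bare.endsWith(`/${slug}/index`);
  };
}

// Section made of every page under a path prefix.
function createPrefixSectionSketch(prefix: string): SectionGetter {
  return (pages) => pages.filter((page) => page.path.startsWith(prefix));
}

// Section made of pages passing a predicate; the optional comparator makes
// the output order independent of the order VuePress supplies pages in.
function createFilterSectionSketch(
  predicate: (page: PageSketch) => boolean,
  sortBy?: (a: PageSketch, b: PageSketch) => number,
): SectionGetter {
  return (pages) => {
    const matched = pages.filter(predicate);
    return sortBy ? [...matched].sort(sortBy) : matched;
  };
}
```

Copying the array before sorting keeps the getter free of side effects on the shared page list, which matters when several sections are built from the same input.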