Follow-up improvements to #412 / #413, based on further testing of the files and feedback (thread: [llms.txt feedback #13648](https://github.com/wagtail/wagtail/discussions/13648)).

- [x] Track files’ token lengths
- [x] Make the files discoverable directly from the site: #416
- [x] Remove "About" and "Contributing" pages to increase signal to noise
- [x] Set a target token count so the file fits in small models’ context windows (for example, a 32k context)

## Postponed

Potential improvements where I’m not sure how high the ROI is:

- [ ] TBC: docs-focused MCP server
- [ ] Add inline heading links to Markdown to encourage direct linking to relevant sections
- [ ] Change the "full" format so there is less ambiguity about documents’ start and end
- [ ] Extract as a package
- [ ] Set up CDN-level caching of `text/plain` and `text/markdown` responses
- [ ] Remove "releases" pages from the files (they shouldn’t be relevant to answer questions about the CMS)
- [x] #415
- [ ] Unit tests for lack of HTML escaping
- [ ] Trial LLM-focused information
  - [ ] Information only present in llms.txt
  - [ ] Information only present in pages’ markdown representations
  - [ ] Information only present in llms-full.txt
  - [ ] Page only linked from llms.txt
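
For the token-count target mentioned in the checklist, a rough budget check could look like the sketch below. It is only an illustration: the 4-characters-per-token ratio is a common rule of thumb rather than a real tokenizer, and the names `estimate_tokens`, `fits_budget`, and `TOKEN_BUDGET` are hypothetical, not part of the existing code.

```python
import math

# Assumption: about 4 characters per token, a rule of thumb for English text.
# A real tokenizer will give different counts.
CHARS_PER_TOKEN = 4

# The 32k context window used as the example target above.
TOKEN_BUDGET = 32_000


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_budget(text: str, budget: int = TOKEN_BUDGET) -> bool:
    """Return True if the estimated token count is within the budget."""
    return estimate_tokens(text) <= budget
```

A check like this could run against the generated file's contents in CI, with the heuristic swapped for an actual tokenizer if more precise counts are needed.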