guide llms.txt improvements

Follow-up improvements to #412 / #413, based on further testing of the files and feedback (thread: [llms.txt feedback #13648](https://github.com/wagtail/wagtail/discussions/13648)).

- [x] Track files’ token lengths
- [x] Make the files discoverable directly from the site: #416 
- [x] Remove "About" and "Contributing" pages to increase the signal-to-noise ratio
- [x] Set a target token count so the file fits in small models’ context windows (for example, a 32k context)
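
As a rough illustration of the token tracking and target items above, here is a minimal sketch of a budget check. The `estimate_tokens` and `fits_budget` names are hypothetical, and the 4-characters-per-token ratio is an assumption rather than a real tokenizer; an actual check would likely use the tokenizer of the model being targeted.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Return a rough token estimate for text.

    The chars_per_token ratio is an assumed heuristic, not a tokenizer.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_budget(text: str, budget: int = 32_000) -> bool:
    """Return True if the estimated token count is within the budget."""
    return estimate_tokens(text) <= budget
```

A check like this could run in CI against the generated files, so a docs change that pushes a file past the target is noticed early.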

## Postponed

Potential improvements where I’m not sure how high the ROI is:

- [ ] TBC: docs-focused MCP server
- [ ] Add inline heading links to Markdown to encourage direct linking to relevant sections
- [ ] Change the "full" format so there is less ambiguity about documents’ start and end
- [ ] Extract as a package
- [ ] Set up CDN-level caching of `text/plain` and `text/markdown` responses
- [ ] Remove "releases" pages from the files (they shouldn’t be relevant to answer questions about the CMS)
- [x] #415
- [ ] Unit tests for lack of HTML escaping
- [ ] Trial LLM-focused information
	- [ ] Information only present in llms.txt
	- [ ] Information only present in pages’ markdown representations
	- [ ] Information only present in llms-full.txt
	- [ ] Page only linked from llms.txt
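
For the "Unit tests for lack of HTML escaping" item above, one possible shape for the check is sketched below. The `find_html_escapes` helper is hypothetical and not part of the existing code; it assumes that any HTML character reference in the plain-text or Markdown output is unintended, which would give false positives on pages that document HTML entities themselves.

```python
import re

# Matches named references such as &amp; and numeric ones such as &#x27; or &#39;.
HTML_ESCAPE_RE = re.compile(r"&(?:amp|lt|gt|quot|#x?[0-9a-fA-F]+);")


def find_html_escapes(text: str) -> list[str]:
    """Return every HTML character reference found in text, in order."""
    return HTML_ESCAPE_RE.findall(text)
```

A unit test could then render a page containing characters like `&`, `<`, and `'`, and assert that `find_html_escapes` returns an empty list for the output.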
