Conversation

@thisisjofrank
Collaborator

  • Added LLM resource outputs: llms-summary.txt and structured llms.json (from Orama summary), plus generation wiring in build and test scripts.
  • Improved LLM generation to parse YAML frontmatter, respect frontmatter URLs, and add summaries when descriptions are missing.
  • Added an AI entrypoint page and linked it from llms.txt.
  • Updated robots.txt to allow LLM-related endpoints.
  • Fixed a malformed frontmatter title.
  • Linked John's Deno LLM Skills repo in the Optional section of LLM indexes.

@thisisjofrank marked this pull request as ready for review February 2, 2026 10:25
Contributor

Copilot AI left a comment

Pull request overview

This PR adds comprehensive LLM optimization features to the Deno documentation site, introducing new resource files and improving frontmatter parsing for better LLM consumption of the documentation.

Changes:

  • Added new LLM resource outputs: llms-summary.txt (compact index), llms.json (structured JSON from Orama), and enhanced existing llms.txt and llms-full.txt generation
  • Improved LLM generation to parse YAML frontmatter, respect frontmatter URLs, extract summaries from content, and handle missing descriptions
  • Added AI entrypoint page at /ai/ with links to LLM resources and updated robots.txt to allow LLM-related endpoints
  • Fixed malformed frontmatter title in tunnel database tutorial
  • Updated standard library package versions (auto-generated from JSR)
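
The frontmatter handling described above (frontmatter URL wins, summary fills in for a missing description) could look roughly like the sketch below. This is only an illustration: `readFrontmatter`, `describePage`, and the `defaultUrl` parameter are hypothetical names, the hand-rolled `key: value` reader stands in for a real YAML parser, and none of it is taken from generate_llms_files.ts.

```typescript
// Hypothetical sketch; the PR's actual implementation is not shown in this thread.
type Frontmatter = Partial<Record<string, string>>;

// Reads simple "key: value" pairs between leading "---" markers.
// Assumption: a real build would use a proper YAML parser instead.
function readFrontmatter(source: string): { meta: Frontmatter; body: string } {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { meta: {}, body: source };
  const meta: Frontmatter = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const key = line.slice(0, sep).trim();
    // Drop one surrounding quote character on each side, if present.
    const value = line.slice(sep + 1).trim().replace(/^["']|["']$/g, "");
    if (key) meta[key] = value;
  }
  return { meta, body: source.slice(match[0].length) };
}

// Prefers frontmatter values; falls back to a default URL and to the
// first non-heading paragraph when url or description is missing or empty.
function describePage(
  source: string,
  defaultUrl: string,
): { url: string; description: string | null } {
  const { meta, body } = readFrontmatter(source);
  const firstParagraph = body
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .find((p) => p !== "" && !p.startsWith("#"));
  return {
    url: meta.url || defaultUrl,
    description: meta.description || firstParagraph || null,
  };
}
```

Using `||` rather than `??` treats an empty `description:` line the same as a missing one, which seems closer to the intent of "handle missing descriptions".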

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

File | Description
generate_llms_files.ts | Core changes: added YAML parsing, URL resolution from frontmatter, summary extraction, new functions for generating llms-summary.txt and llms.json, scoring logic for summary candidates
test_llms_gen.ts | Updated test to include new generation functions (generateLlmsSummaryTxt, generateLlmsJson, loadOramaSummaryIndex)
_config.ts | Updated build script to generate new LLM resource files during site build
static/robots.txt | Added Allow directives for /ai/ and all LLM resource files
ai/index.md | New AI entrypoint page documenting available LLM resources and usage notes
examples/tutorials/tunnel_database.md | Fixed duplicate "title:" prefix in frontmatter
runtime/reference/std/*.md | Auto-generated version bumps for standard library packages (legitimate updates from JSR)

Comment on lines 221 to 223
while ((match = H2_REGEX.exec(markdownContent)) !== null) {
  h2Sections.push(match[1]);
}

Copilot AI Feb 5, 2026

H2_REGEX uses the global flag (/gm) and is reused across multiple files. A regular expression with the global flag keeps a lastIndex property that persists between calls. A loop that runs exec() until it returns null leaves lastIndex at 0, so the current code resets itself, but it is safer to either (1) set H2_REGEX.lastIndex = 0 before the while loop, or (2) use markdownContent.matchAll(H2_REGEX) instead of exec() in a loop. Option (1) guards directly against a stale lastIndex; option (2) avoids mutating the shared regex at all.
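
A minimal sketch of both options follows. The `H2_REGEX` pattern here is a hypothetical stand-in, since the real one is not shown in this thread, and the function names are made up for illustration. One detail worth knowing: matchAll() iterates over an internal copy of the regex, but that copy starts from the original's current lastIndex, so the second variant builds a fresh RegExp per call to be fully independent of shared state.

```typescript
// Hypothetical stand-in for the PR's H2_REGEX (the real pattern is not shown).
const H2_REGEX = /^## (.+)$/gm;

// Option 1: keep the exec() loop, but reset lastIndex before scanning.
function h2WithExec(markdownContent: string): string[] {
  const h2Sections: string[] = [];
  H2_REGEX.lastIndex = 0; // discard any position left by an earlier scan
  let match: RegExpExecArray | null;
  while ((match = H2_REGEX.exec(markdownContent)) !== null) {
    h2Sections.push(match[1]);
  }
  return h2Sections;
}

// Option 2: matchAll() on a fresh RegExp, which starts at lastIndex 0
// and never mutates the shared H2_REGEX.
function h2WithFreshRegex(markdownContent: string): string[] {
  const pattern = new RegExp(H2_REGEX.source, H2_REGEX.flags);
  return Array.from(markdownContent.matchAll(pattern), (m) => m[1]);
}
```

Either variant returns the same headings for the same input regardless of what an earlier call left in `H2_REGEX.lastIndex`.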

function extractSummary(markdownContent: string): string | null {
  const withoutCode = markdownContent.replace(/```[\s\S]*?```/g, "");
  const withoutHtml = withoutCode.replace(/<[^>]+>/g, "");

Copilot AI Feb 5, 2026

The code-block regex only matches triple-backtick fences (```), so inline code in single backticks remains in the summary. That is reasonable for technical documentation if inline code should be preserved. If it should not, strip the single backticks as well.

Suggested change
- const withoutHtml = withoutCode.replace(/<[^>]+>/g, "");
+ const withoutInlineCode = withoutCode.replace(/`([^`]+)`/g, "$1");
+ const withoutHtml = withoutInlineCode.replace(/<[^>]+>/g, "");
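
To see what the suggested ordering does end to end, here is a small self-contained sketch. `stripMarkup` is a hypothetical helper, not the PR's extractSummary; the fence pattern is written with a `{3}` quantifier, which matches the same text as the triple-backtick pattern in the thread, and the final whitespace collapse is an added assumption.

```typescript
// Hypothetical helper mirroring the three replace() steps from the suggestion.
function stripMarkup(markdownContent: string): string {
  // Remove fenced code blocks (same matches as the thread's triple-backtick pattern).
  const withoutCode = markdownContent.replace(/`{3}[\s\S]*?`{3}/g, "");
  // Unwrap inline code: keep the text, drop the backticks.
  const withoutInlineCode = withoutCode.replace(/`([^`]+)`/g, "$1");
  // Strip anything that looks like an HTML tag.
  const withoutHtml = withoutInlineCode.replace(/<[^>]+>/g, "");
  // Assumption for this sketch: collapse whitespace so results are easy to compare.
  return withoutHtml.replace(/\s+/g, " ").trim();
}

const fence = "`".repeat(3);
const sample = "Run `deno task build` to <b>generate</b> files.\n\n" +
  fence + "ts\nconsole.log(1);\n" + fence + "\n";
console.log(stripMarkup(sample)); // prints "Run deno task build to generate files."
```

One consequence worth checking before adopting this order: once inline code is unwrapped, text such as `Array<string>` is exposed to the tag-stripping step, so "Returns `Array<string>`." becomes "Returns Array.". Stripping HTML before unwrapping inline code would not avoid this either, since the tag regex does not know about backticks.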

donjo and others added 2 commits February 5, 2026 09:13
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Member

@donjo left a comment

LGTM, I'll let you decide if you want to address any of the Copilot questions.

@thisisjofrank merged commit df77291 into main Feb 6, 2026
2 checks passed
@thisisjofrank deleted the llms-optimise branch February 6, 2026 15:19

3 participants