Experiment: improve docs performance #2524

ohmayr · 2026-01-15T19:26:45Z

This PR improves the docs generation performance.

gemini-code-assist · 2026-01-15T19:27:05Z

Summary of Changes

Hello @ohmayr, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of documentation generation by optimizing the Markdown-to-RST conversion process. It introduces a multi-tiered approach, prioritizing a new, fast, pure-Python converter for most common cases and leveraging a caching mechanism. The external pypandoc tool is now reserved as a fallback for more intricate Markdown elements, thereby reducing its invocation and improving overall efficiency.

Highlights

Optimized Markdown to RST Conversion: Introduced a new pure-Python Markdown-to-RST converter (_tuned_fast_convert) that efficiently handles common Markdown elements like code blocks, various link types, bold/italics, headings, and lists using regular expressions.
Conversion Caching: Implemented a caching mechanism (_RAW_RST_CACHE) to store previously converted RST snippets, significantly reducing redundant conversions and improving performance for frequently encountered text blocks.
Tiered Conversion Strategy: Refactored the rst conversion logic to prioritize a 'Super Fast Path' for unformatted text, followed by a cache lookup, then the new fast Python converter, and finally falling back to the slower pypandoc only for complex Markdown structures such as tables and images.
Code Refactoring: Extracted common post-processing logic for handling trailing newlines and appending periods into a new helper function _finalize, improving code organization and readability within the rst function.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This PR aims to improve documentation generation performance by introducing a custom, pure-Python Markdown-to-RST converter for common cases, which is a good strategy. While reviewing the changes, I found a few areas for improvement. I've identified a bug in heading conversion that can produce invalid RST, an omission in handling numbered lists, and a formatting inconsistency. Most critically, the new logic is not covered by tests, and the changes appear to break existing tests. Please see my detailed comments for suggestions on how to fix these problems. Adding a robust test suite will be essential to ensure the correctness of this performance optimization.

gemini-code-assist · 2026-01-15T19:28:46Z

gapic/utils/rst.py

+# Cache for the few complex items we actually send to pandoc
+_RAW_RST_CACHE: Dict[str, str] = {}
+
+def _tuned_fast_convert(text: str) -> Optional[str]:


This new conversion logic in _tuned_fast_convert is a critical part of the performance improvement, but it lacks unit tests. Additionally, the changes in the rst function appear to break the existing tests in tests/unit/utils/test_rst.py (e.g., test_rst_formatted), as they now bypass the mocked pypandoc.convert_text call.

It's crucial to:

Add comprehensive unit tests for _tuned_fast_convert covering all the regex conversions and edge cases (links, code blocks, headings, lists).

Update the existing tests for the rst function to reflect the new logic, including caching and the fast-path converter.

Without proper tests, it's hard to verify the correctness of these performance optimizations and prevent future regressions.

gemini-code-assist · 2026-01-15T19:28:46Z

gapic/utils/rst.py

+    converted = re.sub(r"^# (.*)$", r"\1\n" + "=" * 10, converted, flags=re.MULTILINE)
+    converted = re.sub(r"^## (.*)$", r"\1\n" + "-" * 10, converted, flags=re.MULTILINE)


The RST specification requires that heading underlines are at least as long as the heading text. Using a fixed length of 10 characters for the underline can result in invalid RST for headings longer than 10 characters, which would break the documentation build. The replacement in re.sub should be a function that calculates the correct underline length based on the matched heading text.

Suggested change

converted = re.sub(r"^# (.*)$", r"\1\n" + "=" * 10, converted, flags=re.MULTILINE)

converted = re.sub(r"^## (.*)$", r"\1\n" + "-" * 10, converted, flags=re.MULTILINE)

converted = re.sub(r"^# (.*)$", lambda m: f"{m.group(1)}\n{'=' * len(m.group(1))}", converted, flags=re.MULTILINE)

converted = re.sub(r"^## (.*)$", lambda m: f"{m.group(1)}\n{'-' * len(m.group(1))}", converted, flags=re.MULTILINE)

gemini-code-assist · 2026-01-15T19:28:46Z

gapic/utils/rst.py

+    converted = re.sub(r"^## (.*)$", r"\1\n" + "-" * 10, converted, flags=re.MULTILINE)
+
+    # F. LISTS: Markdown (- item) needs a preceding newline for RST.
+    converted = re.sub(r"(\n[^-*].*)\n\s*([-*] )", r"\1\n\n\2", converted)


The regex for list conversion only handles unordered lists starting with - or *. It does not handle numbered lists (e.g., 1. item). This will cause numbered lists to be rendered incorrectly, as they also require a blank line preceding them in RST. The logic should be updated to also handle numbered lists.

gemini-code-assist · 2026-01-15T19:28:46Z

gapic/utils/rst.py

+    if "::" in raw_rst or ".. code" in raw_rst:
+        answer = raw_rst.replace("\n", f"\n{' ' * indent}")
+    else:
+        answer = wrap(raw_rst, indent=indent, offset=indent, width=width - indent)


The call to wrap here uses offset=indent, while the "Super Fast Path" on line 73 uses offset=indent + 3. This inconsistency can lead to different wrapping for plain text versus text with markdown, resulting in inconsistent formatting. For consistent output, the offset should be the same in both places.

Suggested change

answer = wrap(raw_rst, indent=indent, offset=indent, width=width - indent)

answer = wrap(raw_rst, indent=indent, offset=indent + 3, width=width - indent)

improve docs performance

0f05b61

product-auto-label bot added the size: m Pull request size is medium. label Jan 15, 2026

gemini-code-assist bot reviewed Jan 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Experiment: improve docs performance #2524

Experiment: improve docs performance #2524

ohmayr commented Jan 15, 2026

Uh oh!

gemini-code-assist bot commented Jan 15, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Jan 15, 2026

Uh oh!

gemini-code-assist bot Jan 15, 2026

Uh oh!

gemini-code-assist bot Jan 15, 2026

Uh oh!

gemini-code-assist bot Jan 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		converted = re.sub(r"^# (.)$", r"\1\n" + "=" 10, converted, flags=re.MULTILINE)
		converted = re.sub(r"^## (.)$", r"\1\n" + "-" 10, converted, flags=re.MULTILINE)

	answer = wrap(raw_rst, indent=indent, offset=indent, width=width - indent)
	answer = wrap(raw_rst, indent=indent, offset=indent + 3, width=width - indent)

Experiment: improve docs performance #2524

Are you sure you want to change the base?

Experiment: improve docs performance #2524

Conversation

ohmayr commented Jan 15, 2026

Uh oh!

gemini-code-assist bot commented Jan 15, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jan 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Jan 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant