Write generated docs as UTF-8 in `typer ... utils docs --output` by Sreekant13 · Pull Request #1881 · fastapi/typer

Sreekant13 · 2026-07-03T23:12:31Z

Discussion: #1882

Description

`typer <app> utils docs --output FILE` writes the generated Markdown with `Path.write_text(clean_docs)`, which uses the platform's default encoding. When the CLI's help contains non-ASCII characters (emojis are common in Typer/Rich apps), this raises `UnicodeEncodeError` on interpreters whose locale encoding isn't UTF-8 (for example cp1252 on Windows).

Reproduction:

# emoji_app.py
import typer

app = typer.Typer()


@app.command()
def hello(name: str):
    """Say hello 👋 to someone."""

$ typer emoji_app utils docs --output out.md
UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f44b' ...

This PR writes the file as UTF-8, which matches how the tests read the docs back (`read_text(encoding="utf-8")`).
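The difference can be sketched in isolation. This is a minimal illustration, not the actual Typer source: `write_docs` is a hypothetical helper, and `cp1252` is passed explicitly as a stand-in for a non-UTF-8 locale default so that the failure shows up on any platform.

```python
import tempfile
from pathlib import Path

DOCS = "# `hello`\n\nSay hello 👋 to someone.\n"


def write_docs(path: Path, text: str, encoding: str) -> bool:
    """Write text with the given encoding; return False if it cannot be encoded."""
    try:
        path.write_text(text, encoding=encoding)
    except UnicodeEncodeError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "out.md"
    # Assumption: cp1252 stands in for a non-UTF-8 locale default.
    legacy_ok = write_docs(out, DOCS, encoding="cp1252")
    # An explicit UTF-8 encoding does not depend on the locale.
    utf8_ok = write_docs(out, DOCS, encoding="utf-8")
    round_trip = out.read_text(encoding="utf-8")

print(legacy_ok, utf8_ok, round_trip == DOCS)
```

Passing the encoding explicitly on both the write and the read side keeps the round trip independent of whatever the interpreter would pick by default.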

I also added a regression test that forces a non-UTF-8 locale (LC_ALL=C, PYTHONUTF8=0) so it fails on the old behavior on any platform, not just Windows.
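A rough sketch of how such an environment might be built for a subprocess-based test. The helper names `non_utf8_env` and `preferred_encoding` are hypothetical and are not taken from the PR's test; the encoding a child interpreter actually reports is platform-dependent, so no particular value is assumed here.

```python
import os
import subprocess
import sys


def non_utf8_env(base=None):
    """Return a copy of the environment that requests a non-UTF-8 locale."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"      # takes precedence over LANG and the other LC_* variables
    env["PYTHONUTF8"] = "0"  # explicitly turns Python's UTF-8 mode off
    return env


def preferred_encoding(env):
    """Ask a fresh interpreter which text encoding it would use by default."""
    code = "import locale; print(locale.getpreferredencoding(False))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The reported name varies by platform, so it is only printed, not asserted.
    print(preferred_encoding(non_utf8_env()))
```

A test along these lines would run the docs command in a child process with this environment and then read the output file back with `encoding="utf-8"`.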

`typer <app> utils docs --output FILE` wrote the Markdown file using the platform's default encoding, so non-ASCII help (for example emojis, which are common in Typer/Rich CLIs) raised UnicodeEncodeError on interpreters where the locale encoding is not UTF-8, such as cp1252 on Windows. Write the file as UTF-8, matching how the docs are read back in the tests, and add a regression test that forces a non-UTF-8 locale so it fails on the old behavior on any platform.

Sreekant13 · 2026-07-04T00:16:22Z

Heads up: check-labels is red only because no category label is set yet. This is a bug fix, so the label would be `bug`. I can't add labels myself.

phalberg · 2026-07-04T10:45:14Z

Hey, thanks for contributing! A few notes, even though I am not a maintainer:

Some of the existing tests still call read_text() without encoding="utf-8". Changing this in test_doc_output and test_doc_title_output as well could be considered for consistency.
Also, in my opinion the test now covers two separate concerns: checking emoji support, and forcing a non-UTF-8 environment in which UTF-8 is still used under the hood. Splitting it into two tests could also be considered.
Another point is that you are forcing quite a lot in the environment. PYTHONUTF8=0 is probably enough, and I could even imagine this test case being hard to maintain in the future if the environment is so "artificial".

These are just some thoughts I had while looking at the PR.

LC_ALL=C already overrides LANG, so LANG=C was redundant. Keep LC_ALL=C (forces a non-UTF-8 locale) and PYTHONUTF8=0 (keeps it non-UTF-8 on 3.15+ where UTF-8 mode is on by default), with a comment explaining why each is needed for the test to fail on the old behavior on any platform.

Sreekant13 · 2026-07-04T17:38:45Z

Thanks for the review, @phalberg!

On the env vars: good call on LANG=C. LC_ALL already overrides it, so I have dropped it (just pushed). I kept LC_ALL=C and PYTHONUTF8=0 though: PYTHONUTF8=0 on its own isn't enough, because most CI locales are already UTF-8, so getpreferredencoding() stays UTF-8 with UTF-8 mode off and the old code wouldn't fail there. LC_ALL=C is what forces a non-UTF-8 preferred encoding. I keep PYTHONUTF8=0 too so it still holds on Python 3.15+, where UTF-8 mode is on by default (PEP 686). Added a comment explaining that.
On splitting the test: I kept it as one on purpose. The regression only shows up when both conditions hold together: non-ASCII content and a non-UTF-8 locale. Emoji content alone passes on the old code in a UTF-8 environment, and a non-UTF-8 locale with ASCII content passes too, so splitting them would mean neither half fails without the fix. Happy to restructure if you have a split in mind that still guards the regression.
On updating the older tests to read_text(encoding="utf-8"): makes sense for consistency. I left them alone to keep this PR focused on the fix, but I am glad to include that here if you and the maintainers would prefer.
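The point about both conditions being required can be illustrated with a small truth table. This is only a sketch: `can_encode` is a hypothetical helper, and the `ascii` codec is used as a stand-in for the encoding a C locale would typically select.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if text is representable in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


HELP_TEXTS = {
    "ascii help": "Say hello to someone.",
    "emoji help": "Say hello 👋 to someone.",
}
# Assumption: "ascii" approximates the default encoding under LC_ALL=C.
ENCODINGS = {"UTF-8 locale": "utf-8", "C locale": "ascii"}

failures = [
    (text_name, enc_name)
    for text_name, text in HELP_TEXTS.items()
    for enc_name, enc in ENCODINGS.items()
    if not can_encode(text, enc)
]
print(failures)  # [('emoji help', 'C locale')]
```

Only the combination of non-ASCII content and a non-UTF-8 encoding fails, which is why a test that isolates either condition alone would pass on the old code.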

Thanks again!

Sreekant13 wants to merge 2 commits into fastapi:master from Sreekant13:fix/docs-output-utf8
