fix: add explicit encoding='utf-8' to text-mode open() calls #7774

Closed
avoronov-explyt wants to merge 1 commit into microsoft:main from avoronov-explyt:fix/utf8-encoding-5566

Conversation

@avoronov-explyt
Problem

On systems whose default encoding is not UTF-8 (e.g., cp950 on Chinese Windows), open() without encoding= falls back to the locale encoding. Reading a file that contains non-ASCII characters can then raise UnicodeDecodeError, and writing one can raise UnicodeEncodeError.

This is reported in #5566 where PlaywrightController.__init__ fails reading page_script.js on a cp950 system.
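The failure mode can be reproduced on any machine by passing the problematic encoding explicitly instead of relying on the locale. The following is a minimal sketch, not code from the PR: the sample text, the file name, and the helper names `write_then_read` and `try_encoding` are invented for illustration.

```python
import tempfile
from pathlib import Path

# Invented sample content with CJK, Cyrillic, and an emoji.
SAMPLE = "console.log('你好, мир 😀');"


def write_then_read(path: Path, encoding: str) -> str:
    """Write SAMPLE to path and read it back with the same explicit encoding."""
    with open(path, "w", encoding=encoding) as f:
        f.write(SAMPLE)
    with open(path, "r", encoding=encoding) as f:
        return f.read()


def try_encoding(encoding: str) -> str:
    """Return "ok" or the name of the Unicode error raised for this encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        try:
            write_then_read(Path(tmp) / "sample.js", encoding)
        except UnicodeError as exc:
            # UnicodeError is the base class of both UnicodeEncodeError
            # and UnicodeDecodeError.
            return type(exc).__name__
        return "ok"


if __name__ == "__main__":
    print("utf-8:", try_encoding("utf-8"))
    print("cp950:", try_encoding("cp950"))
```

Here cp950 stands in for the locale default on an affected Windows machine. The emoji has no cp950 representation, so the write fails before anything is read back.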

Changes

The original bug in playwright_controller.py was already fixed on main. This PR addresses the remaining open() calls in autogen-ext that lack explicit encoding="utf-8":

| File | Lines changed |
| --- | --- |
| page_logger.py | 3 open() calls (writing hash text, HTML call tree, HTML pages) |
| chat_completion_client_recorder.py | 2 open() calls (reading/writing JSON session files) |
| docker_jupyter/_docker_jupyter.py | 1 open() call (writing HTML output) |

All changes add encoding="utf-8" to text-mode open() calls. Binary-mode calls ("rb", "wb") are left unchanged.
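The split between text-mode and binary-mode calls can be sketched as follows. This is an illustration of the pattern under assumed names (`write_html`, `read_raw`, and `binary_open_accepts_encoding` are hypothetical helpers, not functions from autogen-ext).

```python
from pathlib import Path


def write_html(path: Path, html: str) -> None:
    # Text mode: pin the encoding so the result does not depend on the locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)


def read_raw(path: Path) -> bytes:
    # Binary mode: bytes pass through untouched, so no encoding is involved.
    with open(path, "rb") as f:
        return f.read()


def binary_open_accepts_encoding(path: Path) -> bool:
    # open() rejects encoding= in binary mode by raising ValueError.
    try:
        with open(path, "rb", encoding="utf-8"):
            return True
    except ValueError:
        return False
```

Binary-mode calls need no change because they never encode or decode text, and adding encoding= to them would be an error rather than a no-op.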

Testing

Added test_utf8_encoding.py with tests verifying:

  • page_script.js can be read with explicit UTF-8 encoding
  • JSON round-trip with non-ASCII characters (Cyrillic, CJK, emoji)
  • HTML write with non-ASCII characters
  • Source files contain encoding="utf-8" in all text-mode open() calls
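The contents of test_utf8_encoding.py are not shown here, so the following is only a sketch of how two of these checks might look. The helper names are hypothetical, and `ensure_ascii=False` is an assumption: with the `json.dump` default, non-ASCII characters are escaped to ASCII and the file encoding would not be exercised at all.

```python
import ast
import json
import tempfile
from pathlib import Path


def json_round_trip(data: dict) -> dict:
    """Write data as UTF-8 JSON to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "session.json"
        with open(path, "w", encoding="utf-8") as f:
            # ensure_ascii=False keeps non-ASCII characters literal in the file.
            json.dump(data, f, ensure_ascii=False)
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)


def open_calls_missing_encoding(source: str) -> list[int]:
    """Return line numbers of text-mode open() calls that lack encoding=."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        is_open_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "open"
        )
        if not is_open_call:
            continue
        # The mode may be the second positional argument or mode=...
        mode = None
        if len(node.args) >= 2 and isinstance(node.args[1], ast.Constant):
            mode = node.args[1].value
        for kw in node.keywords:
            if kw.arg == "mode" and isinstance(kw.value, ast.Constant):
                mode = kw.value.value
        if isinstance(mode, str) and "b" in mode:
            continue  # binary mode takes no encoding
        if not any(kw.arg == "encoding" for kw in node.keywords):
            missing.append(node.lineno)
    return sorted(missing)
```

Parsing with `ast` rather than searching for the string encoding="utf-8" avoids false positives from comments and false negatives from different quoting. It only catches bare `open(...)` calls with a literal mode, which is a deliberate simplification.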

Fixes #5566

On systems with a non-UTF-8 default encoding (e.g., cp950 on Chinese
Windows), open() without encoding= can raise UnicodeDecodeError when
reading, or UnicodeEncodeError when writing, files containing
non-ASCII characters.

Fixes:
- playwright_controller.py: already fixed on main (verified)
- page_logger.py: 3 open() calls added encoding='utf-8'
- chat_completion_client_recorder.py: 2 open() calls added encoding='utf-8'
- docker_jupyter/_docker_jupyter.py: 1 open() call added encoding='utf-8'

Added test_utf8_encoding.py to verify encoding behavior.

Fixes #5566
@avoronov-explyt closed this by deleting the head repository on May 30, 2026
Development

Successfully merging this pull request may close these issues.

open needs encoding='utf-8' for non-english environment, error in playwright_controller.py