Fix Unicode escape sequences in tool input output by tim-watcha · Pull Request #36 · ZeroSumQuant/claude-conversation-extractor

tim-watcha · 2025-10-29T01:44:01Z

Summary

Fixes Unicode characters (Korean, Chinese, Japanese, etc.) being displayed as escape sequences like \uc0ac\uc6a9\uc790 instead of actual readable text in HTML, JSON, and Markdown exports when using the --detailed flag.

Problem

When exporting conversations with the --detailed flag, tool inputs containing non-ASCII characters were being escaped:

Before:

{
  "thought": "\uc0ac\uc6a9\uc790\uac00 \ucf58\ud150\uce20 \uc81c\ubaa9\uc744 \ud568\uaed8 \uc54c\ub824\ub2ec\ub77c\uace0 \uc694\uccad\ud588\uc2b5\ub2c8\ub2e4",
  "thoughtNumber": 1
}

After:

{
  "thought": "사용자가 콘텐츠 제목을 함께 알려달라고 요청했습니다",
  "thoughtNumber": 1
}

The escaped output made exported conversations with non-ASCII tool inputs unreadable, which was especially problematic for international users.

Solution

Added ensure_ascii=False parameter to json.dumps() calls when serializing tool inputs. Python's json.dumps() uses ensure_ascii=True by default, which escapes all non-ASCII characters.
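The difference can be reproduced with the standard library alone. This is a minimal sketch, not the project's code: the `tool_input` dictionary is illustrative sample data modeled on the Before/After example.

```python
import json

# Illustrative sample data (an assumption, not taken from the project's code).
tool_input = {"thought": "사용자", "thoughtNumber": 1}

# Default behavior: ensure_ascii=True escapes every non-ASCII character as \uXXXX.
escaped = json.dumps(tool_input, indent=2)

# With ensure_ascii=False the characters are written through unchanged.
readable = json.dumps(tool_input, indent=2, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same Python object; only the text form differs.
assert json.loads(escaped) == json.loads(readable)
```

Because both forms are valid JSON that parse to identical data, the change only affects how the export reads to a human, not what a JSON parser sees.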

Changes

Line 125 (extract_conversation method): Added ensure_ascii=False to tool_use content formatting
Line 186 (_extract_text_content method): Added ensure_ascii=False to detailed mode tool_use formatting
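The kind of change made at both call sites might look like the sketch below. The helper `format_tool_use`, the `name` and `input` keys, and the output layout are assumptions for illustration; the actual code in extract_conversation and _extract_text_content is not shown in this PR description.

```python
import json


def format_tool_use(item: dict) -> str:
    """Hypothetical helper: render a tool_use content item as readable text."""
    name = item.get("name", "unknown")      # assumed key
    tool_input = item.get("input", {})      # assumed key
    # ensure_ascii=False keeps Korean, Chinese, Japanese, etc. readable
    # instead of emitting \uXXXX escape sequences.
    body = json.dumps(tool_input, indent=2, ensure_ascii=False)
    return f"Tool: {name}\nInput: {body}"


# Illustrative item; the tool name is made up for this example.
item = {
    "type": "tool_use",
    "name": "example_tool",
    "input": {"thought": "요청", "thoughtNumber": 1},
}
rendered = format_tool_use(item)
print(rendered)
```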

Testing

✅ Core unit tests pass (test_extract_text_content_list)
✅ Manual verification with Unicode characters confirms proper display
✅ No breaking changes to existing functionality

Impact

Improves readability for all non-ASCII characters in tool inputs
Affects HTML, JSON, and Markdown exports when using --detailed flag
No impact on basic extraction without --detailed flag
Backward compatible - no API or CLI changes

🤖 Generated with Claude Code

Added ensure_ascii=False parameter to json.dumps() calls when serializing tool inputs in both extract_conversation() and _extract_text_content() methods. This prevents non-ASCII characters (e.g., Korean, Chinese, Japanese) from being escaped as \uXXXX sequences in HTML, JSON, and Markdown exports.

Changes:
- Line 125: Added ensure_ascii=False to tool_use content formatting
- Line 186: Added ensure_ascii=False to detailed mode tool_use formatting

Fixes issue where tool inputs with Unicode characters were displaying as escape sequences like \uc0ac\uc6a9\uc790 instead of actual text.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

ZeroSumQuant · 2025-10-29T21:03:47Z

Thank you for submitting this! I'll be working on the repo more here soon and I'll check this out and get it sorted!

- Implement project filtering with `--project` flag (Issue #38)
- Improve Windows compatibility:
  - Add UTF-8 stdout reconfiguration
  - Remove Unix-specific code (`realtime_search.py`)
- Replace print with logging in `extract_claude_logs.py` (Issue #28)
- Add comprehensive type hints (Issue #27)
- Fix interactive UI tests by mocking `Path.stat`
- Add PDF/DOCX export capabilities (PRs #34, #36, #37 logic integrated)
- Support metadata (--title, --description, --tags) and todo extraction
- Clean up magic numbers into `constants.py`
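The "UTF-8 stdout reconfiguration" item matters for the same Unicode output: on Windows consoles the default stdout encoding may not be able to represent Korean, Chinese, or Japanese text. One common way to handle this is sketched below, assuming Python 3.7+; the function name `ensure_utf8_stdout` is hypothetical and the commit's actual implementation is not shown here.

```python
import sys


def ensure_utf8_stdout() -> bool:
    """Hypothetical helper: switch stdout to UTF-8 where the stream supports it.

    Returns True if the stream was reconfigured, False if stdout has no
    reconfigure() method (for example, when it was replaced by a StringIO).
    """
    reconfigure = getattr(sys.stdout, "reconfigure", None)
    if reconfigure is None:
        return False
    # TextIOWrapper.reconfigure() is available since Python 3.7.
    reconfigure(encoding="utf-8")
    return True


changed = ensure_utf8_stdout()
print("사용자")  # readable once stdout encodes as UTF-8
```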

ZeroSumQuant pushed a commit that referenced this pull request Jan 1, 2026

Merge PR #36: Fix Unicode escape sequences in tool input output

81d2af1

Combined with PR #37's INDENT_NUMBER constant while preserving ensure_ascii=False for proper Unicode handling.

sytelus added a commit to sytelus/claude-sessions that referenced this pull request Jan 7, 2026

Merge PR ZeroSumQuant#37: Refactor - Extract magic numbers to named c…

fdfcdb0

…onstants Combined with PR ZeroSumQuant#36's ensure_ascii=False for proper Unicode handling.

Fix Unicode escape sequences in tool input output #36
tim-watcha wants to merge 1 commit into ZeroSumQuant:main from tim-watcha:fix/unicode-escape-in-tool-output
