Fix multi-byte IME input loss in poll_event #7

Open
twokidsCarl wants to merge 1 commit into marcoroth:main from
twokidsCarl:fix/ime-multi-byte-input

@twokidsCarl
Problem

When typing CJK characters via IME (Input Method Editor), such as Chinese "你好", only the first character appears in the TextArea. The second character is silently dropped.

Root cause: The terminal delivers multiple UTF-8 characters in a single read() call. poll_event() calls tea_parse_input_with_consumed() once, which parses only the first UTF-8 character and returns consumed = 3 (bytes). The remaining bytes (the second character) are discarded because the consumed value is never used to process the rest of the buffer.
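
To make the byte arithmetic concrete, here is a minimal sketch of what a single parse does to the six-byte buffer for "你好". The helpers `utf8_seq_len`, `parse_one`, and `bytes_left_after_one_parse` are hypothetical names used only for illustration; they are not the project's `tea_parse_input_with_consumed()`, whose real signature and behavior (escape sequences, key names, and so on) are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of a UTF-8 sequence, judged from its lead byte.
   Invalid lead bytes count as length 1 so the caller always advances. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx, most CJK */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;
}

/* Hypothetical stand-in for a parser that handles one character per call.
   Returns how many bytes of buf it consumed. */
static size_t parse_one(const unsigned char *buf, size_t avail)
{
    assert(buf != NULL && avail > 0);
    size_t n = utf8_seq_len(buf[0]);
    return n <= avail ? n : avail;
}

/* Bytes that a caller parsing only once would leave unprocessed. */
static size_t bytes_left_after_one_parse(const unsigned char *buf, size_t avail)
{
    return avail - parse_one(buf, avail);
}
```

For the buffer `E4 BD A0 E5 A5 BD`, `parse_one` consumes 3 bytes and 3 remain; those trailing bytes are the "好" that went missing.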

Fix

Add an internal byte buffer (pending_buf / pending_len) to bubbletea_program_t. After parsing, any unconsumed bytes (bytes_avail - consumed) are saved to this buffer. On the next poll_event() call, buffered bytes are consumed before reading from stdin.

This ensures all characters from a single read() are eventually delivered as separate events across multiple poll_event() calls.
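
The buffering idea can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code in this PR: `demo_program_t`, `fake_stdin_t`, `fake_read`, and `demo_poll_event` are hypothetical names, stdin is simulated by an in-memory source, and the "parser" only measures UTF-8 sequence length. Only the `pending_buf` / `pending_len` field names and the 256-byte size are taken from the change itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PENDING_CAP 256

/* Hypothetical program state holding only the two new fields. */
typedef struct {
    unsigned char pending_buf[PENDING_CAP];
    size_t pending_len;
} demo_program_t;

/* Hypothetical in-memory stand-in for stdin. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} fake_stdin_t;

/* Like read(): hands over everything available, up to cap bytes. */
static size_t fake_read(fake_stdin_t *s, unsigned char *dst, size_t cap)
{
    size_t n = s->len - s->pos;
    if (n > cap) n = cap;
    if (n > 0) memcpy(dst, s->data + s->pos, n);
    s->pos += n;
    return n;
}

static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1; /* invalid lead byte: still advance by one */
}

/* Delivers one character per call. Buffered bytes are drained before the
   source is read again. Returns the byte count written to out, 0 if none. */
static size_t demo_poll_event(demo_program_t *p, fake_stdin_t *in,
                              unsigned char out[4])
{
    unsigned char buf[PENDING_CAP];
    size_t avail;

    assert(p != NULL && in != NULL && out != NULL);

    if (p->pending_len > 0) {
        /* Leftovers from an earlier read take priority. */
        avail = p->pending_len;
        memcpy(buf, p->pending_buf, avail);
        p->pending_len = 0;
    } else {
        avail = fake_read(in, buf, sizeof buf);
    }
    if (avail == 0) return 0;

    size_t consumed = utf8_seq_len(buf[0]);
    if (consumed > avail) consumed = avail; /* truncated sequence */
    memcpy(out, buf, consumed);

    /* Save whatever was not consumed for the next call. */
    p->pending_len = avail - consumed;
    memcpy(p->pending_buf, buf + consumed, p->pending_len);
    return consumed;
}
```

With a source holding the six bytes of "你好", the first call returns the three bytes of "你" and leaves `pending_len == 3`; the second call returns "好" without touching the source; a third call returns 0. For single-byte ASCII input the buffer stays empty after every call.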

Changes

  • extension.h: Added pending_buf[256] and pending_len fields to bubbletea_program_t
  • program.c: Modified poll_event to drain the pending buffer before reading stdin, and to save any remaining bytes after parsing

Behavior

  • Transparent to Ruby layer: poll_event() still returns a single Hash per call
  • No API change: The run_loop naturally processes buffered events on subsequent iterations
  • Backward compatible: Single-byte ASCII input works identically (buffer stays empty)

Test

Typed "你好" (6 bytes: E4 BD A0 E5 A5 BD) in a Bubbletea TUI with TextArea.

Before: Only "你" appears
After: "你好" appears correctly

Compilation

Verified: compiles cleanly on macOS arm64 with Ruby 4.0.0.

When typing CJK characters via IME (e.g. "你好"), the terminal
delivers multiple UTF-8 characters in a single read(). Previously,
poll_event() called tea_parse_input_with_consumed() once, which
parsed only the first character and discarded the remaining bytes.

This caused only the first character to appear (e.g. "你" instead
of "你好").

Fix: Add an internal byte buffer to bubbletea_program_t. After
parsing, any unconsumed bytes are saved to the buffer. On the next
poll_event() call, buffered bytes are consumed before reading
from stdin. This ensures all characters from a single read are
eventually delivered as separate events.

The fix is transparent to the Ruby layer — poll_event() still
returns a single Hash per call, and the run_loop naturally
processes buffered events on subsequent iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>