Conversation
brliron
commented
Aug 29, 2018
- Fix unicode input (before this fix, the é character moved the cursor 2 characters to the right and required 2 backspaces to remove).
- Switch input to non-blocking mode.
- Fix end key.
- Implement delete key.
There was a problem hiding this comment.
- I like the ESCDELAY feature. I didn't know about that.
- The
while call in gc.ui_thread.funcsblocks are there so that functions actually finish execution before moving on. Leaving them out would cause problems. - Wide characters take up more memory. Normal characters still work with unicode, specifically UTF-8. Using wide characters may also cause locale issues since that would be UTF-16. Mixing UTF-8 and UTF-16 is messy.
- the
if ch == -1is necessary because I've said input tonodelay. If it's omitted, it continues execution, causing issues. You did remove thenodelay, and it may not be necessary now since Discline-curses uses threads and doesn't rely on asyncio for everything anymore. I'll do some tests to see if it's still required. - The input buffer needs to be a list since characters need to be inserted in various places according to where the cursor is. Strings are immutable and therefore a new string would need to be created each and every time there is a change.
- The various
time.sleeplines were originally necessary because of Discline-curses' dependence on asyncio. They are probably no longer required which you noticed.
As it stands now, I cannot approve the pull request.
I thought they weren't needed, but after thinking about it, I can see a lot of reasons why they may be. I'll put them back.
It messes up the cursor on my end. Let's take for example the é character: https://www.fileformat.info/info/unicode/char/e9/index.htm. Its UTF-8 code is 0xC3 0xA9. With getch, you will read 2 characters: first the character 0xC3, then the character 0xA9. They will take 2 slots in the inputBuffer list. With the string "étude" for example, the inputBuffer list has a size of 6. And the cursor calculations use this size of 6, with the "é" string having a size of 2. If the cursor is right after the letter "d", pressing backspace will remove the "u" letter instead of the "d" letter. And every time you type a non-ascii character, it becomes worse. I really don't want to try with Japanese or Cyrillic... I didn't think this wall of text would end up being that big.
While the Python getch is documented to return -1 when no input is available in no-delay mode, the Python get_wch is documented to throw an exception instead (https://docs.python.org/3/library/curses.html#curses.window.getch). If I understand correctly, getch never returns -1 in blocking mode, and get_wch never returns -1 in any mode.
Yeah, I worked around that by creating a new string every time, but using a list of strings (I mean a list of character, but a character is a string of size 1) would probably be more efficient. I will change that.
I guessed they were needed to avoid eating 100% of the CPU by asking repetitively "do you have a key?" "And now?" "Still no new key?" "Gimme a new key plz", which is fixed by using blocking mode. |
|
Wide chars aren't required because multibyte characters will be caught each segment at a time. So if a character is 3 bytes long, There's another problem with wide chars and that is that they may not work with characters more than 2 bytes in length. Wide chars are UTF-16, so they store 2 bytes. I'm not sure they'd be stored correctly if captured using You're right that nodelay mode is no longer required because Discline-curses now uses threads for most things. I can remove that and clean up some code. |
Everything I found leads me to think this is wrong. I didn't find where the C lib curses converts its characters, but the API returns a wint_t, which is 4 bytes wide. Then, the Python wrapper uses PyUnicode_FromOrdinal. The parameter of this function is an int (4 bytes), and "must be in range(0x110000)" - which is the full Unicode range. That function returns a Python string, which uses UTF-32 (or ASCII when the string contains only ASCII characters, because that uses less memory). And I'm not talking about #2. Actually, my pull request has the #2 bug and I should fix it. But #2 is related to fonts, and asian characters being displayed 2 times bigger than other characters. The bug I tried to fix causes 2 issues:
If you really don't want to go for get_wch to fix this, you can get the width of an unicode character yourself and do the maths yourself. The array at https://en.wikipedia.org/wiki/UTF-8#Description describes how to do it. |
I'd happily merge your pull request once I decide if I want |