non-readline PyOS_StdioReadline is buggy with long input when used with PyOS_InputHook

# Bug report

To trigger this bug, all of the following must hold:
 - you are not using the `readline`-based stdin code path
 - you have imported a GUI toolkit that installs a `PyOS_InputHook`
 - you try to enter strings into `input` that are 99 characters or longer
 - stdin is in line-buffered mode
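A quick way to check some of these conditions from inside a running interpreter might look like the sketch below. The helper name `report_preconditions` is hypothetical, reading `PyOS_InputHook` through `ctypes` assumes the symbol is exported by the running CPython build, and `isatty()` is only a rough stand-in for "stdin is line buffered".

```python
import ctypes
import sys


def report_preconditions():
    """Collect the conditions that this report says must hold (hypothetical helper)."""
    try:
        # PyOS_InputHook is a C-level function pointer in CPython;
        # a NULL pointer reads back as None through ctypes.
        hook = ctypes.c_void_p.in_dll(ctypes.pythonapi, "PyOS_InputHook").value
        hook_installed = hook is not None
    except (ValueError, AttributeError):
        # Symbol not visible in this build, so the answer stays unknown.
        hook_installed = None

    return {
        "readline_loaded": "readline" in sys.modules,
        "input_hook_installed": hook_installed,
        # Assumption: a terminal stdin is used as a rough proxy for line buffering.
        "stdin_is_tty": sys.stdin is not None and sys.stdin.isatty(),
    }


if __name__ == "__main__":
    for name, value in report_preconditions().items():
        print(f"{name}: {value}")
```

If the report above is right, the bug needs `readline_loaded` to be false and `input_hook_installed` to be true.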

The source of the problem is in `my_fgets`

https://github.com/python/cpython/blob/385d8d2b6438c7825006df6106dab41a95332a6a/Parser/myreadline.c#L47-L58

which is called from the read loop at https://github.com/python/cpython/blob/385d8d2b6438c7825006df6106dab41a95332a6a/Parser/myreadline.c#L304-L322

We get the following sequence of events:

 - the user calls `input`
 - `my_fgets` calls the input hook, which blocks until stdin reports that it is ready to read
 - `fgets` reads up to the first 99 characters and `my_fgets` returns
 - if the input is longer than 99 characters (including the newline), the last character read is not the newline, so the calling loop calls `my_fgets` again
 - the input hook is called again, but because there is no new user input (just the remaining characters from the last pass) it blocks
 - if the user hits enter a second time, the input hook returns and `fgets` reads up to the original newline
 - the second newline is still in the stdin buffer and comes out immediately the next time `input` is called
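The sequence above can be sketched as a toy model in pure Python. This is not the CPython implementation. It assumes that the input hook only watches whether the file descriptor has unread data, while `fgets` drains a userspace buffer that already holds the rest of the line. The names `ToyStdin`, `read_line`, and `on_stall` are made up for illustration.

```python
from collections import deque


class ToyStdin:
    """Toy stdin: a kernel-side queue of typed lines plus a userspace buffer."""

    def __init__(self):
        self.kernel = deque()  # lines typed but not yet read from the fd
        self.stdio = ""        # characters already pulled into the userspace buffer

    def type_line(self, text):
        self.kernel.append(text + "\n")

    def fd_ready(self):
        # What a select()-style input hook would look at (assumption).
        return bool(self.kernel)

    def fgets(self, size):
        # Like C fgets: return at most size - 1 characters, stopping after a newline.
        if not self.stdio:
            self.stdio = self.kernel.popleft()
        limit = size - 1
        newline = self.stdio.find("\n")
        end = limit if newline == -1 or newline >= limit else newline + 1
        chunk, self.stdio = self.stdio[:end], self.stdio[end:]
        return chunk


def read_line(stdin, on_stall):
    """Model of the calling loop: keep reading chunks until one ends in a newline."""
    line = ""
    while True:
        incr = len(line) + 2 if line else 100
        if not stdin.fd_ready():
            on_stall(stdin)  # the hook would block here until more input arrives
        line += stdin.fgets(incr)
        if line.endswith("\n"):
            return line


stalls = []


def press_enter(stdin):
    # Simulate the user hitting enter a second time to unblock the hook.
    stalls.append("enter")
    stdin.type_line("")


toy = ToyStdin()
toy.type_line("x" * 104)
first = read_line(toy, press_enter)
second = read_line(toy, press_enter)
print(len(first), len(stalls), repr(second))
```

In this model the 104-character line needs one extra press of enter, and the following read returns just the leftover newline without stalling, which matches the behaviour the demo script below shows.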


Possible flaws in my understanding:

 - I am not clear why the next `input` call returns immediately rather than getting stuck in the input hook
 - entries whose length is a multiple of 100 do not require multiple extra presses of enter
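The growth of the read buffer may be relevant to the second point. The calling loop linked above requests 100 bytes on the first pass and `n + 2` bytes afterwards, where `n` is the number of characters read so far. As a rough sketch (the function name `chunk_sizes` is hypothetical, and it assumes every `fgets` call fills its buffer unless the newline comes first), the chunk boundaries for a given input length can be computed like this:

```python
def chunk_sizes(total_chars):
    """Split an input of total_chars characters (newline included) into fgets chunks.

    Mirrors `incr = (n > 0) ? n + 2 : 100` from the calling loop; fgets stores
    at most incr - 1 characters per call because of the trailing NUL.
    """
    sizes = []
    n = 0
    while n < total_chars:
        incr = n + 2 if n > 0 else 100
        take = min(incr - 1, total_chars - n)
        sizes.append(take)
        n += take
    return sizes


for length in (99, 100, 105, 300):
    print(length, chunk_sizes(length))
```

Under these assumptions, 98 letters plus the newline fit in a single chunk, while anything longer needs at least a second call to `my_fgets`, and with it a second call to the input hook.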


This script demonstrates the problem:

```python
import string
import sys
from tkinter import Tk
from tkinter import ttk


def run():
    """
    This sets up a minimal tk application that has enough functionality
    to verify it is "live" and the inputhook is running while waiting for
    user input.
    """
    root = Tk()
    frm = ttk.Frame(root, padding=10)
    frm.grid()
    lbl = ttk.Label(frm, text="push count = 0")
    lbl.grid(column=0, row=0)

    j = 0

    def set_label():
        nonlocal j
        j += 1
        lbl["text"] = f"push count = {j}"

    ttk.Button(frm, text="Push me!", command=set_label).grid(column=1, row=0)

    return root, frm


run()

print("This is a demo of a bug in the non-readline based stdio code\n\n")
print(f"The readline module is not loaded: {'readline' in sys.modules=}")

test_string = (string.ascii_lowercase + string.ascii_uppercase) * 2

print(
    f"""
We are using the test string :

\t{test_string}

as it is easy to eyeball the length (it is 104 characters long in 26
character blocks).

You should see a tk window with a button that says "Push me!" and a counter.
Pushing the button should increment the counter.

Follow the instructions to demonstrate the bug.


"""
)

print(
    f"""
To see a case where it works paste

{test_string[:10]}

into the prompt below.  Before hitting return, try pushing
the button on the UI to verify that the inputhook is running.
"""
)

a = input("paste here >> ")
print(f"You pasted {a}")


print(
    f"""
To see it fail paste the full string

{test_string}

into the prompt below (you will have to hit enter twice)
"""
)

a = input("paste here >> ")
print(f"You pasted {a}")

print("There is still a newline in the buffer, this input will be 'skipped'\n")
a = input("you can not input here >> ")
print(f"we got an empty string!: {a=!r} (also note no new line in stdout)")


print(
    f"""

you can now play with it or ctrl-d to exit.

The longest string that works is (98 letters + new line):

{test_string[:98]}

"""
)

while True:
    print("\n")
    a = input("test input >> ")
    print(f"what you entered: {a=}")
```

This needs to be run as `python demo.py`, not pasted into an interactive shell, because the code paths that rely on `readline` work correctly.  Running as `python -uu demo.py` also works correctly.

# Your environment



- CPython versions tested on: 3.10.10, 3.11.3, 3.9+
- Operating system and architecture: (arch) linux x86, OSX

This was originally reported via https://github.com/matplotlib/matplotlib/issues/25756 where you can see my notes as I sorted this out.

Based on the code paths I expect this to not be reproducible on Windows.

I think this bug goes back to at least 717c6f95bebcd4693781e25bae3f7f9900cece07 so I expect all currently supported versions of Python to be affected.

I will shortly open a PR with a proposed fix.


### Linked PRs
* gh-103931

The code behind the first link in the bug report (`Parser/myreadline.c#L47-L58`):

    while (1) {
        if (PyOS_InputHook != NULL) {
            (void)(PyOS_InputHook)();
        }

        errno = 0;
        clearerr(fp);
        char *p = fgets(buf, len, fp);
        if (p != NULL) {
            return 0; /* No error */
        }
        int err = errno;

The calling loop behind the second link (`Parser/myreadline.c#L304-L322`):

    do {
        size_t incr = (n > 0) ? n + 2 : 100;
        if (incr > INT_MAX) {
            PyMem_RawFree(p);
            PyEval_RestoreThread(tstate);
            PyErr_SetString(PyExc_OverflowError, "input line too long");
            PyEval_SaveThread();
            return NULL;
        }
        pr = (char *)PyMem_RawRealloc(p, n + incr);
        if (pr == NULL) {
            PyMem_RawFree(p);
            PyEval_RestoreThread(tstate);
            PyErr_NoMemory();
            PyEval_SaveThread();
            return NULL;
        }
        p = pr;
        int err = my_fgets(tstate, p + n, (int)incr, sys_stdin);
