Add Python help contract regression coverage#47
Draft
t-kalinowski wants to merge 10 commits intomainfrom
Draft
Conversation
Finding: - [P1] Gate the TestResult import on non-Windows — /Users/tomasz/.codex/worktrees/24fc/mcp-repl/tests/python_help_snapshots.rs:5-5 On Windows, every item that references `TestResult` in this file is behind `#[cfg(not(windows))]`, but this import is unconditional. The Windows CI clippy step uses `-D warnings`, so the new test target fails with an unused-import warning before any tests run; put this import under the same cfg or gate the whole file. Response: - Moved the `TestResult` import under the existing `#[cfg(not(windows))]` gate alongside `McpSnapshot`, and gated the new `Regex` import the same way. Finding: - [P2] Normalize the Python help banner before snapshotting — /Users/tomasz/.codex/worktrees/24fc/mcp-repl/tests/snapshots/python_help_snapshots__python_help_contract.snap:70-70 When this snapshot runs with any `python3` other than 3.13, `help()` emits that interpreter's minor version in both the welcome text and docs URL, while the test only checks `common::python_available()` and does not pin Python. This makes the new snapshot fail on non-3.13 environments; normalize or redact the versioned banner before asserting the snapshot. Response: - Normalized the Python help welcome line and docs URL before asserting both rendered and transcript snapshots, then updated the stored snapshots to use `<VERSION>`.
Finding: - [P2] Isolate Python help tests from pager environment — /Users/tomasz/.codex/worktrees/24fc/mcp-repl/tests/python_backend.rs:732-735 When the test process has `PAGER` or `MANPAGER` set to an interactive pager, e.g. `PAGER=less TERM=xterm`, this session inherits it; because the Python worker runs on a PTY, `pydoc` launches the pager for `help(len)` and the call times out as busy instead of returning inline text. The snapshot test uses the same inherited environment, so the new coverage is environment-dependent unless the test sanitizes the pager env or the runtime forces pydoc's plain in-band pager. Response: - Routed the Python help regression sessions through a test-only environment that clears `PAGER` and `MANPAGER` and sets `TERM=dumb`, which keeps pydoc on its plain in-band pager path. The snapshot normalization now also strips Python's overstrike bolding so snapshots stay stable across terminal settings.
Finding: [P2] Test help without disabling pydoc's pager When the server inherits a normal TTY environment, such as `TERM=xterm` with `PAGER`/`MANPAGER` set to `less` or with `less` available, stdlib `pydoc` will launch an external pager because the Python worker only sets `PYTHON_BASIC_REPL`. This helper forces `PAGER`/`MANPAGER` empty and `TERM=dumb` for the new help tests, which disables pydoc's pager selection, so the regression tests can pass while the documented public help contract still fails for those users. Response: Python worker startup now overrides `pydoc.pager` with the plain in-band pager and waits briefly for the request-start signal before returning from pydoc output, so the normal prompt path can complete the request. The Python help tests now run with `PAGER=less`, `MANPAGER=less`, and `TERM=xterm`, so they exercise the inherited interactive-pager environment instead of masking it.
Normalize current Codex CLI fields that are not part of the approval snapshots, and make the idle guardrail pager unit test assert against a full in-memory page so parallel test-worker output cannot push the notice off the first page. This keeps the required plain cargo test command passing without accepting environment-specific snapshot churn.
Finding: [P1] Call plainpager with one argument When the selected interpreter is Python 3.12 or older, `pydoc.plainpager` only accepts `text`; passing `title` raises `TypeError` for `help(len)` and `pydoc.help(len)`. Since the Python backend resolves whatever `python3`/venv Python is available, this breaks the new help contract on common supported installations; calling `pydoc.plainpager(text)` is compatible with both older Python and 3.13+. Response: The pydoc wrapper now calls `pydoc.plainpager(text)` with one argument. Local Python 3.13 accepts the one-argument call, and the focused Python help tests plus the full required check set pass.
Finding: round 5 review stderr reported `git diff --check` failures in `tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap`: tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:10: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:18: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:28: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:33: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:37: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:40: trailing whitespace. +<<< help> tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:46: trailing whitespace. +<<< tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:49: trailing whitespace. +<<< help> tests/snapshots/python_help_snapshots__python_help_contract@transcript.snap:54: trailing whitespace. +<<< Response: The Python help snapshot normalizer now trims trailing whitespace from rendered snapshot lines before assertion, and the transcript snapshot has been updated to remove prompt and blank-line trailing spaces while preserving the visible transcript content.
The plain cargo test check exposed two test-harness races: - timed_out_request_end_with_exited_worker_reports_session_end_immediately assumed a short sleep was enough for the test child to exit and IPC request_end to be observed. - pager_empty_input_preserves_idle_guardrail_notice could lose its guardrail notice when another parallel test reset the shared output ring mid-assertion. Response: make the timed-out request test wait for the test child and use a bounded IPC completion wait; make the pager guardrail test retry its isolated setup when the shared output ring is reset concurrently.
The Linux CI cargo test job failed in python_help_contract_snapshot because the rendered help banner and prompt-only entries vary across Python installations. Response: normalize the help() intro banner to a sentinel and remove incidental primary/continuation prompt-only entries from the rendered snapshot while keeping the direct behavior test responsible for prompt readiness and follow-up execution.
Linux CI showed Python help() can emit a blank transcript line before the help utility banner. Response: treat that blank line as part of the normalized help banner prologue so the snapshot remains focused on the help contract instead of interpreter banner formatting.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add public regression coverage for the native Python help contract.
The new tests cover
help(len),pydoc.help(len), and an interactivehelp()roundtrip through the public Python REPL surface. They assert that help output stays inline, does not expose pager prompts, returns to the Python prompt, and leaves the session usable for follow-up input.This also pins Python startup to
pydoc's plain in-band pager so inheritedPAGER,MANPAGER, or terminal settings cannot hand help output to an external pager. The PR adds a files-mode snapshot for the same help flow and moves the Python help contract plan from active to completed.Public changes
help()/pydoc.help()behavior.Internal changes
McpSnapshot.pydoc.pagertopydoc.plainpagerduring Python worker startup.cargo testcheck under parallel execution.Diff composition