Print inline draws in falsifying example#4716

Open
DRMacIver wants to merge 19 commits into master from DRMacIver/deferred-pretty-printing

@DRMacIver
Member

If Hegel has taught us anything it's that inline draws are awesome, and that it's a shame that the UX for them in Hypothesis isn't better. This makes the UX better. You now get good printing of draws as part of the falsifying example, and you can use the DataObject in @example.

@given(data=st.data())
def test(data):
    x = data.draw(st.integers(), label="Something")
    ...

This will now print as:

Falsifying example: test(
    data=DataObject(draws=[
        # Something
        0,
    ]),
)

@DRMacIver DRMacIver requested a review from Liam-DeVoe April 24, 2026 14:04
@Zac-HD
Member

Zac-HD commented Apr 24, 2026

Nice! There's a similar trick I've been meaning to set up for st.functions(), representing them as a dict lookup for pure functions and something like lambda ..., __returns__=[...][::-1]: __returns__.pop() for impure functions. Just never got around to implementing it and seeing whether I think it's actually an improvement...
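
The two representations described in this comment can be sketched in plain Python. This is only an illustration of the idea, not something `st.functions()` prints today; the argument tuples and return values are made up, and `_pure_table` is a hypothetical name.

```python
# Pure function: the same arguments always give the same result, so a dict
# keyed on the argument tuple describes the function completely.
_pure_table = {(1,): "a", (2,): "b"}
pure = lambda *args: _pure_table[args]

# Impure function: successive calls return successive values, whatever the
# arguments are. Reversing the list lets pop() return them in original order.
impure = lambda *args, __returns__=[10, 20, 30][::-1]: __returns__.pop()

pure_results = [pure(1), pure(2), pure(1)]
impure_results = [impure("x"), impure("x"), impure("x")]
```

The keyword-only default in the impure form is evaluated once, when the lambda is created, so the list acts as per-function state.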

@DRMacIver
Member Author

> Nice! There's a similar trick I've been meaning to set up for st.functions(), representing them as a dict lookup for pure functions and something like lambda ..., __returns__=[...][::-1]: __returns__.pop() for impure functions. Just never got around to implementing it and seeing whether I think it's actually an improvement...

Yeah I think the deferred pretty-printer approach will work well for a lot of other similar cases. st.functions() and st.randoms(use_true_random=False) are both on my hit list.

Member

@Liam-DeVoe Liam-DeVoe left a comment

  • Can we split this into two PRs, one for the new pretty-printing logic and one for the new ability to specify interactive draws on @example? The release notes in this PR bury the lede, because the latter is substantially more impactful to the user API than the former
  • For the @example-PR: DataObject is now part of the public API, and we should update the docs accordingly

Comment on lines +569 to +571

def finalize(self) -> None:
    """Replay all outstanding deferreds created on this printer and
Member

I don't love this name, because python already uses the term finalize for GC: https://docs.python.org/3/library/weakref.html#weakref.finalize

Member Author

Yeah, good point. I'll rename it.

Comment on lines +373 to +377
"""
import functools
import inspect

from hypothesis import given
Member

move imports to top level

Member Author

Ugh I'm going to add a lint and format-refactor for this.

Comment on lines 50 to 55
     with raises(AssertionError) as err:
         test()
-    assert "Draw 1 (Some numbers): [0, 0]" in err.value.__notes__
-    assert "Draw 2 (A number): 0" in err.value.__notes__
+    notes = "\n".join(err.value.__notes__)
+    assert "# Some numbers" in notes
+    assert "# A number" in notes

Member

Can we change this to assert against a multiline regex? The new version loses the assertion on the minimal values.
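
One way such a multiline assertion could look, using only the standard `re` module. The `notes` text is an assumed shape based on the output format shown at the top of this PR, not captured test output, so the exact whitespace may differ.

```python
import re

# Assumed rendering of the joined notes; the real text may differ.
notes = """\
Falsifying example: test(
    data=DataObject(draws=[
        # Some numbers
        [0, 0],
        # A number
        0,
    ]),
)"""

pattern = re.compile(
    r"""
    ^\s*\#\ Some\ numbers\n   # label comment for the first draw
    \s*\[0,\ 0\],.*\n         # minimal list value, optional trailing comment
    \s*\#\ A\ number\n        # label comment for the second draw
    \s*0,.*$                  # minimal integer value
    """,
    re.MULTILINE | re.VERBOSE,
)
match = pattern.search(notes)
```

Because the labels and the shrunk values sit in one pattern, the check fails if either the label or the minimal value changes.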

DRMacIver and others added 19 commits April 27, 2026 12:12
deferred() returns a new printer whose output will be inserted at the
position deferred() was called once finalize() is invoked. Primitive
calls are recorded as concrete operations so the recording is unaffected
by later mutation of pretty-printed objects.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
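
A minimal sketch of the deferred-printer idea from this commit message, written as a toy and not as the code in Hypothesis's pretty-printer. `ToyPrinter` and `ToyDeferred` are hypothetical names, and line wrapping is ignored entirely.

```python
class ToyDeferred:
    """Records output fragments to be spliced into the parent later."""

    def __init__(self):
        self.recorded = []

    def pretty(self, obj):
        # Snapshot the repr now, so later mutation of obj is not visible.
        self.recorded.append(repr(obj))


class ToyPrinter:
    def __init__(self):
        self._parts = []  # str fragments and ToyDeferred placeholders

    def text(self, s):
        self._parts.append(s)

    def deferred(self):
        child = ToyDeferred()
        self._parts.append(child)  # remembers the insertion position
        return child

    def finalize(self):
        # Splice each recording in at the point where deferred() was called.
        flat = []
        for part in self._parts:
            flat.extend(part.recorded if isinstance(part, ToyDeferred) else [part])
        self._parts = flat

    def getvalue(self):
        # Only meaningful after finalize() in this toy.
        return "".join(self._parts)


printer = ToyPrinter()
printer.text("draws=[")
slot = printer.deferred()
printer.text("]")

values = [1, 2]
slot.pretty(values)
values.append(3)  # mutation after recording does not show up

printer.finalize()
output = printer.getvalue()
```
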
finalize() is now called on the parent printer (the one deferred() was
called on) rather than on the returned deferred printer. Any deferred
printer - nested deferreds included - raises on further use after its
parent is finalized, while new deferreds can still be created on the
parent afterwards.

Remove the pre-deferred buffer flush so that line-wrap decisions made
by the parent printer are not forced prematurely by the act of calling
deferred(), aligning replayed output with what the same sequence of
primitive calls would produce without deferral.

Add a stateful test that fuzzes printing programs through both a direct
printer and a printer driven via deferred()/finalize(), asserting their
outputs agree.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
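
The lifecycle described in this commit (finalize on the parent, deferreds unusable afterwards, new deferreds still allowed) can be illustrated with a toy. The class names and the choice of `RuntimeError` are assumptions for the sketch, and nested deferreds are left out.

```python
class ToyDeferred:
    def __init__(self):
        self.recorded = []
        self.dead = False

    def text(self, s):
        if self.dead:
            raise RuntimeError("deferred printer used after finalize()")
        self.recorded.append(s)


class ToyPrinter:
    def __init__(self):
        self._parts = []

    def deferred(self):
        child = ToyDeferred()
        self._parts.append(child)
        return child

    def finalize(self):
        flat = []
        for part in self._parts:
            if isinstance(part, ToyDeferred):
                part.dead = True  # any later use of this deferred raises
                flat.extend(part.recorded)
            else:
                flat.append(part)
        self._parts = flat

    def getvalue(self):
        return "".join(p for p in self._parts if isinstance(p, str))


parent = ToyPrinter()
first = parent.deferred()
first.text("a")
parent.finalize()

try:
    first.text("late")
    raised = False
except RuntimeError:
    raised = True

# The parent itself stays usable: a fresh deferred works after finalize().
second = parent.deferred()
second.text("b")
parent.finalize()
result = parent.getvalue()
```
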
Add a `draws=` parameter to DataObject (with the ConjectureData parameter
now optional, exactly one of the two required). In replay mode, each
draw() call reads the next value off the pre-recorded draws list.

DataObject._repr_pretty_ now prints "DataObject(draws=[", opens a
deferred printer on the parent, stores it as a class-level attribute
`DataObject.printer`, and closes with "])". Each subsequent draw() call
records a snapshot of the drawn value onto that deferred. The test
runner finalizes the parent printer after the test body returns so the
recorded draws are spliced into the reported output.

Result: when a test using `st.data()` fails, the falsifying example now
shows `DataObject(draws=[0, [0]])` with the actual values drawn,
rather than the opaque `data(...)` placeholder. Because draws are
pretty-printed at draw time, mutations made to drawn values after
data.draw() returns do not appear in the output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
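
A rough sketch of the replay mode described in this commit: exactly one of the two constructor arguments is given, and in replay mode each `draw()` returns the next pre-recorded value. `ToyDataObject` is a hypothetical stand-in, and running out of recorded draws simply raises `IndexError` here, which is an assumption rather than the PR's behaviour.

```python
class ToyDataObject:
    def __init__(self, data=None, *, draws=None):
        if (data is None) == (draws is None):
            raise TypeError("pass exactly one of `data` or `draws`")
        self._data = data
        self._draws = None if draws is None else list(draws)
        self._index = 0

    def draw(self, strategy, label=None):
        if self._draws is not None:
            # Replay mode: ignore the strategy and return the next value.
            value = self._draws[self._index]
            self._index += 1
            return value
        # Live mode: delegate to the underlying data source.
        return self._data.draw(strategy)


replayed = ToyDataObject(draws=[0, [0]])
first = replayed.draw(None, label="Something")
second = replayed.draw(None)
```
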
…notes

- Move the deferred printer from a class attribute to an instance
  attribute. This removes the shared-state hack and the corresponding
  reset in core.py.
- Render labeled draws with the label as a comment on the preceding line,
  e.g. ``# Cool thing`` above the value. Each draw is always emitted on
  its own line using `break_()` so the surrounding indentation (e.g. from
  ``repr_call``) is respected.
- Drop the per-draw ``Draw N: value`` notes, both the in-test ``note()``
  call and the observability tail that iterated
  ``data._observability_args``. The same information now appears inline in
  ``DataObject(draws=[...])``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
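
The label rendering described in the list above can be approximated with a small formatter. This sketch builds the text directly instead of going through a pretty-printer, and `render_draws` is a hypothetical helper, not a function from the PR.

```python
def render_draws(draws, indent=4):
    """Render (value, label) pairs, one draw per line, labels as comments."""
    pad = " " * indent
    lines = ["DataObject(draws=["]
    for value, label in draws:
        if label is not None:
            lines.append(f"{pad}# {label}")  # label goes on the preceding line
        lines.append(f"{pad}{value!r},")
    lines.append("])")
    return "\n".join(lines)


rendered = render_draws([([0, 0], "Some numbers"), (0, None)])
```
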
Pin the exact rendered form for several draw scenarios - empty, single
unlabeled, multiple unlabeled, labeled, all-labeled, mixed, nested value,
alongside other args, and the two-st.data()-args case - so that any
changes to indentation, comma placement, or label formatting have to be
made deliberately by regenerating the snapshot.

The two_data_args snapshot captures an existing quirk: because
DataStrategy.do_draw memoizes its DataObject on the underlying
ConjectureData, ``@given(st.data(), st.data())`` yields the same
DataObject instance for both args, so both draws end up attributed to
the second printer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
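
The quirk described in this commit follows from memoizing one object per underlying test case. A toy version, with hypothetical class names and a hypothetical `memoized_data_object` attribute:

```python
class ToyConjectureData:
    """Stand-in for the per-test-case data object."""

    def __init__(self):
        self.memoized_data_object = None  # hypothetical attribute name


class ToyDataObject:
    def __init__(self, conjecture_data):
        self.conjecture_data = conjecture_data
        self.draws = []

    def draw(self, value):
        self.draws.append(value)
        return value


class ToyDataStrategy:
    def do_draw(self, conjecture_data):
        # Create the DataObject once per test case, then keep returning it.
        if conjecture_data.memoized_data_object is None:
            conjecture_data.memoized_data_object = ToyDataObject(conjecture_data)
        return conjecture_data.memoized_data_object


case = ToyConjectureData()
d1 = ToyDataStrategy().do_draw(case)
d2 = ToyDataStrategy().do_draw(case)
d1.draw(1)
d2.draw(2)
```

Both arguments refer to the same object, so every draw lands in a single list no matter which name it was made through.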
``@snapshot_given(*strategies)`` now decorates the test body directly.
It builds the corresponding Hypothesis property test (always forcing a
failure) and returns a pytest test function taking the ``snapshot``
fixture that asserts the captured falsifying-example output equals the
snapshot value. The decorated function's own name is used for the test
name, so the "Falsifying example: <name>(" line matches the test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Snapshot tests go alongside the other snapshot-based tests under
``tests/snapshots/`` and use the shared ``SNAPSHOT_SETTINGS`` +
``run_test_for_falsifying_example`` helpers. The
``test_combinators.py::test_data_draw`` snapshot is also regenerated to
reflect the new ``DataObject(draws=[...])`` rendering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Each draw now registers its choice-node range on
  ``conjecture_data.arg_slices``, so the explain phase varies it and
  populates ``slice_comments`` for it.  When the deferred printer
  records the drawn value, we check the printer's ``slice_comments``
  for that range and, if present, emit the comment next to the value
  (matching the top-level ``repr_call`` annotation style, e.g.
  ``0,  # or any other generated value``).

- Add a re-entrancy guard (``_pretty_printing_draw``): when ``draw()``
  calls ``printer.pretty(result)`` on a value that happens to be the
  DataObject itself, the re-entrant ``_repr_pretty_`` sees the flag and
  emits ``DataObject(...)`` rather than trying to open another deferred
  (which previously produced garbled output).

- Switch the snapshot-test ``snapshot_given`` helper to
  ``EXPLAIN_SETTINGS`` so the new explain annotations are exercised in
  the snapshots; update all snapshots accordingly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``snapshot_given`` decorator is a general-purpose helper for any
snapshot test that wants to assert on the falsifying-example output of
a Hypothesis property test, so it belongs in
``tests/snapshots/conftest.py`` alongside ``SNAPSHOT_SETTINGS`` /
``EXPLAIN_SETTINGS`` rather than inline in ``test_data_object.py``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ls.py

``SNAPSHOT_SETTINGS``, ``EXPLAIN_SETTINGS`` and ``snapshot_given`` now
live alongside ``run_test_for_falsifying_example`` in
``tests/common/utils.py``, which is where the rest of the shared test
utilities already live. ``tests/snapshots/conftest.py`` no longer has
any content and is removed; the other ``tests/snapshots/test_*.py``
modules are updated to import from ``tests.common.utils``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
``_DeferredPrinter`` now copies its parent's ``stack`` at creation time.
That way, when a ``data.draw(st.just(data))`` later re-enters
``pretty(data)`` through the deferred, the pretty-printer's normal cycle
mechanism sees ``id(data)`` already in the inherited stack, sets
``cycle=True``, and ``DataObject._repr_pretty_`` can bail to
``DataObject(...)`` with no ad-hoc instance/class flags.

The mutual-recursion test is kept as a snapshot so the behaviour in
that case is pinned rather than hidden.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
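
A toy model of the stack-inheritance idea in this commit: the deferred printer copies the parent's stack of object ids when it is created, so a later self-referential draw is flagged as a cycle. All names are hypothetical, and the real printer's layout machinery is omitted.

```python
class ToyPrinter:
    def __init__(self, stack=()):
        self.stack = list(stack)  # copied, so parent pops do not affect us
        self.parts = []

    def text(self, s):
        self.parts.append(s)

    def deferred(self):
        child = ToyPrinter(self.stack)  # inherits ids currently being printed
        self.parts.append(child)
        return child

    def pretty(self, obj):
        cycle = id(obj) in self.stack
        self.stack.append(id(obj))
        try:
            if hasattr(obj, "_repr_pretty_"):
                obj._repr_pretty_(self, cycle)
            else:
                self.text(repr(obj))
        finally:
            self.stack.pop()

    def getvalue(self):
        return "".join(p if isinstance(p, str) else p.getvalue() for p in self.parts)


class ToyData:
    def __init__(self):
        self.printer = None
        self.count = 0

    def _repr_pretty_(self, p, cycle):
        if cycle:
            p.text("DataObject(...)")
            return
        p.text("DataObject(draws=[")
        self.printer = p.deferred()  # created while id(self) is on p.stack
        p.text("])")

    def draw(self, value):
        if self.count:
            self.printer.text(", ")
        self.count += 1
        self.printer.pretty(value)
        return value


top = ToyPrinter()
data = ToyData()
top.pretty(data)
data.draw(0)
data.draw(data)  # re-enters pretty(data) through the deferred
rendered = top.getvalue()
```
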
Stack inheritance alone catches self-reference (``data.draw(st.just(data))``)
- the inherited stack still contains ``id(data)`` when the draw later
pretty-prints it, so the pretty-printer sets ``cycle=True``.

Mutual recursion (``d1.draw(st.just(d2)); d2.draw(st.just(d1))`` with
two distinct DataObjects) isn't caught by stack inheritance, because by
the time ``d1.draw`` pretty-prints ``d2``, the outer ``pretty(d2)`` has
already popped its entry. The result was that ``d2``'s top-level
rendering got emptied out while its content was nested inside ``d1``.

``RepresentationPrinter`` now exposes a ``root`` attribute (itself for
top-level printers, inherited for deferreds), and
``DataObject._repr_pretty_`` treats "my live printer shares its root
with the caller ``p``" as a cycle too - so the second pretty of d2
bails to ``DataObject(...)`` and d2's top-level draws list is
preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wrap the deferred in ``p.group(4)`` so that ``break_()`` calls recorded
inside it emit at ``parent_indent + 4``, and drop the literal
``text("    ")`` spaces that previously did the indenting. That literal
was fixed-width and didn't compose when a DataObject was drawn as a
value inside another DataObject's draws list - e.g.
``test_data_from_data`` was rendering as

    d1=DataObject(draws=[
        DataObject(draws=[
        0,  # or any other generated value
    ]),
        1,
    ]),

with the inner ``0`` and ``])`` visually misaligned.  It now renders as

    d1=DataObject(draws=[
        DataObject(draws=[
            0,  # or any other generated value
        ]),
        1,
    ]),

Also drop the trailing ``break_()`` from each draw - it was producing a
double-newline against the closing ``break_()`` emitted before ``])``.
Closing break is kept so ``])`` sits on its own line at the outer
indent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
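
The difference between a fixed-width literal and indentation relative to the enclosing level can be shown with a recursive formatter. This sketch uses plain recursion in place of the printer's `group` mechanism, and `Nested` and `render` are hypothetical names; explain-phase comments are left out.

```python
class Nested:
    """Stand-in for a DataObject that was itself drawn as a value."""

    def __init__(self, draws):
        self.draws = draws


def render(obj, indent=0):
    """Return the lines for obj; lines after the first are indented from `indent`."""
    if not isinstance(obj, Nested):
        return [repr(obj)]
    inner = indent + 4  # one level deeper than the enclosing object
    lines = ["DataObject(draws=["]
    for value in obj.draws:
        child = render(value, inner)
        child[-1] += ","
        lines.append(" " * inner + child[0])
        lines.extend(child[1:])
    lines.append(" " * indent + "])")
    return lines


text = "\n".join(render(Nested([Nested([0]), 1])))
```

Because each level computes its padding from its parent's, the inner `0` and `])` line up however deeply the objects nest.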
Verify that ``@example(DataObject(draws=[...]))`` plus ``@given(st.data())``:

- Feeds the drawn values to the test body in order, ignoring the
  ``strategy`` argument (mixed types are fine).
- Accepts an empty draws list for tests that don't call ``data.draw``.
- Works with multiple ``@example`` decorators, each with their own
  draws.
- Renders the example's drawn values in the falsifying-example output
  ("Falsifying explicit example: ...") just like regular runs.
- Preserves labels: labelled draws show up as ``# label`` comments
  above the value in the rendering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- ``pretty._replay_calls``: add ``assert child._recording is not None``
  so mypy accepts it as ``list[...]`` rather than ``list[...] | None``.
- ``test_custom_reprs::test_reprs_as_created_interactive``: regenerate
  snapshot (was the pre-feature ``data=data(...)\nDraw 1: Bar(10)`` form).
- ``test_provider::test_realization_with_verbosity_draw`` /
  ``test_realization_with_observability``: update expected strings - the
  per-draw ``Draw N: value`` notes are no longer emitted (their info is
  inline in ``DataObject(draws=[...])``); the verbosity test now just
  asserts the symbolic marker is present.
- ``test_data_object_pretty`` falsifying-example regexes: with the
  ``explain`` phase on (the CI default for these tests), each draw's
  line has an inline ``# or any other generated value`` comment that
  broke the tight ``\s*,\s*`` patterns. Extract the ``draws=[...]``
  section first, then check the values appear in order.
- Re-run ``shed`` for the formatting nits the check-format job flagged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
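
The "extract the section, then check order" approach mentioned for `test_data_object_pretty` might look roughly like this. The sample output is an assumed shape, `values_in_order` is a hypothetical helper, and the substring search is deliberately loose.

```python
import re

# Assumed falsifying-example text with explain-phase comments on each draw.
output = """\
Falsifying example: test(
    data=DataObject(draws=[
        [0, 0],  # or any other generated value
        0,  # or any other generated value
    ]),
)"""


def values_in_order(text, expected):
    """Check that each expected repr appears, in order, inside draws=[...]."""
    section = re.search(r"draws=\[\n(.*?)\n\s*\]\)", text, re.DOTALL)
    if section is None:
        return False
    body = section.group(1)
    position = 0
    for value in expected:
        found = body.find(value, position)
        if found == -1:
            return False
        position = found + len(value)
    return True


in_order = values_in_order(output, ["[0, 0]", "0"])
```
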
The format and lint jobs on CI run a newer shed/ruff than my local
toolchain; they flag adjacent string-literal concatenations like
``"foo" "bar"`` (20 occurrences in the previous commit) and want them
merged into single strings. Merged the literals and dropped a spurious
blank line in ``test_pretty_deferred_stateful.py``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The ``if printer is not None and printer._dead:`` branch in ``draw()``
clears a stale deferred-printer reference left over from a previous
printing session (e.g. the printer was finalized externally between
the pretty-print and the first subsequent ``data.draw(...)`` call).
CI's coverage check flagged lines 2401-2402 as uncovered. Add a test
that exercises that exact sequence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@DRMacIver DRMacIver force-pushed the DRMacIver/deferred-pretty-printing branch from cbd2311 to 41c122f Compare April 27, 2026 12:12
@DRMacIver
Member Author

> • Can we split this into two PRs, one for the new pretty-printing logic and one for the new ability to specify interactive draws on @example? The release notes in this PR bury the lede, because the latter is substantially more impactful to the user API than the former

I can definitely improve the release notes to not bury the lede, but I think the answer is... not really, in any sensible way. Literally the only change the pretty-print logic is there to enable is that you can print st.data() draws like this, and it's sort of crazy to add that printing and have it not work.

@Liam-DeVoe
Member

Liam-DeVoe commented Apr 30, 2026

OK, let's definitely make the ability to use @example with interactive draws the headlining feature of this release, not the changing output format. And add some tests for this new capability of @example.


We'll need to figure out the right way to expose DataObject here. I don't want to expose it as-is in this PR, with the data parameter and with many undocumented attributes—count, conjecture_data, draws, etc. Thoughts:

  • make all of those underscore-private
  • split DataObject in two: publicly-constructed and privately-constructed
