test: Measure performance of writing Singer stream to stdout#3673
test: Measure performance of writing Singer stream to stdout#3673edgarrmondragon wants to merge 2 commits into
Conversation
Reviewer's GuideAdds pytest benchmarks that measure the throughput of writing Singer messages to stdout for different writer implementations and message mixes, including setup fixtures for schema and state messages and realistic stdout redirection to /dev/null so flush syscalls are exercised. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The manual stdout monkeypatching is repeated across benchmarks; consider extracting a small helper/context manager to redirect stdout to /dev/null so the setup/teardown is less error-prone and easier to reuse.
- In the MsgSpecWriter benchmark,
_DevNullStdoutonly exposesbufferandflush; if any future code callswriteon stdout this will break unexpectedly, so adding a no-opwritemethod (or usingcontextlib.redirect_stdoutaround a binary wrapper) would make the fake stdout more robust. - For
devnull_b, using awithstatement instead of manually opening and closing the file would simplify resource management and avoid potential leaks if the benchmark body is modified in the future.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The manual stdout monkeypatching is repeated across benchmarks; consider extracting a small helper/context manager to redirect stdout to /dev/null so the setup/teardown is less error-prone and easier to reuse.
- In the MsgSpecWriter benchmark, `_DevNullStdout` only exposes `buffer` and `flush`; if any future code calls `write` on stdout this will break unexpectedly, so adding a no-op `write` method (or using `contextlib.redirect_stdout` around a binary wrapper) would make the fake stdout more robust.
- For `devnull_b`, using a `with` statement instead of manually opening and closing the file would simplify resource management and avoid potential leaks if the benchmark body is modified in the future.
## Individual Comments
### Comment 1
<location path="tests/benchmarks/test_io.py" line_range="136-158" />
<code_context>
+ writer = MsgSpecWriter()
+ number_of_runs = 1000
+
+ devnull_b = open(os.devnull, "wb") # noqa: PTH123, SIM115
+
+ class _DevNullStdout:
+ buffer = devnull_b
+
+ def flush(self) -> None:
+ devnull_b.flush()
+
+ old_stdout = sys.stdout
+ sys.stdout = _DevNullStdout() # type: ignore[assignment]
+ try:
+
+ def run():
+ for record in itertools.repeat(bench_record_message, number_of_runs):
+ writer.write_message(record)
+
+ benchmark(run)
+ finally:
+ sys.stdout = old_stdout
+ devnull_b.close()
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Make the MsgSpecWriter benchmark stdout stub more faithful to real sys.stdout by adding a write method.
`_DevNullStdout` currently only exposes `buffer` and `flush()`. If `MsgSpecWriter` ever starts calling `sys.stdout.write(...)`, this benchmark will fail even though real `sys.stdout` supports that. Please add a `write(self, data: str) -> int` method (e.g. forwarding to `devnull_b.write(data.encode(...))` or a no-op that returns `len(data)`) so the stub more accurately matches `sys.stdout` and avoids brittle failures.
```suggestion
writer = MsgSpecWriter()
number_of_runs = 1000
devnull_b = open(os.devnull, "wb") # noqa: PTH123, SIM115
class _DevNullStdout:
buffer = devnull_b
def write(self, data: str) -> int:
"""Mimic TextIOBase.write, discarding data like /dev/null."""
return len(data)
def flush(self) -> None:
devnull_b.flush()
old_stdout = sys.stdout
sys.stdout = _DevNullStdout() # type: ignore[assignment]
try:
def run():
for record in itertools.repeat(bench_record_message, number_of_runs):
writer.write_message(record)
benchmark(run)
finally:
sys.stdout = old_stdout
devnull_b.close()
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3673 +/- ##
=======================================
Coverage 94.12% 94.12%
=======================================
Files 73 73
Lines 6200 6200
Branches 762 762
=======================================
Hits 5836 5836
Misses 270 270
Partials 94 94
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Merging this PR will not alter performance
Performance Changes
Comparing |
We currently measure the performance of _formatting_ a message for stdout, but we don't measure the cost of writing individual messages. Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
358e96f to
f2b5a3e
Compare
We currently measure the performance of formatting a message for stdout, but we don't measure the cost of writing individual messages.
Summary by Sourcery
Add performance benchmarks for writing Singer messages to stdout using different writers and message patterns.
Tests: