
pyperf may not always return accurate run info #72

Open
dvalinrh wants to merge 4 commits into main from time_unit

Conversation

@dvalinrh
Contributor

@dvalinrh dvalinrh commented Apr 23, 2026

There is an issue with the time units: pyperf may switch the reported unit from run to run of a subtest when a result sits near a unit boundary. Mixing units causes erroneous reporting and averaging. To remedy this, convert the time for each subtest pass to nanoseconds.

Description

Fixes the results so they are always in ns.

Before/After Comparison

Before: we could get a mix of units when summing up results; the last unit seen is what got reported.
After: results are always reported in ns.

Clerical Stuff

This closes #71

Relates to JIRA: RPOPC-947

Testing:
Ran the updated script and verified the output matches what we expect.

CSV portion:
Test,Avg,Unit,Start_Date,End_Date
2to3,247483333,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_generators,316516666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_none,393866666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_cpu_io_mixed,613350000,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_cpu_io_mixed_tg,618100000,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager,98886666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_cpu_io_mixed,365216666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_cpu_io_mixed_tg,521300000,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_io,1070666666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_io_tg,1036133333,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_memoization,234616666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_memoization_tg,378816666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_eager_tg,270533333,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_io,942900000,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z
async_tree_io_tg,954366666,ns,2026-04-23T14:45:49Z,2026-04-23T15:38:33Z

@qodo-code-review
Contributor


Review Summary by Qodo

Standardize time values to nanoseconds for accurate results

🐞 Bug fix ✨ Enhancement


Walkthroughs

Description
• Convert all time values to nanoseconds for consistent unit handling
• Eliminate unit conversion errors by standardizing on nanosecond output
• Skip "Try to rerun" messages to prevent data corruption
• Simplify PCP metric conversion by removing redundant unit conversions
Diagram

```mermaid
flowchart LR
  A["Parse test results<br/>with various units"] -- "Convert to<br/>nanoseconds" --> B["Accumulate values<br/>in nanoseconds"]
  B -- "Calculate average<br/>in nanoseconds" --> C["Output CSV<br/>with ns unit"]
  C -- "Convert to seconds<br/>for PCP" --> D["Send to PCP<br/>metric system"]
```


File Changes

1. pyperf/pyperf_run 🐞 Bug fix +17/-11

Standardize time unit handling to nanoseconds

• Convert all time values to nanoseconds at parse time using convert_val tool
• Update CSV output format to use nanoseconds as standard unit instead of variable units
• Add filter to skip "Try to rerun" messages that could corrupt data accumulation
• Remove redundant unit conversion logic for PCP metrics since values are already in nanoseconds
• Fix output formatting to use integer nanosecond values instead of floating point





@qodo-code-review
Contributor

qodo-code-review Bot commented Apr 23, 2026

Code Review by Qodo

🐞 Bugs (1) 📘 Rule violations (0) 📎 Requirement gaps (0)



Action required

1. Duplicate PCP publish 🐞 Bug ≡ Correctness
Description
In generate_csv_file(), the final metric is sent to PCP twice with identical arguments, which will
duplicate the same datapoint and skew downstream metric consumers.
Code

pyperf/pyperf_run[R245-248]

```diff
+			sec_results=$(echo "$results/1000000000" | bc -l)
+			result2pcp "$metric_name" "${sec_results}"
 		result2pcp "$metric_name" "${sec_results}"
 	fi
```
Evidence
The PCP-reporting block at the end of generate_csv_file() contains two identical consecutive calls
to result2pcp for the same metric_name/sec_results, meaning the same value will be emitted twice
whenever PCP is enabled.

pyperf/pyperf_run[239-248]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`generate_csv_file()` publishes the same PCP metric twice due to a duplicated `result2pcp` invocation.
### Issue Context
This happens in the PCP-enabled block after computing `sec_results` for the final test block (EOF path), resulting in double-emission of the same metric value.
### Fix Focus Areas
- pyperf/pyperf_run[239-248]
### Proposed fix
Remove one of the two identical `result2pcp "$metric_name" "${sec_results}"` lines so the metric is emitted exactly once.




Remediation recommended

2. Average precision truncated 🐞 Bug ≡ Correctness
Description
generate_csv_file() truncates the computed average to an integer by stripping the decimal portion
and prints it with %d, losing precision compared to the prior floating output and conflicting with
the schema’s Avg being a float.
Code

pyperf/pyperf_run[R209-210]

```diff
+					results=$(echo "${value_sum}/${res_count}" | bc -l | cut -d'.' -f1 )
+					printf "%s,%d,ns,%s,%s\n" $test_name $results $start_time $end_time >> ${1}.csv
```
Evidence
The average is computed with bc -l and then truncated via cut -d'.' -f1 and printed as an
integer (%d) in both the mid-file flush and the EOF flush paths. The schema models Avg as a
float, so dropping fractional precision is a behavioral regression in result fidelity (even if the
JSON parser still accepts integer-looking floats).

pyperf/pyperf_run[208-221]
pyperf/pyperf_run[239-248]
pyperf_schema.py[112-124]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The script currently truncates benchmark averages to integer nanoseconds (`cut -d'.' -f1`) and prints them with `%d`, which discards fractional precision.
### Issue Context
The repository schema models `Avg` as a float, and prior behavior emitted a floating average; truncation reduces result fidelity and can change downstream interpretation.
### Fix Focus Areas
- pyperf/pyperf_run[208-211]
- pyperf/pyperf_run[239-242]
- pyperf_schema.py[112-124]
### Proposed fix
- Remove the `| cut -d'.' -f1` truncation.
- Print `Avg` as a float (e.g., `%.2f` or `%s` with the raw `bc -l` output) while keeping `Unit` as `ns`.
- Ensure both flush paths (the `reduce==2` block and the EOF block) use the same non-truncating formatting.
(If integer nanoseconds are required, round instead of truncating and still emit via a safe string format like `%s`.)





@github-actions

This relates to RPOPC-947

Comment thread pyperf/pyperf_run
@dvalinrh dvalinrh requested a review from kdvalin April 23, 2026 22:03


Development

Successfully merging this pull request may close these issues.

Convert all time units to ns

1 participant