PR Part 1: Add sanity test harness and new plotter/dashboard module #4
Conversation
📝 Walkthrough
Adds a new plotting module and extensive sanity tests, integrates optional per-iteration dashboard plotting and timing into training, forwards dashboard flags through the iteration runner, and allows a test-time override for the external price start index in environment resets.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Trainer as Training Loop
participant Env as Environment
participant Plot as Plotter
participant FS as File System
Trainer->>Env: reset(seed, options)
Env-->>Trainer: observation, info
loop per step
Trainer->>Env: step(action)
Env-->>Trainer: observation, reward, done, info
Trainer->>Trainer: accumulate per-step metrics → episode_costs
end
Trainer->>Plot: plot_dashboard(env, num_hours, max_nodes, episode_costs, ...)
Plot->>Env: read metrics (prices, nodes, queue, rewards)
Plot->>Plot: compute series & cumulative savings
Plot->>FS: save figure files
Plot-->>Trainer: return summary metrics
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
train_iter.py (1)
84-108: Critical: `plot_dashboard` and `dashboard_hours` are undefined in the `run()` function.
The function references `plot_dashboard` and `dashboard_hours` at lines 100-101, but these are neither function parameters nor global variables. Additionally, line 164 passes `plot_dashboard` and `dashboard_hours` as keyword arguments, but the function signature doesn't accept them, so this will raise a `TypeError`.
🐛 Proposed fix: Add missing parameters to the function signature
-def run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, iter_limit_per_step, session, prices, job_durations, jobs, hourly_jobs):
+def run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, iter_limit_per_step, session, prices, job_durations, jobs, hourly_jobs, plot_dashboard=False, dashboard_hours=24*14):
     python_executable = sys.executable
     command = [
         python_executable,
         "train.py",
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 152-161: The conditions for showing reward panels are inverted:
change the checks that use `if not getattr(env, "plot_eff_reward", False)` (and
similarly `plot_price_reward`, `plot_idle_penalty`, `plot_job_age_penalty`,
`plot_total_reward`) so they evaluate the flag directly (remove the leading `not`)
before calling `add_panel`; i.e., use `getattr(env, "plot_eff_reward", False)` etc.
This will make the `add_panel` calls occur only when the corresponding
`env.plot_*` flag is True. A minimal sketch of this gating appears after these prompts.
In `@test/test_sanity_env.py`:
- Line 164: Replace the inconsistent CLI option name in p.add_argument by using
"--drop-weight" instead of "--drop_weight", and confirm any places that read the
parsed value (e.g., make_env_from_args) use args.drop_weight (argparse converts
hyphens to underscores), updating references if they currently use a different
name.
- Around line 1-10: Move the shebang line (#!/usr/bin/env python3) to the very
first line of the file so it appears before the module-level
docstring/triple-quoted string; update the top of test/test_sanity_env.py so the
shebang is line 1 and the existing docstring immediately follows.
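Returning to the `src/plotter.py` prompt above, here is a minimal, self-contained sketch of the corrected flag gating. The `add_panel` stub, the panel titles, and the loop structure are illustrative stand-ins; only the `plot_*` flag names come from the prompt.

```python
# Panels should be added only when the corresponding env.plot_* flag is True,
# i.e. test the flag directly instead of negating it.
from types import SimpleNamespace

def add_panel(title):  # stand-in for the real add_panel helper in src/plotter.py
    print(f"adding panel: {title}")

env = SimpleNamespace(plot_eff_reward=True, plot_price_reward=False,
                      plot_idle_penalty=False, plot_job_age_penalty=False,
                      plot_total_reward=True)

for flag, title in [("plot_eff_reward", "Efficiency reward"),
                    ("plot_price_reward", "Price reward"),
                    ("plot_idle_penalty", "Idle penalty"),
                    ("plot_job_age_penalty", "Job age penalty"),
                    ("plot_total_reward", "Total reward")]:
    if getattr(env, flag, False):      # was: if not getattr(env, flag, False)
        add_panel(title)               # now fires only for flags that are True
```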
🧹 Nitpick comments (7)
test/test_sanity_env.py (3)
126-132: Determinism test fingerprint relies on info keys not returned by the environment.
The `rollout()` function records values from the `info` dict (`step_cost`, `num_unprocessed_jobs`, `num_on_nodes`, `dropped_this_episode`), but `ComputeClusterEnv.step()` returns an empty info dict `{}`. The test will pass (both rollouts get the same default values), but the fingerprint won't actually verify these metrics. Consider either:
- Enhancing `ComputeClusterEnv.step()` to populate the info dict with these values, or
- Removing these from the fingerprint since they don't add verification value currently.
A minimal sketch of the first option follows.
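A minimal, self-contained illustration of the first option; every name and value below is an illustrative stand-in, not the real `ComputeClusterEnv` internals:

```python
# A step() that returns the metrics the sanity test fingerprints, so that
# info.get("step_cost"), info.get("num_unprocessed_jobs"), etc. carry real data.
class TinyEnv:
    def __init__(self):
        self.queue = ["job-a", "job-b"]     # pending jobs (illustrative)
        self.nodes_on = 3                   # powered-on nodes (illustrative)
        self.dropped_this_episode = 0

    def step(self, action):
        reward = -1.0                       # placeholder reward
        info = {
            "step_cost": 0.42,              # placeholder per-step cost
            "num_unprocessed_jobs": len(self.queue),
            "num_on_nodes": self.nodes_on,
            "dropped_this_episode": self.dropped_this_episode,
        }
        return None, reward, False, False, info

_obs, r, term, trunc, info = TinyEnv().step(action=0)
print(r, info["step_cost"], info["num_unprocessed_jobs"])  # -1.0 0.42 2
```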
54-70: Consider using imported constants instead of hardcoded values.
The file imports `MAX_JOB_DURATION`, `CORES_PER_NODE`, etc. at lines 21-26, but `check_invariants` uses hardcoded values (170, 96, 168, 16). The comment at line 82 mentions avoiding imports, but they're already imported.
♻️ Suggested improvement
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 ...
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds..."
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds..."
203-207: Fix indentation and remove commented code.
Line 204 has inconsistent indentation (extra space), and line 231 has commented-out code.
🔧 Proposed fix
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
     # To be safe: normalize "" -> None here.
src/plotter.py (1)
289-291: Rename ambiguous variable `l` to `line`.
Single-letter `l` is easily confused with `1` or `I`.
🔧 Proposed fix
 # Combine legends
 lines = line1 + line1b + line2 + line2b
-labels = [l.get_label() for l in lines]
+labels = [line.get_label() for line in lines]
 ax1.legend(lines, labels, loc="center right", fontsize=9)
train.py (3)
156-157: Remove unnecessary f-string prefix.
The f-string at line 156 has no placeholders.
🔧 Proposed fix
- print(f"Starting a new model training...") + print("Starting a new model training...")
315-315: Convert suffix to string for consistency.
`suffix` is passed as an integer here but as a string at line 274. While Python's f-string formatting handles this, it's cleaner to be explicit.
🔧 Proposed fix
- suffix=STEPS_PER_ITERATION * iters,
+ suffix=f"train_{STEPS_PER_ITERATION * iters}",
288-289: Progress message is confusing.
The message mixes iteration count and step count in a confusing way. `iters + 1` is the iteration number, but the message implies it's a step count.
🔧 Proposed fix
 if (iters+1)%10==0:
-    print(f"Running... at {iters + 1} of {STEPS_PER_ITERATION * (iters + 1)} steps")
+    print(f"Running iteration {iters + 1}, total steps so far: {STEPS_PER_ITERATION * iters}")
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/environment.py
src/plotter.py
test/test_sanity_env.py
train.py
train_iter.py
🧰 Additional context used
📓 Path-based instructions (2)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
src/plotter.py
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train_iter.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Applied to files:
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
src/environment.py
🧬 Code graph analysis (2)
train_iter.py (1)
src/plotter.py (1)
plot_dashboard(81-222)
train.py (1)
src/plotter.py (2)
plot_dashboard (81-222)
plot_cumulative_savings (225-316)
🪛 Ruff (0.14.11)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
290-290: Ambiguous variable name: l
(E741)
train_iter.py
100-100: Undefined name plot_dashboard
(F821)
101-101: Undefined name dashboard_hours
(F821)
test/test_sanity_env.py
10-10: Shebang should be at the beginning of the file
(EXE005)
124-124: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
241-241: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
241-241: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
297-297: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
300-300: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
339-339: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
339-339: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
train.py
156-156: f-string without any placeholders
Remove extraneous f prefix
(F541)
276-276: Do not catch blind exception: Exception
(BLE001)
294-294: f-string without any placeholders
Remove extraneous f prefix
(F541)
317-317: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (10)
train_iter.py (1)
138-140: LGTM on CLI arguments.
The new `--plot-dashboard` and `--dashboard-hours` arguments are properly defined and will forward correctly to `train.py` once the `run()` function signature is fixed.
src/environment.py (2)
19-21: LGTM on import additions.
Importing `MIN_CORES_PER_JOB` and `MIN_NODES_PER_JOB` from `src.config` makes these constants available for environment configuration and aligns with the learnings specifying environment constraints. Based on learnings, these should support job constraints.
222-224: LGTM on price_start_index override.
The optional `price_start_index` override via `options` is a clean approach for testing determinism. The modulo operation ensures the index stays within bounds.
test/test_sanity_env.py (1)
277-359: LGTM on main test harness structure.
The test harness provides good coverage:
- Basic determinism comparison (lines 290-315)
- Optional Gymnasium API compliance check
- Invariant validation loop
- Optional extended determinism test
The unused variable warnings from static analysis are acceptable for test code readability.
src/plotter.py (2)
8-16: LGTM on `_as_series` helper.
Clean utility function that handles None inputs gracefully and pads shorter arrays with NaN values.
19-78: LGTM on `_compute_cumulative_savings`.
The function correctly computes cumulative savings against two baselines with proper handling of empty inputs and array length alignment.
train.py (4)
48-52: LGTM on default weights.
The default weights align with coding guidelines: efficiency=0.7, price=0.2, idle=0.1, job-age=0.0.
276-277: Broad exception handling is acceptable here.
Catching `Exception` for non-fatal plotting operations is reasonable; it prevents visualization failures from interrupting training/evaluation. The error is logged for debugging.
214-251: LGTM on evaluation output structure.
The evaluation output aligns well with coding guidelines, including:
- Per-episode metrics (cost, savings, completion rate, wait time, queue size)
- Cumulative analysis (total savings, monthly reduction, annual projections)
- Job processing metrics for both agent and baseline
152-159: Verify PPO hyperparameter choices for training consistency.
`n_steps=64` is significantly smaller than the stable-baselines3 PPO default (2048). With `STEPS_PER_ITERATION` set to 100,000 steps per checkpoint, this results in approximately 1,562 policy updates per checkpoint cycle instead of the typical ~49 updates. While this enables more frequent gradient updates, it can increase training variance. Confirm this aggressive update frequency aligns with your episode structure and training objectives.
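A quick back-of-the-envelope check of those update counts, assuming a single environment so that each rollout collects exactly `n_steps` transitions:

```python
# Rollout/update cycles per 100K-step checkpoint for the two n_steps values.
STEPS_PER_ITERATION = 100_000

for n_steps in (64, 2048):  # PR value vs. stable-baselines3 PPO default
    cycles = STEPS_PER_ITERATION / n_steps
    print(f"n_steps={n_steps}: ~{cycles:.0f} rollout/update cycles per checkpoint")
# n_steps=64:   ~1562 cycles per checkpoint
# n_steps=2048: ~49 cycles per checkpoint
```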
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@train_iter.py`:
- Around line 119-122: The subprocess.run call stores its result in result but
didn't request captured output, so result.stdout will always be None; update the
subprocess.run(...) invocation in train_iter.py (the call that assigns to
result) to include capture_output=True (or stdout=subprocess.PIPE) if you intend
to return the subprocess output, and then return result.stdout, otherwise remove
the misleading return result.stdout (or explicitly return None) and adjust
callers accordingly.
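A minimal sketch of the capture variant mentioned in that prompt; the command below is a stand-in for the real `train.py` invocation:

```python
import subprocess
import sys

# With capture_output=True, result.stdout/result.stderr hold the child's output;
# without it they are always None, so returning result.stdout is misleading.
command = [sys.executable, "--version"]  # stand-in for the train.py command
result = subprocess.run(command, text=True, capture_output=True)
if result.returncode != 0:
    print("Error occurred: the child process returned a non-zero exit code.")
    print(result.stderr)
print(result.stdout.strip())
```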
🧹 Nitpick comments (4)
test/test_sanity_env.py (2)
47-100: Use imported constants instead of hardcoded magic numbers.
The file imports `MAX_JOB_DURATION`, `MIN_NODES_PER_JOB`, `MAX_NODES_PER_JOB`, `MIN_CORES_PER_JOB`, and `CORES_PER_NODE` but then uses hardcoded values (170, 96, 16, 168) in the invariant checks. This creates maintenance risk if constants change.
♻️ Suggested refactor to use imported constants
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 ...
 # ---- cores_available invariants (from env, not obs) ----
 ca = env.cores_available
 assert ca.shape == nodes.shape
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds min={ca.min()} max={ca.max()}"
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds min={ca.min()} max={ca.max()}"
 ...
-assert np.all(ca[idle] == 96), "idle nodes must have full cores_available"
+assert np.all(ca[idle] == CORES_PER_NODE), "idle nodes must have full cores_available"
 ...
-assert np.all((dur >= 0) & (dur <= 170)), f"duration out of bounds min={dur.min()} max={dur.max()}"
+assert np.all((dur >= 0) & (dur <= MAX_JOB_DURATION)), f"duration out of bounds min={dur.min()} max={dur.max()}"
 ...
 if np.any(active):
-    assert np.all((nn[active] >= 1) & (nn[active] <= 16)), f"job nnodes out of bounds nn={nn[active]}"
-    assert np.all((cpn[active] >= 1) & (cpn[active] <= 96)), f"cores_per_node out of bounds cpn={cpn[active]}"
+    assert np.all((nn[active] >= MIN_NODES_PER_JOB) & (nn[active] <= MAX_NODES_PER_JOB)), f"job nnodes out of bounds nn={nn[active]}"
+    assert np.all((cpn[active] >= MIN_CORES_PER_JOB) & (cpn[active] <= CORES_PER_NODE)), f"cores_per_node out of bounds cpn={cpn[active]}"
You'll also need to import `MAX_JOB_AGE` for the age bound check at line 85.
205-208: Fix inconsistent indentation.
Line 205 has 5 spaces of indentation instead of 4, which is inconsistent with the rest of the codebase and could cause issues with strict linters.
🔧 Proposed fix
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
-     # To be safe: normalize "" -> None here.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # To be safe: normalize "" -> None here.
src/plotter.py (1)
288-291: Rename ambiguous variable `l` to `line`.
The variable name `l` is flagged as ambiguous (easily confused with `1` or `I`). Consider renaming for clarity.
🔧 Proposed fix
 # Combine legends
 lines = line1 + line1b + line2 + line2b
-labels = [l.get_label() for l in lines]
+labels = [line.get_label() for line in lines]
 ax1.legend(lines, labels, loc="center right", fontsize=9)
train_iter.py (1)
178-178: Consider breaking up long function call for readability.
The line is quite long. While functional, breaking it across multiple lines would improve readability.
♻️ Suggested format
- run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, args.iter_limit_per_step, args.session, args.prices, args.job_durations, args.jobs, args.hourly_jobs,plot_dashboard=args.plot_dashboard,dashboard_hours=args.dashboard_hours)
+ run(
+     efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight,
+     args.iter_limit_per_step, args.session,
+     args.prices, args.job_durations, args.jobs, args.hourly_jobs,
+     plot_dashboard=args.plot_dashboard,
+     dashboard_hours=args.dashboard_hours,
+ )
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/plotter.py
test/test_sanity_env.py
train_iter.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
test/test_sanity_env.py
train_iter.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
src/plotter.py
🧬 Code graph analysis (2)
test/test_sanity_env.py (4)
src/environment.py (3)
reset (202-256)
step (258-456)
render (51-52)
src/weights.py (1)
Weights (5-16)
src/workloadgen.py (2)
WorkloadGenerator (56-110)
WorkloadGenConfig (35-53)
test/test_determinism_workloadgen.py (1)
check_determinism(8-19)
train_iter.py (1)
src/plotter.py (1)
plot_dashboard(81-222)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
290-290: Ambiguous variable name: l
(E741)
train_iter.py
119-119: subprocess call: check for execution of untrusted input
(S603)
🔇 Additional comments (16)
test/test_sanity_env.py (8)
1-10: LGTM!
The shebang is now correctly placed at line 1, and the docstrings provide clear usage examples.
12-27: LGTM!
Imports are well-organized with standard library, third-party, and local imports properly grouped.
30-36: LGTM!
The function is straightforward for its test utility purpose.
102-108: LGTM!
Good defensive check to ensure observations are copies and don't alias internal environment state.
110-141: LGTM with minor lint suggestion.
The determinism test logic is solid. The unused `obs` variable at line 125 could be prefixed with `_` to satisfy linters.
🔧 Optional lint fix
- obs, r, term, trunc, info = env.step(actions[i])
+ _obs, r, term, trunc, info = env.step(actions[i])
147-177: LGTM!
The CLI argument naming is now consistent with hyphens throughout, and the argument structure mirrors train.py appropriately.
236-274: LGTM with minor suggestion.
The debug printing utility is well-structured. Consider prefixing unused unpacked variables with `_` at line 242.
🔧 Optional lint fix
- nodes, q, prices = _extract(obs)
+ _nodes, q, _prices = _extract(obs)
278-361: LGTM overall. Consider uncommenting or removing check_obs_is_copy.
The main function structure is sound with good test coverage. Line 334 has `check_obs_is_copy` commented out - if this check is no longer needed, consider removing it entirely; if it's temporarily disabled for debugging, add a comment explaining why.
The unused variables from tuple unpacking (lines 298-302, 340) could be prefixed with `_` to satisfy linters, though this is optional for test code.
src/plotter.py (6)
1-16: LGTM!
Imports are appropriate and `_as_series` is a clean utility for normalizing array data with NaN padding.
19-78: LGTM!
The cumulative savings computation is well-structured with proper handling of edge cases. The 2-week episode assumption and 4.33 weeks/month approximation are clearly documented.
81-86: Acknowledged: unused parameter for backwards compatibility.
The `episode_costs` parameter is documented as kept for backwards compatibility. Consider adding a `# noqa: ARG001` comment or using `_episode_costs` to suppress linter warnings more explicitly.
131-161: LGTM!
The panel display logic correctly implements both patterns:
- `skip_plot_*` flags: `if not getattr(...)` (skip=False → show panel)
- `plot_*` flags: `if getattr(...)` (plot=True → show panel)
This resolves the previously flagged inverted logic issue.
163-222: LGTM!
The dynamic grid layout, panel rendering, and figure save/show/close handling are well-implemented.
225-316: LGTM overall!
The cumulative savings visualization with dual y-axes, summary box, and combined legends is well-implemented. The return dictionary provides useful metrics for downstream analysis.
train_iter.py (2)
84-98: LGTM!
The function signature extension cleanly adds dashboard forwarding with sensible defaults.
152-154: LGTM!
The new CLI arguments are properly defined with appropriate types and defaults, matching the `run()` function signature.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
result = subprocess.run(command, text=True, env=current_env)
if result.returncode != 0:
    print(f"Error occurred: {result.stderr}")
    print("Error occurred: train.py returned a non-zero exit code.")
return result.stdout
🧩 Analysis chain
🏁 Script executed:
cat -n train_iter.py | sed -n '110,130p'
Repository: FairRootGroup/powersched
Length of output: 937
🏁 Script executed:
# Find where run() is called and its return value is used
rg -n "\.run\(" train_iter.py -A 5 -B 2Repository: FairRootGroup/powersched
Length of output: 423
🏁 Script executed:
# Search for assignments or usages of the run() function call results
rg -n "result\s*=" train_iter.pyRepository: FairRootGroup/powersched
Length of output: 136
🏁 Script executed:
# Check if run() return value is used in function calls
rg -n "run\(" train_iter.pyRepository: FairRootGroup/powersched
Length of output: 413
🏁 Script executed:
# Search for all calls to the run() function across the codebase
rg -n "run\(" . --type py -B 2 -A 2 | grep -E "(run\(|=.*run\()" | head -30Repository: FairRootGroup/powersched
Length of output: 461
🏁 Script executed:
# More specifically, search for assignments where run() is called
rg -n "=\s*run\(" . --type pyRepository: FairRootGroup/powersched
Length of output: 50
result.stdout is always None when not capturing output.
Since capture_output was not set in the subprocess.run() call at line 119, result.stdout will always be None. The return statement at line 122 is misleading since the return value is not used by callers. Consider changing to return None or removing the return statement entirely.
🧰 Tools
🪛 Ruff (0.14.11)
119-119: subprocess call: check for execution of untrusted input
(S603)
🤖 Prompt for AI Agents
In `@train_iter.py` around lines 119 - 122, The subprocess.run call stores its
result in result but didn't request captured output, so result.stdout will
always be None; update the subprocess.run(...) invocation in train_iter.py (the
call that assigns to result) to include capture_output=True (or
stdout=subprocess.PIPE) if you intend to return the subprocess output, and then
return result.stdout, otherwise remove the misleading return result.stdout (or
explicitly return None) and adjust callers accordingly.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/test_sanity_env.py`:
- Around line 125-133: The test is relying on info keys that the environment
currently doesn't populate; update the environment's step() implementation (the
function named step in environment.py) to include the keys "step_cost",
"num_unprocessed_jobs", "num_on_nodes", and "dropped_this_episode" in the
returned info dict with sensible values (defaults when not applicable) so
info.get(...) in the test yields real data, and in the test (where obs is
unused) rename obs to _obs to satisfy static analysis; alternatively, if you
prefer changing the test instead of the env, simplify the fingerprint to only
use float(r) and drop all info.get(...) calls and rename obs to _obs.
♻️ Duplicate comments (1)
test/test_sanity_env.py (1)
17-17: Import `Weights` from its source module.
`Weights` is defined in `src/weights.py`, not `src/environment`. Import it directly:
-from src.environment import ComputeClusterEnv, Weights
+from src.environment import ComputeClusterEnv
+from src.weights import Weights
Based on past review comments indicating EPISODE_HOURS and Weights should come from their corresponding source files.
🧹 Nitpick comments (6)
test/test_sanity_env.py (6)
55-91: Use imported constants instead of hardcoded magic numbers.
The comment at line 83 says "avoid importing env constants," but you're already importing `MAX_JOB_DURATION`, `CORES_PER_NODE`, etc. on lines 22-27. These hardcoded values (170, 96, 168, 16) could drift out of sync with actual config.
♻️ Suggested refactor
+from src.config import (
+    MAX_JOB_DURATION, MAX_JOB_AGE, MAX_QUEUE_SIZE,
+    MIN_NODES_PER_JOB, MAX_NODES_PER_JOB,
+    MIN_CORES_PER_JOB, CORES_PER_NODE, EPISODE_HOURS
+)
+
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 # ---- cores_available invariants (from env, not obs) ----
 ca = env.cores_available
 assert ca.shape == nodes.shape
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds min={ca.min()} max={ca.max()}"
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds"
-assert np.all(ca[idle] == 96), "idle nodes must have full cores_available"
+assert np.all(ca[idle] == CORES_PER_NODE), "idle nodes must have full cores_available"
232-232: Remove commented-out code or add TODO.
- # plot_total_reward=False,
If this parameter is planned for future use, add a TODO comment; otherwise remove it.
242-242: Prefix unused variables with underscore.
Per static analysis, `nodes` and `prices` are unpacked but not used:
- nodes, q, prices = _extract(obs)
+ _nodes, q, _prices = _extract(obs)
291-316: Consider consolidating or documenting the inline determinism check.
This section (lines 291-316) performs a quick determinism comparison that partially overlaps with `determinism_test()`. Additionally:
- Line 293 uses hardcoded `seed = 123` instead of `args.seed`
- Multiple unused variables (`o1`, `o2`, `t1`, `t2`, `tr1`, `tr2`) per static analysis
If this is intentional debug code, consider gating it behind a flag or documenting its purpose. Otherwise, consolidate with the `--check-determinism` path:
- seed = 123
+ seed = args.seed
For unused variables:
- o1, _ = env.reset(seed=seed)
- o1s, r1, t1, tr1, i1 = env.step(action)
+ _, _ = env.reset(seed=seed)
+ o1s, r1, _t1, _tr1, i1 = env.step(action)
334-334: Enable or remove the commented-out copy check.
`check_obs_is_copy` is defined but never called. Either enable it to improve test coverage or remove it with explanation:
- #check_obs_is_copy(env, obs)
+ check_obs_is_copy(env, obs)
340-340: Prefix unused loop variables.
Per static analysis, `r` and `info` are unused in the loop:
- obs, r, term, trunc, info = env.step(action)
+ obs, _r, term, trunc, _info = env.step(action)
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/environment.py
test/test_sanity_env.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
test/test_sanity_env.py
🧬 Code graph analysis (1)
test/test_sanity_env.py (3)
src/environment.py (4)
ComputeClusterEnv (46-456)
reset (202-256)
step (258-456)
render (51-52)
src/weights.py (1)
Weights (5-16)
src/workloadgen.py (2)
WorkloadGenerator (56-110)
WorkloadGenConfig (35-53)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🔇 Additional comments (3)
src/environment.py (1)
222-223: LGTM! Test-time override for price start index.
The logic is correct: using modulo ensures the index stays within bounds. Minor nit: "Purposes" should be lowercase "purposes" for consistency.
test/test_sanity_env.py (2)
30-36: LGTM!
Clean price-loading helper with appropriate fallback for missing path.
102-107: LGTM!
Good defensive check for observation buffer aliasing. Consider enabling this check in `main()` (currently commented out at line 334).
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
obs, r, term, trunc, info = env.step(actions[i])
# record a small fingerprint
traj.append((
    float(r),
    float(info.get("step_cost", 0.0)),
    int(info.get("num_unprocessed_jobs", -1)),
    int(info.get("num_on_nodes", -1)),
    int(info.get("dropped_this_episode", -1)),
))
Info dict keys are not populated by the environment.
The environment's step() method returns an empty info dict {} (line 456 in environment.py), so info.get("step_cost", 0.0), info.get("num_unprocessed_jobs", -1), etc. will always return default values. This makes the fingerprint comparison less meaningful—only float(r) varies.
Either populate these keys in the environment's step() return, or simplify the fingerprint:
traj.append((
float(r),
- float(info.get("step_cost", 0.0)),
- int(info.get("num_unprocessed_jobs", -1)),
- int(info.get("num_on_nodes", -1)),
- int(info.get("dropped_this_episode", -1)),
))Also, prefix unused obs with underscore per static analysis:
-obs, r, term, trunc, info = env.step(actions[i])
+_obs, r, term, trunc, info = env.step(actions[i])🧰 Tools
🪛 Ruff (0.14.11)
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 125 - 133, The test is relying on info
keys that the environment currently doesn't populate; update the environment's
step() implementation (the function named step in environment.py) to include the
keys "step_cost", "num_unprocessed_jobs", "num_on_nodes", and
"dropped_this_episode" in the returned info dict with sensible values (defaults
when not applicable) so info.get(...) in the test yields real data, and in the
test (where obs is unused) rename obs to _obs to satisfy static analysis;
alternatively, if you prefer changing the test instead of the env, simplify the
fingerprint to only use float(r) and drop all info.get(...) calls and rename obs
to _obs.
This will be addressed in the following PR.
@enlorenz, understood! Would you like me to create a GitHub issue to track this work for the next PR?
test/test_sanity_env.py
Outdated
# ---- Node bounds ----
assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
Don't use magic numbers here, use the imported constants instead.
Force-pushed from 5a2729d to 7a445c5
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
train.py (1)
200-212: Fix attribute access paths for metrics in evaluation output.
The code at lines 200-212 accesses `env.total_cost`, `env.baseline_cost`, `env.jobs_completed`, and `env.jobs_submitted` directly on the environment object. However, these attributes are stored in the `MetricsTracker` instance at `env.metrics`. There are no property decorators or `__getattr__` methods in `ComputeClusterEnv` to proxy these attributes, so accessing them directly will cause an `AttributeError` at runtime.
Change the attribute access to use `env.metrics.*` instead:
- `env.total_cost` → `env.metrics.total_cost`
- `env.baseline_cost` → `env.metrics.baseline_cost`
- `env.baseline_cost_off` → `env.metrics.baseline_cost_off`
- `env.jobs_completed` → `env.metrics.jobs_completed`
- `env.jobs_submitted` → `env.metrics.jobs_submitted`
- `env.max_queue_size_reached` → `env.metrics.max_queue_size_reached`
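A self-contained illustration of the corrected access pattern; the `MetricsTracker` fields, stub classes, and print layout below are stand-ins, and only the `env.metrics.*` attribute names come from the comment above:

```python
from dataclasses import dataclass

@dataclass
class MetricsTracker:           # stand-in for the real tracker in the repo
    total_cost: float = 0.0
    baseline_cost: float = 0.0
    jobs_completed: int = 0
    jobs_submitted: int = 0

@dataclass
class EnvStub:                  # stand-in for ComputeClusterEnv
    metrics: MetricsTracker

env = EnvStub(MetricsTracker(total_cost=120.0, baseline_cost=150.0,
                             jobs_completed=90, jobs_submitted=100))

# Read evaluation metrics from env.metrics.*, not from env.* directly.
completion_rate = env.metrics.jobs_completed / max(env.metrics.jobs_submitted, 1)
savings = env.metrics.baseline_cost - env.metrics.total_cost
print(f"cost={env.metrics.total_cost:.2f} baseline={env.metrics.baseline_cost:.2f} "
      f"savings={savings:.2f} completion={completion_rate:.1%}")
```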
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 289-291: The list comprehension that builds labels uses an
ambiguous single-letter variable `l`; rename it to a clearer identifier (e.g.,
`line`) to avoid confusion and improve readability—update the comprehension in
the block where `lines = line1 + line1b + line2 + line2b` (the `labels =
[l.get_label() for l in lines]` line) to use `line` and ensure subsequent use in
`ax1.legend(lines, labels, ...)` remains correct.
In `@test/test_sanity_env.py`:
- Around line 204-205: Adjust the indentation for the comment line containing
'Train.py passes strings; the env treats "" as falsy in some places and truthy
in others.' to use 4 leading spaces (matching the surrounding block) instead of
5 so the comment aligns with the rest of the indented block in
test/test_sanity_env.py; locate that exact comment string and correct its
leading whitespace.
In `@train.py`:
- Line 294: The tb_log_name argument in the model.learn call is using an
unnecessary f-string; update the invocation of model.learn (the call that
currently passes tb_log_name=f"PPO" and callback=ComputeClusterCallback()) to
pass a plain string tb_log_name="PPO" instead of the f-prefixed string to avoid
the redundant f-string usage.
- Around line 152-159: The print statement before creating a new PPO model uses
an unnecessary f-string: change the print call in the new-model branch (the one
that constructs model = PPO('MultiInputPolicy', env, ...)) to use a normal
string literal (remove the leading f) so it becomes print("Starting a new model
training..."), leaving the surrounding PPO instantiation and its parameters
(ent_coef, n_steps, batch_size, tensorboard_log) unchanged.
♻️ Duplicate comments (2)
train_iter.py (1)
114-122: `result.stdout` is always `None` when not capturing output.
Since `capture_output=True` is not set in `subprocess.run()`, `result.stdout` will always be `None`. The return statement is misleading. Consider returning `None` explicitly or the return code if that's more useful.
🔧 Suggested fix
 result = subprocess.run(command, text=True, env=current_env)
 if result.returncode != 0:
     print("Error occurred: train.py returned a non-zero exit code.")
-return result.stdout
+return result.returncode == 0  # or simply: return None
test/test_sanity_env.py (1)
125-125: Prefix unused variables with underscore.
Static analysis correctly flags `obs` as unused. Apply the same fix to other flagged variables.
🔧 Suggested fix for multiple locations
 # Line 125
-obs, r, term, trunc, info = env.step(actions[i])
+_obs, r, term, trunc, info = env.step(actions[i])
 # Line 242
-nodes, q, prices = _extract(obs)
+_nodes, q, _prices = _extract(obs)
 # Lines 298-302
-o1, _ = env.reset(seed=seed)
-o1s, r1, t1, tr1, i1 = env.step(action)
-o2, _ = env.reset(seed=seed)
-o2s, r2, t2, tr2, i2 = env.step(action)
+_o1, _ = env.reset(seed=seed)
+o1s, r1, _t1, _tr1, i1 = env.step(action)
+_o2, _ = env.reset(seed=seed)
+o2s, r2, _t2, _tr2, i2 = env.step(action)
 # Line 340
-obs, r, term, trunc, info = env.step(action)
+obs, _r, term, trunc, _info = env.step(action)
🧹 Nitpick comments (6)
train.py (3)
262-264: Replace hardcoded `max_nodes=335` with config constant.
The hardcoded value is fragile and will diverge if `MAX_NODES` changes. Import and use the config constant instead.
♻️ Suggested fix
Add to imports at the top:
from src.config import MAX_NODES
Then use it in both locations (lines 263 and 304):
- max_nodes=335,
+ max_nodes=MAX_NODES,
276-277: Consider narrowing exception handling or adding traceback logging.
The broad `except Exception` catches are intentional to prevent dashboard failures from crashing training, but they may silently hide unexpected errors. Consider logging the full traceback at debug level or catching more specific exceptions (e.g., `OSError`, `ValueError`, matplotlib exceptions).
♻️ Optional improvement
+import traceback
 ...
 except Exception as e:
-    print(f"Could not generate dashboard plot: {e}")
+    print(f"Could not generate dashboard plot: {e}")
+    traceback.print_exc()  # or use logging.debug
287-295: Progress message is redundant and potentially confusing.
The "Running... at {iters + 1}" message prints before `model.learn()` starts, but line 285 already prints "Training iteration {iters + 1}...". Consider removing the redundant message or moving it to after the learn call if it's meant to show progress during long runs.
train_iter.py (1)
178-178: Add spaces after commas for readability.
🔧 Suggested fix
- run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, args.iter_limit_per_step, args.session, args.prices, args.job_durations, args.jobs, args.hourly_jobs,plot_dashboard=args.plot_dashboard,dashboard_hours=args.dashboard_hours)
+ run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, args.iter_limit_per_step, args.session, args.prices, args.job_durations, args.jobs, args.hourly_jobs, plot_dashboard=args.plot_dashboard, dashboard_hours=args.dashboard_hours)
test/test_sanity_env.py (1)
56-57: Replace magic numbers with imported constants.
The value 170 should be `MAX_JOB_DURATION` from `src.config`. This was noted in past review feedback about using imported constants instead of magic numbers.
♻️ Suggested fix
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
Similar changes needed for:
- Line 65: 96 → CORES_PER_NODE
- Line 71: 96 → CORES_PER_NODE
- Line 84: 170 → MAX_JOB_DURATION
- Line 85: 168 → appropriate constant or MAX_JOB_AGE if available
- Line 90: 16 → MAX_NODES_PER_JOB
- Line 91: 96 → CORES_PER_NODE
src/plotter.py (1)
81-86: Document the intentionally unused `episode_costs` parameter more prominently.
The docstring mentions backwards compatibility, but consider using `_episode_costs` to silence linter warnings or add `# noqa: ARG001`.
♻️ Suggested fix
-def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
+def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):  # noqa: ARG001
Or rename to indicate it's intentionally unused:
-def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
+def plot_dashboard(env, num_hours, max_nodes, _episode_costs=None, save=True, show=True, suffix=""):
lines = line1 + line1b + line2 + line2b
labels = [l.get_label() for l in lines]
ax1.legend(lines, labels, loc="center right", fontsize=9)
Rename ambiguous variable l to line.
The variable name l is ambiguous and can be confused with 1 (one).
🔧 Suggested fix
# Combine legends
lines = line1 + line1b + line2 + line2b
- labels = [l.get_label() for l in lines]
+ labels = [line.get_label() for line in lines]
ax1.legend(lines, labels, loc="center right", fontsize=9)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
lines = line1 + line1b + line2 + line2b
labels = [l.get_label() for l in lines]
ax1.legend(lines, labels, loc="center right", fontsize=9)
lines = line1 + line1b + line2 + line2b
labels = [line.get_label() for line in lines]
ax1.legend(lines, labels, loc="center right", fontsize=9)
🧰 Tools
🪛 Ruff (0.14.13)
290-290: Ambiguous variable name: l
(E741)
🤖 Prompt for AI Agents
In `@src/plotter.py` around lines 289 - 291, The list comprehension that builds
labels uses an ambiguous single-letter variable `l`; rename it to a clearer
identifier (e.g., `line`) to avoid confusion and improve readability—update the
comprehension in the block where `lines = line1 + line1b + line2 + line2b` (the
`labels = [l.get_label() for l in lines]` line) to use `line` and ensure
subsequent use in `ax1.legend(lines, labels, ...)` remains correct.
# Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
Fix inconsistent indentation.
Line 205 has extra leading space (5 spaces instead of 4), causing misalignment.
🔧 Suggested fix
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 204 - 205, Adjust the indentation for
the comment line containing 'Train.py passes strings; the env treats "" as falsy
in some places and truthy in others.' to use 4 leading spaces (matching the
surrounding block) instead of 5 so the comment aligns with the rest of the
indented block in test/test_sanity_env.py; locate that exact comment string and
correct its leading whitespace.
    model = PPO.load(latest_model_file, env=env, tensorboard_log=log_dir
                     , n_steps=64, batch_size=64
                     )
else:
    print(f"Starting a new model training...")
    model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_dir, ent_coef=args.ent_coef)
    model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_dir, ent_coef=args.ent_coef
                , n_steps=64, batch_size=64
                )
Remove unnecessary f-prefix from string literals.
Line 156 uses f"Starting a new model training..." without any placeholders. The f prefix is unnecessary.
🔧 Suggested fix
- print(f"Starting a new model training...")
+ print("Starting a new model training...")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
    model = PPO.load(latest_model_file, env=env, tensorboard_log=log_dir
                     , n_steps=64, batch_size=64
                     )
else:
    print(f"Starting a new model training...")
    model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_dir, ent_coef=args.ent_coef)
    model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_dir, ent_coef=args.ent_coef
                , n_steps=64, batch_size=64
                )
    model = PPO.load(latest_model_file, env=env, tensorboard_log=log_dir
                     , n_steps=64, batch_size=64
                     )
else:
    print("Starting a new model training...")
    model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_dir, ent_coef=args.ent_coef
                , n_steps=64, batch_size=64
                )
🧰 Tools
🪛 Ruff (0.14.13)
156-156: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In `@train.py` around lines 152 - 159, The print statement before creating a new
PPO model uses an unnecessary f-string: change the print call in the new-model
branch (the one that constructs model = PPO('MultiInputPolicy', env, ...)) to
use a normal string literal (remove the leading f) so it becomes print("Starting
a new model training..."), leaving the surrounding PPO instantiation and its
parameters (ent_coef, n_steps, batch_size, tensorboard_log) unchanged.
| print(f"iterations limit ({args.iter_limit}) reached: {iters}.") | ||
| break | ||
| try: | ||
| model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name=f"PPO", callback=ComputeClusterCallback()) |
Remove unnecessary f-prefix from tb_log_name.
🔧 Suggested fix
- model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name=f"PPO", callback=ComputeClusterCallback())
+ model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name="PPO", callback=ComputeClusterCallback())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name=f"PPO", callback=ComputeClusterCallback())
model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name="PPO", callback=ComputeClusterCallback())
🧰 Tools
🪛 Ruff (0.14.13)
294-294: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In `@train.py` at line 294, The tb_log_name argument in the model.learn call is
using an unnecessary f-string; update the invocation of model.learn (the call
that currently passes tb_log_name=f"PPO" and callback=ComputeClusterCallback())
to pass a plain string tb_log_name="PPO" instead of the f-prefixed string to
avoid the redundant f-string usage.
… for env. test_sanity_env.py is a standalone sanity/check harness that builds the env like train.py and runs invariants, gym compliance, and determinism checks with CLI options and optional external data. test_env.py is a minimal smoke test that instantiates the env with fixed weights and runs random actions, printing rewards; no invariants, determinism, or CLI control.
…raining. Add plotter.py with reusable plotting helpers for per-hour dashboards and cumulative savings analysis. Update train.py to import plotter utilities and add CLI flags to drive dashboard/cumulative plots. Hook plotting into evaluation and training flows, including saving figures to session plots. Addendum: Added plotter logic in train_iter.py
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 90-94: The header calculations mix direct env attributes with
env.metrics properties causing incorrect zeroing; update the guard checks and
the baseline wait numerator to consistently use env.metrics (e.g., use
getattr(env.metrics, "jobs_submitted", 0), getattr(env.metrics,
"baseline_jobs_submitted", 0), getattr(env.metrics, "jobs_completed", 0),
getattr(env.metrics, "baseline_jobs_completed", 0)) and change baseline_avg_wait
to use env.metrics.baseline_total_job_wait_time so that completion_rate,
baseline_completion_rate, avg_wait, and baseline_avg_wait all reference metrics
on env.metrics consistently.
- Around line 81-86: The function plot_dashboard currently accepts episode_costs
for compatibility but doesn't use it, which triggers Ruff ARG001; to silence the
linter without changing the public signature, add an explicit acknowledgement
inside plot_dashboard such as a single line "_ = episode_costs" (or "del
episode_costs" if you prefer) at the top of the function body so the parameter
is marked as intentionally unused while leaving the signature intact.
In `@test/test_sanity_env.py`:
- Around line 298-302: Rename unused tuple bindings returned from env.reset and
env.step to start with an underscore so linters/readers know they are
intentionally unused; e.g., change o1, _ = env.reset(seed=seed) to _o1, _ =
env.reset(seed=seed) and change o1s, r1, t1, tr1, i1 = env.step(action) to o1s,
r1, _t1, _tr1, _info1 = env.step(action) (and similarly for the second
reset/step: _o2, _ = env.reset(...), o2s, r2, _t2, _tr2, _info2 =
env.step(...)); apply the same pattern for the other occurrence around line
~340.
- Around line 242-243: The call to _extract assigns nodes, q, prices but nodes
and prices are unused; update the assignment from _extract(...) to mark unused
outputs by prefixing them with underscores (e.g., _nodes and _prices or use _ ,
q, _ ) so Ruff stops flagging unused variables—modify the assignment where
_extract is called to only keep q as the used variable and mark the others as
unused.
In `@train.py`:
- Around line 272-273: The broad except Exception blocks that print "Could not
generate dashboard plot" should be made explicit: either narrow the exception to
the specific errors you expect (e.g., ValueError, RuntimeError,
matplotlib-specific exceptions) and log the full traceback using
logging.exception, or, if you intentionally want to swallow all exceptions for
non-fatal plotting, annotate those except lines with "# noqa: BLE001" and
replace print with a proper logger call; apply this change to the handler that
prints "Could not generate dashboard plot" and the similar handler around the
lines referenced (the second except that prints the same message).
- Around line 283-285: The logging condition is off-by-one because the loop
variable iters has already been incremented; change the condition from
(iters+1)%10==0 to iters%10==0 so logs occur at 10, 20, …, and keep the print's
displayed step count consistent (use iters or iters+1 as intended)—update the
check around the print statement that references iters and STEPS_PER_ITERATION
accordingly.
♻️ Duplicate comments (4)
test/test_sanity_env.py (2)
83-91: Prefer imported constants over hard-coded bounds.
Using 170/168/16/96 can drift from config; you already import MAX_JOB_DURATION, EPISODE_HOURS, MAX_NODES_PER_JOB, and CORES_PER_NODE.
♻️ Suggested refactor
- assert np.all((dur >= 0) & (dur <= 170)), f"duration out of bounds min={dur.min()} max={dur.max()}"
- assert np.all((age >= 0) & (age <= 168)), f"age out of bounds min={age.min()} max={age.max()}"
+ assert np.all((dur >= 0) & (dur <= MAX_JOB_DURATION)), f"duration out of bounds min={dur.min()} max={dur.max()}"
+ assert np.all((age >= 0) & (age <= EPISODE_HOURS)), f"age out of bounds min={age.min()} max={age.max()}"
@@
- assert np.all((nn[active] >= 1) & (nn[active] <= 16)), f"job nnodes out of bounds nn={nn[active]}"
- assert np.all((cpn[active] >= 1) & (cpn[active] <= 96)), f"cores_per_node out of bounds cpn={cpn[active]}"
+ assert np.all((nn[active] >= 1) & (nn[active] <= MAX_NODES_PER_JOB)), f"job nnodes out of bounds nn={nn[active]}"
+ assert np.all((cpn[active] >= 1) & (cpn[active] <= CORES_PER_NODE)), f"cores_per_node out of bounds cpn={cpn[active]}"
124-126: Avoid unused obs binding in rollout.
Rename to _obs to satisfy Ruff.
🔧 Suggested fix
- obs, r, term, trunc, info = env.step(actions[i])
+ _obs, r, term, trunc, info = env.step(actions[i])
train_iter.py (1)
118-122: result.stdout is always None without capture.
Line 119 calls subprocess.run(...) without capture_output/stdout=PIPE, so result.stdout will always be None. Either enable capture if you intend to return output, or return None to avoid a misleading return value.
🔧 Proposed fix (keep streaming output)
- return result.stdout
+ return None
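For reference, a minimal sketch of the capture variant, assuming the caller actually consumes the output (the run_training name and argument list are illustrative, not code from this PR):

import subprocess
import sys

def run_training(extra_args):
    # capture_output=True populates result.stdout (at the cost of live streaming);
    # text=True decodes bytes to str, check=True raises on a non-zero exit code.
    result = subprocess.run(
        [sys.executable, "train.py", *extra_args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout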
src/plotter.py (1)
289-291: Rename single-letter l to avoid ambiguity.
🔧 Suggested fix
- labels = [l.get_label() for l in lines]
+ labels = [line.get_label() for line in lines]
🧹 Nitpick comments (1)
train.py (1)
255-271: Avoid hard-coded max_nodes=335 in dashboard plots.
Hard-coding can mis-scale plots on different cluster sizes. Consider deriving from env.max_nodes/env.num_nodes (once per block) and reusing it.
♻️ Suggested refactor
  if args.plot_dashboard:
      try:
+         max_nodes = getattr(env, "max_nodes", getattr(env, "num_nodes", 0))
          plot_dashboard(
              env,
              num_hours=args.dashboard_hours,
-             max_nodes=335,
+             max_nodes=max_nodes,
@@
          )
      except Exception as e:
          print(f"Could not generate dashboard plot: {e}")
@@
  if args.plot_dashboard:
      try:
+         max_nodes = getattr(env, "max_nodes", getattr(env, "num_nodes", 0))
          plot_dashboard(
              env,
              num_hours=args.dashboard_hours,
-             max_nodes=335,
+             max_nodes=max_nodes,
              episode_costs=[
Also applies to: 296-312
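A minimal sketch of that derivation, assuming the environment exposes max_nodes or num_nodes (the _resolve_max_nodes helper name is invented here for illustration):

def _resolve_max_nodes(env, default=0):
    # Prefer the environment's own node ceiling over a hard-coded 335.
    return getattr(env, "max_nodes", getattr(env, "num_nodes", default))

max_nodes = _resolve_max_nodes(env)
plot_dashboard(env, num_hours=args.dashboard_hours, max_nodes=max_nodes)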
def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
    """
    Per-hour dashboard: price, nodes, queue, reward components, etc.
    NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
    Cumulative savings now lives in plot_cumulative_savings().
    """
Silence unused episode_costs (kept for compatibility).
Ruff will flag ARG001; consider acknowledging it explicitly.
🔧 Suggested fix
def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
"""
@@
"""
+ _ = episode_costs  # kept for backwards compatibility
🧰 Tools
🪛 Ruff (0.14.13)
81-81: Unused function argument: episode_costs
(ARG001)
🤖 Prompt for AI Agents
In `@src/plotter.py` around lines 81 - 86, The function plot_dashboard currently
accepts episode_costs for compatibility but doesn't use it, which triggers Ruff
ARG001; to silence the linter without changing the public signature, add an
explicit acknowledgement inside plot_dashboard such as a single line "_ =
episode_costs" (or "del episode_costs" if you prefer) at the top of the function
body so the parameter is marked as intentionally unused while leaving the
signature intact.
completion_rate = (env.metrics.jobs_completed / env.metrics.jobs_submitted * 100) if getattr(env, "jobs_submitted", 0) > 0 else 0.0
baseline_completion_rate = (env.metrics.baseline_jobs_completed / env.metrics.baseline_jobs_submitted * 100) if getattr(env, "baseline_jobs_submitted", 0) > 0 else 0.0
avg_wait = (env.metrics.total_job_wait_time / env.metrics.jobs_completed) if getattr(env, "jobs_completed", 0) > 0 else 0.0
baseline_avg_wait = (env.baseline_total_job_wait_time / env.metrics.baseline_jobs_completed) if getattr(env, "baseline_jobs_completed", 0) > 0 else 0.0
Use env.metrics consistently in header calculations.
The guards read env.* while numerators use env.metrics.*, which can zero out rates if those env attributes don’t exist. Also, baseline_total_job_wait_time likely lives under env.metrics.
🔧 Suggested fix
- completion_rate = (env.metrics.jobs_completed / env.metrics.jobs_submitted * 100) if getattr(env, "jobs_submitted", 0) > 0 else 0.0
- baseline_completion_rate = (env.metrics.baseline_jobs_completed / env.metrics.baseline_jobs_submitted * 100) if getattr(env, "baseline_jobs_submitted", 0) > 0 else 0.0
- avg_wait = (env.metrics.total_job_wait_time / env.metrics.jobs_completed) if getattr(env, "jobs_completed", 0) > 0 else 0.0
- baseline_avg_wait = (env.baseline_total_job_wait_time / env.metrics.baseline_jobs_completed) if getattr(env, "baseline_jobs_completed", 0) > 0 else 0.0
+ completion_rate = (env.metrics.jobs_completed / env.metrics.jobs_submitted * 100) if env.metrics.jobs_submitted > 0 else 0.0
+ baseline_completion_rate = (env.metrics.baseline_jobs_completed / env.metrics.baseline_jobs_submitted * 100) if env.metrics.baseline_jobs_submitted > 0 else 0.0
+ avg_wait = (env.metrics.total_job_wait_time / env.metrics.jobs_completed) if env.metrics.jobs_completed > 0 else 0.0
+ baseline_avg_wait = (env.metrics.baseline_total_job_wait_time / env.metrics.baseline_jobs_completed) if env.metrics.baseline_jobs_completed > 0 else 0.0
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- completion_rate = (env.metrics.jobs_completed / env.metrics.jobs_submitted * 100) if getattr(env, "jobs_submitted", 0) > 0 else 0.0
- baseline_completion_rate = (env.metrics.baseline_jobs_completed / env.metrics.baseline_jobs_submitted * 100) if getattr(env, "baseline_jobs_submitted", 0) > 0 else 0.0
- avg_wait = (env.metrics.total_job_wait_time / env.metrics.jobs_completed) if getattr(env, "jobs_completed", 0) > 0 else 0.0
- baseline_avg_wait = (env.baseline_total_job_wait_time / env.metrics.baseline_jobs_completed) if getattr(env, "baseline_jobs_completed", 0) > 0 else 0.0
+ completion_rate = (env.metrics.jobs_completed / env.metrics.jobs_submitted * 100) if env.metrics.jobs_submitted > 0 else 0.0
+ baseline_completion_rate = (env.metrics.baseline_jobs_completed / env.metrics.baseline_jobs_submitted * 100) if env.metrics.baseline_jobs_submitted > 0 else 0.0
+ avg_wait = (env.metrics.total_job_wait_time / env.metrics.jobs_completed) if env.metrics.jobs_completed > 0 else 0.0
+ baseline_avg_wait = (env.metrics.baseline_total_job_wait_time / env.metrics.baseline_jobs_completed) if env.metrics.baseline_jobs_completed > 0 else 0.0
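One hedged way to keep the guards and numerators on the same object is a small helper; the _safe_ratio name is invented here for illustration, and the attribute names follow the suggested fix above:

def _safe_ratio(numerator, denominator, scale=1.0):
    # Return 0.0 instead of dividing by zero when nothing was recorded.
    return (numerator / denominator) * scale if denominator > 0 else 0.0

m = env.metrics
completion_rate = _safe_ratio(m.jobs_completed, m.jobs_submitted, scale=100)
baseline_completion_rate = _safe_ratio(m.baseline_jobs_completed, m.baseline_jobs_submitted, scale=100)
avg_wait = _safe_ratio(m.total_job_wait_time, m.jobs_completed)
baseline_avg_wait = _safe_ratio(m.baseline_total_job_wait_time, m.baseline_jobs_completed)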
🤖 Prompt for AI Agents
In `@src/plotter.py` around lines 90 - 94, The header calculations mix direct env
attributes with env.metrics properties causing incorrect zeroing; update the
guard checks and the baseline wait numerator to consistently use env.metrics
(e.g., use getattr(env.metrics, "jobs_submitted", 0), getattr(env.metrics,
"baseline_jobs_submitted", 0), getattr(env.metrics, "jobs_completed", 0),
getattr(env.metrics, "baseline_jobs_completed", 0)) and change baseline_avg_wait
to use env.metrics.baseline_total_job_wait_time so that completion_rate,
baseline_completion_rate, avg_wait, and baseline_avg_wait all reference metrics
on env.metrics consistently.
nodes, q, prices = _extract(obs)
Mark unused _extract outputs.
nodes and prices aren’t used here; prefix with _ to satisfy Ruff.
🔧 Suggested fix
- nodes, q, prices = _extract(obs)
+ _, q, _ = _extract(obs)
📝 Committable suggestion
- nodes, q, prices = _extract(obs)
+ _, q, _ = _extract(obs)
🧰 Tools
🪛 Ruff (0.14.13)
242-242: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 242 - 243, The call to _extract assigns
nodes, q, prices but nodes and prices are unused; update the assignment from
_extract(...) to mark unused outputs by prefixing them with underscores (e.g.,
_nodes and _prices or use _ , q, _ ) so Ruff stops flagging unused
variables—modify the assignment where _extract is called to only keep q as the
used variable and mark the others as unused.
o1, _ = env.reset(seed=seed)
o1s, r1, t1, tr1, i1 = env.step(action)

o2, _ = env.reset(seed=seed)
o2s, r2, t2, tr2, i2 = env.step(action)
Prefix unused bindings with _ in the smoke checks.
Several bindings are unused (o1/o2, t1/tr1, t2/tr2, r, info).
🔧 Suggested fix
- o1, _ = env.reset(seed=seed)
- o1s, r1, t1, tr1, i1 = env.step(action)
+ _o1, _ = env.reset(seed=seed)
+ o1s, r1, _t1, _tr1, i1 = env.step(action)
@@
- o2, _ = env.reset(seed=seed)
- o2s, r2, t2, tr2, i2 = env.step(action)
+ _o2, _ = env.reset(seed=seed)
+ o2s, r2, _t2, _tr2, i2 = env.step(action)
@@
- obs, r, term, trunc, info = env.step(action)
+ obs, _r, term, trunc, _info = env.step(action)
Also applies to: 340-340
🧰 Tools
🪛 Ruff (0.14.13)
298-298: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 298 - 302, Rename unused tuple bindings
returned from env.reset and env.step to start with an underscore so
linters/readers know they are intentionally unused; e.g., change o1, _ =
env.reset(seed=seed) to _o1, _ = env.reset(seed=seed) and change o1s, r1, t1,
tr1, i1 = env.step(action) to o1s, r1, _t1, _tr1, _info1 = env.step(action) (and
similarly for the second reset/step: _o2, _ = env.reset(...), o2s, r2, _t2,
_tr2, _info2 = env.step(...)); apply the same pattern for the other occurrence
around line ~340.
except Exception as e:
    print(f"Could not generate dashboard plot: {e}")
Catching bare Exception trips BLE001.
If broad catch is intentional for non-fatal plotting, annotate with # noqa: BLE001 (or narrow to expected exceptions and log a traceback).
🔧 Suggested fix
- except Exception as e:
+ except Exception as e: # noqa: BLE001
print(f"Could not generate dashboard plot: {e}")
@@
- except Exception as e:
+ except Exception as e: # noqa: BLE001
print(f"Dashboard plot failed (non-fatal): {e}")Also applies to: 313-314
🧰 Tools
🪛 Ruff (0.14.13)
272-272: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In `@train.py` around lines 272 - 273, The broad except Exception blocks that
print "Could not generate dashboard plot" should be made explicit: either narrow
the exception to the specific errors you expect (e.g., ValueError, RuntimeError,
matplotlib-specific exceptions) and log the full traceback using
logging.exception, or, if you intentionally want to swallow all exceptions for
non-fatal plotting, annotate those except lines with "# noqa: BLE001" and
replace print with a proper logger call; apply this change to the handler that
prints "Could not generate dashboard plot" and the similar handler around the
lines referenced (the second except that prints the same message).
t0 = time.time()
if (iters+1)%10==0:
    print(f"Running... at {iters + 1} of {STEPS_PER_ITERATION * (iters + 1)} steps")
Logging interval is off by one.
iters is already incremented; (iters + 1) % 10 logs on 9, 19, … rather than 10, 20, …
🔧 Suggested fix
- if (iters+1)%10==0:
+ if iters % 10 == 0:📝 Committable suggestion
  t0 = time.time()
- if (iters+1)%10==0:
+ if iters % 10 == 0:
      print(f"Running... at {iters + 1} of {STEPS_PER_ITERATION * (iters + 1)} steps")
🤖 Prompt for AI Agents
In `@train.py` around lines 283 - 285, The logging condition is off-by-one because
the loop variable iters has already been incremented; change the condition from
(iters+1)%10==0 to iters%10==0 so logs occur at 10, 20, …, and keep the print's
displayed step count consistent (use iters or iters+1 as intended)—update the
check around the print statement that references iters and STEPS_PER_ITERATION
accordingly.
No description provided.