Carry-over states between episodes, instead of resetting everything to 0. #5
base: master
Conversation
📝 Walkthrough

Adds a carry-over state option to the environment, separates cumulative vs. episode-level metrics, and guards episode-level metric updates (baseline and normal paths) to avoid KeyError while threading per-episode scaffolding through reset and step flows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Env as Environment
    participant Jobs as JobManagement
    participant Metrics as MetricsTracker
    User->>Env: __init__(..., carry_over_state)
    Env->>Metrics: reset_cumulative_metrics()
    Env->>Env: reset_timeline_state() / init episode scaffolding
    User->>Env: step(action)
    Env->>Jobs: assign_jobs_to_available_nodes()
    Jobs->>Metrics: update completion/drop/wait-time (guarded if episode keys exist)
    Env->>Metrics: update episode_total_cost & max_queue_size (guarded)
    Env->>User: return (obs, reward, done, info with episode metrics)
    User->>Env: reset()
    alt carry_over_state == True
        Env->>Metrics: reset_episode_metrics() only
    else
        Env->>Metrics: reset_cumulative_metrics()
        Env->>Env: full state reinitialization
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/environment.py (1)
296-301: Episode increment and termination logic breaks in carry-over mode due to unresolved `current_hour` state.

When `carry_over_state=True`, `reset()` calls only `reset_episode_metrics()`, which doesn't reset `current_hour`. Since `reset_timeline_state()` (which sets `current_hour = 0`) is called only once in `__init__`, subsequent episodes in carry-over mode start with `current_hour` at its final value from the previous episode (336). This causes the following problems:

- After episode 1, `current_hour` remains at 336 and increments to 337, 338, etc. in episode 2.
- The termination check at line 495 (`if self.metrics.current_hour == EPISODE_HOURS`) will never trigger again, causing episodes to never terminate after the first one.
- The episode increment check at line 299 (`if self.metrics.current_hour == 1`) also fails in subsequent episodes since `current_hour` never resets to 1.

In non-carry-over mode, `reset()` calls `reset_state_metrics()`, which correctly resets `current_hour = 0`, avoiding this issue.
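To make the failure mode concrete, here is a minimal, self-contained sketch of the reset behaviour this comment argues for. The stub classes are hypothetical stand-ins for the real environment and `MetricsTracker`; only the guard on `carry_over_state` mirrors the PR's diffs.

```python
EPISODE_HOURS = 336  # 2-week episodes, per the project constraints

class MetricsTracker:
    def __init__(self):
        self.current_hour = 0
        self.episode_total_cost = 0.0

    def reset_episode_metrics(self):
        self.episode_total_cost = 0.0  # per-episode counters only

    def reset_state_metrics(self):
        self.current_hour = 0
        self.reset_episode_metrics()

class Env:
    def __init__(self, carry_over_state=False):
        self.carry_over_state = carry_over_state
        self.metrics = MetricsTracker()

    def reset(self):
        if self.carry_over_state:
            # Cumulative totals survive, but the per-episode clock must
            # restart or the EPISODE_HOURS truncation check never fires again.
            self.metrics.current_hour = 0
            self.metrics.reset_episode_metrics()
        else:
            self.metrics.reset_state_metrics()

    def step(self):
        self.metrics.current_hour += 1
        truncated = self.metrics.current_hour == EPISODE_HOURS
        return truncated
```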
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 90-93: The metric calculations currently read attributes directly
from env (completion_rate, baseline_completion_rate, avg_wait,
baseline_avg_wait) but those values live under env.metrics; update each
expression to use env.metrics.jobs_completed, env.metrics.jobs_submitted,
env.metrics.total_job_wait_time and the baseline counterparts
(env.metrics.baseline_jobs_completed, env.metrics.baseline_jobs_submitted,
env.metrics.baseline_total_job_wait_time) and keep the same guard logic (getattr
checks or existence checks) to avoid division by zero.
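The guarded computation the prompt above describes, reduced to a hedged sketch. The attribute names (`jobs_completed`, `jobs_submitted`, `total_job_wait_time`) follow the prompt; the `getattr` fallbacks are an assumption to keep older metrics objects working.

```python
def completion_rate(metrics):
    """Percent of submitted jobs completed; 0 when nothing was submitted."""
    submitted = getattr(metrics, "jobs_submitted", 0)
    completed = getattr(metrics, "jobs_completed", 0)
    return 100.0 * completed / submitted if submitted else 0.0

def avg_wait(metrics):
    """Mean wait in hours over completed jobs; 0 when none completed."""
    completed = getattr(metrics, "jobs_completed", 0)
    total_wait = getattr(metrics, "total_job_wait_time", 0.0)
    return total_wait / completed if completed else 0.0
```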
🧹 Nitpick comments (12)
src/environment.py (1)
218-265: Significant code duplication between `reset()` and `reset_timeline_state()`.

Lines 238-263 in `reset()` duplicate the state initialization logic from `reset_timeline_state()` (lines 270-294). This violates DRY and creates maintenance risk: if the state structure changes, both locations must be updated.

♻️ Suggested refactor to eliminate duplication

Consider extracting the common state initialization to a private helper method:

```diff
+    def _init_state_structures(self):
+        """Initialize state structures for nodes, queues, and job tracking."""
+        self.state = {
+            'nodes': np.zeros(MAX_NODES, dtype=np.int32),
+            'job_queue': np.zeros((MAX_QUEUE_SIZE * 4), dtype=np.int32),
+            'predicted_prices': self.prices.predicted_prices.copy(),
+        }
+        self.baseline_state = {
+            'nodes': np.zeros(MAX_NODES, dtype=np.int32),
+            'job_queue': np.zeros((MAX_QUEUE_SIZE * 4), dtype=np.int32),
+        }
+        self.cores_available = np.full(MAX_NODES, CORES_PER_NODE, dtype=np.int32)
+        self.baseline_cores_available = np.full(MAX_NODES, CORES_PER_NODE, dtype=np.int32)
+        self.running_jobs = {}
+        self.baseline_running_jobs = {}
+        self.next_job_id = 0
+        self.next_empty_slot = 0
+        self.baseline_next_empty_slot = 0

     def reset_timeline_state(self):
         self.metrics.current_hour = 0
-        self.state = {
-            # ... all the initialization code ...
-        }
-        # ... rest of initialization ...
+        self._init_state_structures()

     def reset(self, seed=None, options=None):
         # ... existing code ...
         if not self.carry_over_state:
             # ... price reset logic ...
-            self.state = {
-                # ... all the initialization code ...
-            }
-            # ... rest of initialization ...
+            self._init_state_structures()
```

src/plotter.py (3)
46-58: Potential off-by-one in monthly percentage calculation.

The monthly percentage is calculated every 2 episodes (assuming 2-week episodes = 1 month), but the values are duplicated twice (`extend([pct, pct])`). This creates a step-like visualization but may cause array length mismatches. The padding logic at lines 66-70 handles this, but it's fragile.

Consider documenting this behavior more clearly or simplifying by computing monthly percentages at the end rather than during iteration, as sketched below.
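One possible shape of the end-of-run computation, as a sketch. The function name is hypothetical, and `episodes_per_month=2` encodes the same 2-week-episode assumption the current code makes.

```python
import numpy as np

def monthly_savings_pct(agent_costs, baseline_costs, episodes_per_month=2):
    """Aggregate per-episode costs into months, then take savings percentages."""
    agent = np.asarray(agent_costs, dtype=float)
    base = np.asarray(baseline_costs, dtype=float)
    n = (len(agent) // episodes_per_month) * episodes_per_month  # drop partial month
    agent_m = agent[:n].reshape(-1, episodes_per_month).sum(axis=1)
    base_m = base[:n].reshape(-1, episodes_per_month).sum(axis=1)
    safe_base = np.where(base_m == 0, np.nan, base_m)  # guard division by zero
    return 100.0 * (base_m - agent_m) / safe_base
```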
81-81: Unused parameter `episode_costs` should be removed.

The docstring at lines 84-85 acknowledges this is kept for backward compatibility. Consider removing it entirely if it's truly unused, or add `_ = episode_costs` to silence the warning if backward compatibility is required.

♻️ Proposed fix

```diff
-def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
+def plot_dashboard(env, num_hours, max_nodes, save=True, show=True, suffix=""):
     """
     Per-hour dashboard: price, nodes, queue, reward components, etc.
-    NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
-    Cumulative savings now lives in plot_cumulative_savings().
     """
```

Or if backward compatibility is truly needed:

```diff
 def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
+    del episode_costs  # Unused, kept for backward compatibility
```
290-290: Ambiguous variable name `l`.

The variable `l` can be confused with `1` (one) in certain fonts. Use a more descriptive name.

♻️ Proposed fix

```diff
-    labels = [l.get_label() for l in lines]
+    labels = [line.get_label() for line in lines]
```

train.py (4)
156-159: Remove extraneous f-prefix from plain string.

Line 156 uses an f-string without any placeholders.

Suggested fix

```diff
-    print(f"Starting a new model training...")
+    print("Starting a new model training...")
```
259-275: Avoid hardcoding `max_nodes=335`; retrieve from environment.

The hardcoded value will break if the environment's node count changes. The commented-out code shows the right approach.

Suggested fix

```diff
     plot_dashboard(
         env,
         num_hours=args.dashboard_hours,
-        # max_nodes=env.max_nodes if hasattr(env, "max_nodes") else env.num_nodes if hasattr(env, "num_nodes") else 0,
-        max_nodes=335,
+        max_nodes=getattr(env, "max_nodes", getattr(env, "num_nodes", 335)),
```
287-295: Minor issues in training loop instrumentation.

- Lines 288-289: The progress log is confusingly worded; it says "at {iters + 1}" but `iters` was already incremented on line 286.
- Line 294: f-string without placeholders in `tb_log_name=f"PPO"`.

Suggested fixes

```diff
     iters += 1
     t0 = time.time()
-    if (iters+1)%10==0:
-        print(f"Running... at {iters + 1} of {STEPS_PER_ITERATION * (iters + 1)} steps")
+    if iters % 10 == 0:
+        print(f"Running iteration {iters}, cumulative steps: {STEPS_PER_ITERATION * iters}")
     if args.iter_limit > 0 and iters > args.iter_limit:
         print(f"iterations limit ({args.iter_limit}) reached: {iters}.")
         break
     try:
-        model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name=f"PPO", callback=ComputeClusterCallback())
+        model.learn(total_timesteps=STEPS_PER_ITERATION, reset_num_timesteps=False, tb_log_name="PPO", callback=ComputeClusterCallback())
```
152-159: Consider increasing `n_steps` or documenting the rationale for the small value.

`n_steps=64` is significantly smaller than stable-baselines3's default of 2048. For this environment with long episodes (336 hours), SB3 best practices recommend larger `n_steps` (1024-2048) to reduce correlated data and improve sample efficiency. The current configuration results in ~5 updates per episode, compared to ~1 per 6 episodes with the default, which increases variance and may impact learning stability. If the intent is faster iteration for development/testing, consider adding a comment explaining the choice or making it configurable via a command-line argument, as sketched below.
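For illustration, a minimal way to expose `n_steps` as a flag using the real stable-baselines3 API. The `--n-steps` name and the CartPole stand-in environment are assumptions, not the project's actual wiring.

```python
import argparse
import gymnasium as gym
from stable_baselines3 import PPO

parser = argparse.ArgumentParser()
parser.add_argument("--n-steps", type=int, default=2048,
                    help="PPO rollout length; SB3 default is 2048")
args = parser.parse_args()

env = gym.make("CartPole-v1")  # stand-in for the compute-cluster env
model = PPO("MlpPolicy", env, n_steps=args.n_steps)
model.learn(total_timesteps=args.n_steps * 4)  # a few rollout buffers' worth
```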
175-178: Consider improving readability of long function call.

Line 178 is quite long and could benefit from line breaks for readability.

Suggested improvement

```diff
 for combo in combinations:
     efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight = combo
     print(f"Running with weights: efficiency={efficiency_weight}, price={price_weight}, idle={idle_weight}, job_age={job_age_weight}, drop={drop_weight}")
-    run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, args.iter_limit_per_step, args.session, args.prices, args.job_durations, args.jobs, args.hourly_jobs,plot_dashboard=args.plot_dashboard,dashboard_hours=args.dashboard_hours)
+    run(
+        efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight,
+        args.iter_limit_per_step, args.session,
+        args.prices, args.job_durations, args.jobs, args.hourly_jobs,
+        plot_dashboard=args.plot_dashboard,
+        dashboard_hours=args.dashboard_hours,
+    )
```

test/test_sanity_env.py (3)
48-101: Consider using imported constants instead of hardcoded values.

The invariant checks use hardcoded values (170, 96, 168, 16) that should match environment constants. The comment at line 84 acknowledges this, but since you already import `MAX_JOB_DURATION`, `MAX_NODES_PER_JOB`, `CORES_PER_NODE`, etc., consider using them for consistency.

Example for one check

```diff
-    assert np.all((dur >= 0) & (dur <= 170)), f"duration out of bounds min={dur.min()} max={dur.max()}"
+    assert np.all((dur >= 0) & (dur <= MAX_JOB_DURATION)), f"duration out of bounds min={dur.min()} max={dur.max()}"
```
243-246: Fix indentation of nested function.

The `norm_path` function has inconsistent indentation (extra leading space).

Suggested fix

```diff
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
     # To be safe: normalize "" -> None here.
     def norm_path(x):
         return None if (x is None or str(x).strip() == "") else x
```
336-359: Consider removing or documenting inline debugging test.

Lines 336-359 appear to be inline debugging/verification code that duplicates the formal `determinism_test`. If this is intentional for quick sanity checks, consider adding a comment explaining its purpose, or removing it if it's leftover debugging code.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
src/baseline.py, src/environment.py, src/job_management.py, src/metrics_tracker.py, src/plotter.py, test/test_sanity_env.py, train.py, train_iter.py
🧰 Additional context used
📓 Path-based instructions (2)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
src/job_management.py, src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train_iter.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
train_iter.py, train.py, src/plotter.py, src/environment.py, src/metrics_tracker.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Applied to files:
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
src/environment.py
🧬 Code graph analysis (3)
train_iter.py (1)
src/plotter.py (1)
plot_dashboard(81-222)
train.py (2)
src/plotter.py (2)
plot_dashboard (81-222), plot_cumulative_savings (225-316)
src/plot.py (1)
plot_cumulative_savings(136-276)
src/environment.py (3)
src/metrics_tracker.py (2)
reset_episode_metrics (43-80), reset_state_metrics (38-41)
test/test_sanity_env.py (1)
reset (321-327)
src/prices_deterministic.py (1)
reset(52-76)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
126-126: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
281-281: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
281-281: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
341-341: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
342-342: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
342-342: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
344-344: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
345-345: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
345-345: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
383-383: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
383-383: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
train_iter.py
119-119: subprocess call: check for execution of untrusted input
(S603)
train.py
156-156: f-string without any placeholders
Remove extraneous f prefix
(F541)
276-276: Do not catch blind exception: Exception
(BLE001)
294-294: f-string without any placeholders
Remove extraneous f prefix
(F541)
317-317: Do not catch blind exception: Exception
(BLE001)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
290-290: Ambiguous variable name: l
(E741)
🔇 Additional comments (23)
src/environment.py (3)
175-176: Initialization order may cause issues with `self.prices`.

`reset_timeline_state()` at line 175 accesses `self.prices.predicted_prices` (line 276), but `self.prices` is initialized at line 121, so this should work. However, the call order means the state is initialized twice: once in `reset_timeline_state()` during `__init__`, and again in `reset()` when the environment is first reset. This is intentional for the carry-over flow but worth noting.
331-333: LGTM - Episode-level metrics propagation is consistent.

The episode-level metrics (`episode_jobs_submitted`, `episode_jobs_rejected_queue_full`, `episode_jobs_completed`, `episode_total_job_wait_time`, `episode_jobs_dropped`) are correctly incremented alongside their cumulative counterparts, maintaining parallel tracking.

Also applies to: 377-379
401-402: LGTM - Baseline and cost episode metrics are properly tracked.

Episode-level baseline metrics and costs are correctly propagated from the `baseline_metrics` dict and accumulated into `episode_total_cost`, `episode_baseline_cost`, and `episode_baseline_cost_off`.

Also applies to: 437-448, 446-447, 468-470
src/metrics_tracker.py (3)
7-14: LGTM - Clean initialization pattern.

The constructor properly separates cumulative metrics initialization from state/episode metrics, establishing a clear hierarchy.
62-76: LGTM - Comprehensive episode-level metrics.

The episode-level metrics for both agent and baseline are well-structured and mirror the cumulative metrics, enabling accurate per-episode reporting.
93-101: LGTM - Correct episode-level calculations.

The calculations for `avg_wait_time`, `completion_rate`, and `drop_rate` properly use episode-level counters with appropriate zero-division guards.

src/job_management.py (2)
159-167: LGTM - Backward-compatible episode metrics updates.

The presence checks (`if 'episode_*' in metrics`) allow this function to work with both old and new metrics dictionaries, maintaining backward compatibility; a minimal illustration follows.
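The pattern being praised here, reduced to a minimal illustration. The dict keys mirror the review's naming; the helper name is hypothetical.

```python
def record_completion(metrics, wait_hours):
    # Cumulative counters always exist.
    metrics["jobs_completed"] += 1
    metrics["total_job_wait_time"] += wait_hours
    # Episode-level keys may be absent in pre-carry-over metrics dicts,
    # so guard before touching them to avoid KeyError.
    if "episode_jobs_completed" in metrics:
        metrics["episode_jobs_completed"] += 1
        metrics["episode_total_job_wait_time"] += wait_hours

old_style = {"jobs_completed": 0, "total_job_wait_time": 0.0}
record_completion(old_style, wait_hours=3.5)  # works without episode keys
```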
187-193: LGTM - Consistent pattern for drop metrics.

The episode-level drop tracking mirrors the completion tracking pattern, with proper presence checks for both baseline and non-baseline paths.
src/baseline.py (2)
42-46: LGTM - Episode metrics for baseline submissions.

Correctly increments `episode_baseline_jobs_submitted` and `episode_baseline_jobs_rejected_queue_full` with presence checks for backward compatibility.
61-63: LGTM - Episode max queue size tracking.

Properly tracks per-episode maximum queue size for baseline, with the inner condition correctly checking against the episode-level threshold.
src/plotter.py (2)
8-16: LGTM - Robust series normalization helper.

The `_as_series` function correctly handles `None`, arrays of varying lengths, and padding with `NaN` values; a hedged reconstruction of the idea follows.
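For readers without the source at hand, a sketch of what a helper with this contract might look like. The real `_as_series` lives in src/plotter.py and may differ in detail; this only mirrors the behaviour described above.

```python
import numpy as np

def as_series(values, length):
    """Coerce values to a float array of exactly `length`, padding with NaN."""
    if values is None:
        return np.full(length, np.nan)
    arr = np.asarray(values, dtype=float)[:length]  # truncate if too long
    if arr.size < length:
        arr = np.concatenate([arr, np.full(length - arr.size, np.nan)])
    return arr
```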
225-316: LGTM - Comprehensive cumulative savings visualization.

The function properly handles dual-axis plotting, combines legends, and provides a summary statistics box. The return value with computed statistics is useful for programmatic access.
train.py (4)
5-11: LGTM!

Import changes are appropriate: `plot_dashboard` and `plot_cumulative_savings` from the new plotter module, and `time` for performance instrumentation.
60-62: LGTM!

New CLI options for dashboard plotting are well-defined with sensible defaults (336 hours = 2 weeks).
298-318: LGTM on dashboard plotting integration.

The exception handling around `plot_dashboard` is appropriate for non-fatal plotting operations. Using `Exception` here is acceptable since you want to catch any plotting failure without crashing training.
218-251: Good implementation of evaluation metrics.

The evaluation output correctly includes per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections), aligning with the coding guidelines.
train_iter.py (2)
84-98: LGTM!

The extended function signature with `plot_dashboard` and `dashboard_hours` parameters is clean and maintains backward compatibility with defaults.
114-122: LGTM on subprocess forwarding.

The subprocess call constructs commands from user-provided CLI arguments, which is appropriate for this local training orchestration tool. The Ruff S603 warning is a false positive in this context.
test/test_sanity_env.py (5)
1-11: Good documentation and usage examples.

The docstrings provide clear usage examples for different test scenarios (quick run, full belt-and-braces, with external data).
111-142: Well-designed determinism test.

The test correctly pins `price_start_index` to ensure external price variation doesn't affect determinism checking. The unused `obs` variable from unpacking (line 126) is a common pattern in RL code and can be safely ignored.
144-177: Excellent carry-over continuity test.

This test directly validates the PR's core objective: carrying state between episodes rather than resetting to zero. The comprehensive snapshot comparison ensures all critical state components (nodes, job_queue, predicted_prices, running_jobs, price_index, etc.) are preserved across resets; the core idea is sketched below.
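The core idea of such a continuity check, as a sketch. The state keys follow the review discussion; the helper names are hypothetical, not the test's actual code.

```python
import numpy as np

def snapshot(env):
    # Copy everything that carry-over mode promises to preserve.
    return {
        "nodes": env.state["nodes"].copy(),
        "job_queue": env.state["job_queue"].copy(),
        "running_jobs": dict(env.running_jobs),
    }

def assert_carry_over(env):
    before = snapshot(env)
    env.reset()  # with carry_over_state=True this must not wipe state
    after = snapshot(env)
    for key, old in before.items():
        new = after[key]
        same = np.array_equal(old, new) if isinstance(old, np.ndarray) else old == new
        assert same, f"carry-over failed: {key} changed across reset()"
```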
377-377: Clarify why `check_obs_is_copy` is commented out.

If there's a known issue with observation aliasing, consider adding a TODO or comment explaining why this check is disabled.
398-406: Good test orchestration structure.

The conditional test execution (gym compliance → invariants → determinism → carry-over) provides flexibility for different testing scenarios.
```python
def reset_state_metrics(self):
    """Reset timeline-dependent metrics."""
    self.current_hour = 0
    self.reset_episode_metrics()
```
`reset_state_metrics` resets `current_hour` but `reset_episode_metrics` does not.

This is the expected behavior for the non-carry-over case, but confirms the issue noted in environment.py: in carry-over mode, calling only `reset_episode_metrics()` will not reset `current_hour`, causing episode boundary detection to fail.

If carry-over mode is intended to span multiple episodes without resetting time, then `current_hour` should be reset separately in `reset()` when `carry_over_state=True`:
🐛 Proposed fix in environment.py reset()

```diff
 def reset(self, seed=None, options=None):
     # ...
     # Reset metrics
     if self.carry_over_state:
+        self.metrics.current_hour = 0  # Reset hour for new episode
         self.metrics.reset_episode_metrics()
     else:
         self.metrics.reset_state_metrics()
```

Force-pushed 2e15d30 to 2c9ade7 (Compare)
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@src/environment.py`:
- Around line 175-177: The constructor calls reset_timeline_state() before
Prices.reset(), so reset_timeline_state() tries to copy
self.prices.predicted_prices while it's still None; fix by invoking
self.prices.reset(start_index=0) before calling self.reset_timeline_state() in
__init__ (same order used in reset()), ensuring predicted_prices is initialized
before reset_timeline_state() uses self.prices.predicted_prices.copy().
In `@src/plotter.py`:
- Around line 81-86: The parameter episode_costs on plot_dashboard is
intentionally unused but triggers Ruff ARG001; silence the warning by explicitly
using it in the function body (e.g., assign it to a throwaway variable or delete
it) so the signature stays the same but the linter treats it as used; update the
body of plot_dashboard to include a line like "_ = episode_costs" (or "del
episode_costs") immediately after the docstring.
In `@test/test_sanity_env.py`:
- Around line 118-137: In the rollout() function and other places where you
unpack but don't use values (e.g., the env.reset and env.step returns and
variables like obs, r, info, nodes, prices), prefix those unused variable names
with an underscore (e.g., _obs, _r, _info, _nodes, _prices) so the linter RUF059
is satisfied; update every unpacking site (including the env.reset(...) return
and env.step(...) tuples in rollout and the other similar blocks referenced) to
use the underscore-prefixed names consistently.
- Around line 166-176: Update the carry-over test to expect metrics.current_hour
to be reset to 0 after calling ComputeClusterEnv.reset() instead of matching the
saved snapshot; specifically modify the assertion that references
env.metrics.current_hour (in test/test_sanity_env.py) so it asserts
env.metrics.current_hour == 0 (reflecting the reset behavior in
ComputeClusterEnv.reset) rather than comparing to snapshot["current_hour"].
♻️ Duplicate comments (1)
src/plotter.py (1)
89-179: Use `env.metrics` for costs, job counts, and time-series (current code can crash or show zeros).

`plot_dashboard()` still reads `env.jobs_completed`, `env.total_cost`, `env.price_stats`, etc., but these live on `env.metrics`. This will either raise `AttributeError` or render empty panels. Please route all metric reads through `metrics`.

🛠️ Proposed fix

```diff
-    base_cost = float(getattr(env, "baseline_cost", 0.0))
-    base_cost_off = float(getattr(env, "baseline_cost_off", 0.0))
-    agent_cost = float(getattr(env, "total_cost", 0.0))
+    base_cost = float(getattr(metrics, "baseline_cost", 0.0))
+    base_cost_off = float(getattr(metrics, "baseline_cost_off", 0.0))
+    agent_cost = float(getattr(metrics, "total_cost", 0.0))
@@
-        f"Jobs: {env.jobs_completed}/{env.jobs_submitted} ({completion_rate:.0f}%, wait={avg_wait:.1f}h, Q={env.max_queue_size_reached}) | "
-        f"Base: {env.baseline_jobs_completed}/{env.baseline_jobs_submitted} ({baseline_completion_rate:.0f}%, wait={baseline_avg_wait:.1f}h, Q={env.baseline_max_queue_size_reached})"
+        f"Jobs: {getattr(metrics, 'jobs_completed', 0)}/{getattr(metrics, 'jobs_submitted', 0)} "
+        f"({completion_rate:.0f}%, wait={avg_wait:.1f}h, Q={getattr(metrics, 'max_queue_size_reached', 0)}) | "
+        f"Base: {getattr(metrics, 'baseline_jobs_completed', 0)}/{getattr(metrics, 'baseline_jobs_submitted', 0)} "
+        f"({baseline_completion_rate:.0f}%, wait={baseline_avg_wait:.1f}h, Q={getattr(metrics, 'baseline_max_queue_size_reached', 0)})"
@@
-    add_panel("Electricity price", getattr(env, "price_stats", None), "€/MWh", None)
+    add_panel("Electricity price", getattr(metrics, "price_stats", None), "€/MWh", None)
@@
-    add_panel("Online nodes", getattr(env, "on_nodes", None), "count", (0, max_nodes * 1.1))
+    add_panel("Online nodes", getattr(metrics, "on_nodes", None), "count", (0, max_nodes * 1.1))
@@
-    add_panel("Used nodes", getattr(env, "used_nodes", None), "count", (0, max_nodes))
+    add_panel("Used nodes", getattr(metrics, "used_nodes", None), "count", (0, max_nodes))
@@
-    add_panel("Job queue & running jobs",
-              getattr(env, "job_queue_sizes", None),
+    add_panel("Job queue & running jobs",
+              getattr(metrics, "job_queue_sizes", None),
               "jobs", None,
               overlay=("Running jobs", running_series),
     )
@@
-    add_panel("Efficiency reward (%)", getattr(env, "eff_rewards", None), "score", None)
+    add_panel("Efficiency reward (%)", getattr(metrics, "eff_rewards", None), "score", None)
@@
-    add_panel("Price reward (%)", getattr(env, "price_rewards", None), "score", None)
+    add_panel("Price reward (%)", getattr(metrics, "price_rewards", None), "score", None)
@@
-    add_panel("Idle penalty (%)", getattr(env, "idle_penalties", None), "score", None)
+    add_panel("Idle penalty (%)", getattr(metrics, "idle_penalties", None), "score", None)
@@
-    add_panel("Job-age penalty (%)", getattr(env, "job_age_penalties", None), "score", None)
+    add_panel("Job-age penalty (%)", getattr(metrics, "job_age_penalties", None), "score", None)
@@
-    add_panel("Total reward", getattr(env, "rewards", None), "reward", None)
+    add_panel("Total reward", getattr(metrics, "rewards", None), "reward", None)
```
🧹 Nitpick comments (1)
src/plotter.py (1)
305-308: Minor readability: avoid single-letter `l` in list comprehension.

Ruff flags this as ambiguous (E741).

✏️ Suggested tweak

```diff
-    labels = [l.get_label() for l in lines]
+    labels = [line.get_label() for line in lines]
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/baseline.py, src/environment.py, src/job_management.py, src/metrics_tracker.py, src/plotter.py, test/test_sanity_env.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/baseline.py
- src/job_management.py
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
src/environment.py, src/plotter.py, src/metrics_tracker.py
🧬 Code graph analysis (1)
src/environment.py (3)
src/metrics_tracker.py (2)
reset_episode_metrics (43-80), reset_state_metrics (38-41)
test/test_sanity_env.py (1)
reset (321-327)
src/prices_deterministic.py (1)
reset(52-76)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
126-126: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
150-150: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
281-281: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
281-281: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
341-341: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
342-342: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
342-342: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
344-344: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
345-345: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
345-345: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
383-383: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
383-383: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
307-307: Ambiguous variable name: l
(E741)
```python
self.reset_timeline_state()
self.metrics.reset_episode_metrics()
```
🧩 Analysis chain
🏁 Scripts executed (output lengths noted inline):

```shell
# First, find the Prices class and check __init__
rg -n "class Prices" src/ -A 20 | head -50                 # output length: 2426
# Also check the environment.py file at the mentioned lines
sed -n '160,190p' src/environment.py                       # output length: 1410
# And check lines 268-277 as mentioned
sed -n '260,285p' src/environment.py                       # output length: 1050
# Now let's look at the reset_timeline_state implementation
rg -n "def reset_timeline_state" src/environment.py -A 10  # output length: 577
# Search for reset method in Prices class
rg -n "def reset" src/prices_deterministic.py -A 15        # output length: 796
# Check the full environment.__init__ to see the complete sequence
rg -n "def __init__" src/environment.py -A 80 | head -120  # output length: 3707
# Also check if Prices.reset() is called anywhere in environment.__init__
rg -n "self.prices.reset" src/environment.py               # output length: 183
# Get full context around these critical lines
sed -n '115,250p' src/environment.py | cat -n              # output length: 6999
```

Repository: FairRootGroup/powersched
`reset_timeline_state()` attempts to copy `self.prices.predicted_prices` while it is still `None`.

In `__init__` at line 175, `reset_timeline_state()` is called before `Prices.reset()` is invoked. However, `Prices.__init__` initializes `predicted_prices = None` (line 19), and it only gets populated when `Prices.reset()` is called. This causes `AttributeError: 'NoneType' object has no attribute 'copy'` at line 277 when trying to execute `self.prices.predicted_prices.copy()`.

Call `self.prices.reset(start_index=0)` before `self.reset_timeline_state()` in `__init__`, matching the pattern used in the `reset()` method (lines 120–123).
🤖 Prompt for AI Agents
In `@src/environment.py` around lines 175 - 177, The constructor calls
reset_timeline_state() before Prices.reset(), so reset_timeline_state() tries to
copy self.prices.predicted_prices while it's still None; fix by invoking
self.prices.reset(start_index=0) before calling self.reset_timeline_state() in
__init__ (same order used in reset()), ensuring predicted_prices is initialized
before reset_timeline_state() uses self.prices.predicted_prices.copy().
```python
def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
    """
    Per-hour dashboard: price, nodes, queue, reward components, etc.
    NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
    Cumulative savings now lives in plot_cumulative_savings().
    """
```
Silence Ruff’s unused-parameter warning for episode_costs.
The parameter is kept for backwards compatibility but is unused, so Ruff (ARG001) will complain. Mark it intentionally unused to keep lint clean without breaking the signature.
🧹 Proposed fix
```diff
 def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
     """
     Per-hour dashboard: price, nodes, queue, reward components, etc.
     NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
     Cumulative savings now lives in plot_cumulative_savings().
     """
+    _ = episode_costs  # backward-compat placeholder
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
    """
    Per-hour dashboard: price, nodes, queue, reward components, etc.
    NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
    Cumulative savings now lives in plot_cumulative_savings().
    """
    _ = episode_costs  # backward-compat placeholder
```
🧰 Tools
🪛 Ruff (0.14.11)
81-81: Unused function argument: episode_costs
(ARG001)
🤖 Prompt for AI Agents
In `@src/plotter.py` around lines 81 - 86, The parameter episode_costs on
plot_dashboard is intentionally unused but triggers Ruff ARG001; silence the
warning by explicitly using it in the function body (e.g., assign it to a
throwaway variable or delete it) so the signature stays the same but the linter
treats it as used; update the body of plot_dashboard to include a line like "_ =
episode_costs" (or "del episode_costs") immediately after the docstring.
```python
def rollout():
    env = make_env()
    # Pin external price window so determinism doesn't vary by episode.
    obs, _ = env.reset(seed=seed, options={"price_start_index": 0})
    traj = []
    done = False
    i = 0
    while not done and i < n_steps:
        obs, r, term, trunc, info = env.step(actions[i])
        # record a small fingerprint
        traj.append((
            float(r),
            float(info.get("step_cost", 0.0)),
            int(info.get("num_unprocessed_jobs", -1)),
            int(info.get("num_on_nodes", -1)),
            int(info.get("dropped_this_episode", -1)),
        ))
        done = term or trunc
        i += 1
    env.close()
```
Ruff RUF059: unused unpacked variables.
Several unpacked values aren’t used (e.g., obs, r, info, nodes, prices). Prefix them with _ to keep lint clean.
🧹 Suggested clean‑up
```diff
-        obs, r, term, trunc, info = env.step(actions[i])
+        _obs, r, term, trunc, info = env.step(actions[i])
@@
-        obs, r, term, trunc, info = env.step(action)
+        _obs, _r, term, trunc, _info = env.step(action)
@@
-    nodes, q, prices = _extract(obs)
+    _nodes, q, _prices = _extract(obs)
@@
-    o1, _ = env.reset(seed=seed)
+    _o1, _ = env.reset(seed=seed)
-    o1s, r1, t1, tr1, i1 = env.step(action)
+    o1s, r1, _t1, _tr1, i1 = env.step(action)
@@
-    o2, _ = env.reset(seed=seed)
+    _o2, _ = env.reset(seed=seed)
-    o2s, r2, t2, tr2, i2 = env.step(action)
+    o2s, r2, _t2, _tr2, i2 = env.step(action)
@@
-        obs, r, term, trunc, info = env.step(action)
+        obs, _r, term, trunc, _info = env.step(action)
```

Also applies to: 145-152, 281-283, 341-346, 383-384
🧰 Tools
🪛 Ruff (0.14.11)
126-126: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 118 - 137, In the rollout() function
and other places where you unpack but don't use values (e.g., the env.reset and
env.step returns and variables like obs, r, info, nodes, prices), prefix those
unused variable names with an underscore (e.g., _obs, _r, _info, _nodes,
_prices) so the linter RUF059 is satisfied; update every unpacking site
(including the env.reset(...) return and env.step(...) tuples in rollout and the
other similar blocks referenced) to use the underscore-prefixed names
consistently.
Force-pushed 2b0533d to adf7f7c (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/test_sanity_env.py`:
- Around line 376-378: The test was commented out because observations alias
env.state; fix by returning an isolated copy of the environment state in both
reset() and step() so callers cannot mutate internal state. Locate the reset()
and step() methods (they currently return self.state directly) and change them
to return a defensive copy (e.g., use copy.deepcopy(self.state) or an
appropriate shallow copy if state is a simple dict/ndarray) for the observation
value; ensure any other places that currently expose self.state (observation
construction paths) also return copies so check_obs_is_copy will pass.
♻️ Duplicate comments (4)
src/plotter.py (1)
81-86: Silence the unused `episode_costs` parameter to avoid lint noise.

Ruff flags ARG001 here; the placeholder line keeps the API stable without warnings.

♻️ Proposed fix

```diff
 def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
     """
     Per-hour dashboard: price, nodes, queue, reward components, etc.
     NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
     Cumulative savings now lives in plot_cumulative_savings().
     """
+    _ = episode_costs  # backward-compat placeholder
```

test/test_sanity_env.py (2)
166-177: Verify intended behavior: should `current_hour` persist or reset in carry-over mode?

Line 176 asserts that `env.metrics.current_hour` equals the snapshotted value after `reset()`. However, semantically, `current_hour` tracks hours within an episode and is used for episode termination (line 495 in environment.py: `if self.metrics.current_hour == EPISODE_HOURS`). If `current_hour` should reset to 0 per episode (even in carry-over mode), update the assertion:

```shell
#!/bin/bash
# Check what reset_episode_metrics does to current_hour
rg -n "current_hour" src/metrics_tracker.py -B2 -A2
```
144-152: Prefix unused unpacked variables with `_` to satisfy linter.

Multiple locations have unused unpacked variables flagged by Ruff RUF059 (lines 150, 281, 341-345, 383). Prefix them with `_` for lint compliance.

src/environment.py (1)
175-177: `reset_timeline_state()` is called before `self.prices.reset()`, causing `AttributeError`.

At line 175, `reset_timeline_state()` is invoked, which at line 276 attempts `self.prices.predicted_prices.copy()`. However, `Prices.__init__` initializes `predicted_prices = None`, and it only gets populated when `Prices.reset()` is called. Since no `self.prices.reset()` call precedes line 175, this will raise `AttributeError: 'NoneType' object has no attribute 'copy'`.

🐛 Proposed fix

```diff
+        self.prices.reset(start_index=0)
         self.reset_timeline_state()
         self.metrics.reset_episode_metrics()
```
🧹 Nitpick comments (3)
src/plotter.py (1)
306-308: Avoid ambiguous variable name `l`.

This trips Ruff E741; a clearer name improves readability.

♻️ Proposed refactor

```diff
-    labels = [l.get_label() for l in lines]
+    labels = [line.get_label() for line in lines]
```

train.py (1)
259-265: Avoid hard-coded `max_nodes=335` for dashboard plots.

Use env/config values so plots stay accurate if cluster size changes.

♻️ Proposed refactor

```diff
-        max_nodes=335,
+        max_nodes=getattr(env, "max_nodes", getattr(env, "num_nodes", 0)),
```

Also applies to: 300-305
test/test_sanity_env.py (1)
336-359: Consider removing or guarding the debug comparison block.

Lines 336-359 appear to be development/debug code that runs unconditionally, printing comparison results. Consider either:
- Removing it if no longer needed
- Guarding it behind a `--verbose` or `--debug` flag

♻️ Suggested guard

```diff
+    if args.render == "human":  # or add --debug flag
     seed = 123
     action = np.array([1, 0], dtype=np.int64)
     ...
     print("info.current_price", i1.get("current_price"), i2.get("current_price"))
```
```python
check_invariants(env, obs)
#check_obs_is_copy(env, obs)
```
🧩 Analysis chain
🏁 Scripts executed (output lengths noted inline):

```shell
# First, let's locate and read the test file around the commented line
find . -name "test_sanity_env.py" -type f                                    # output length: 93
# Search for the check_obs_is_copy function definition
rg -n "def check_obs_is_copy" -A 10                                          # output length: 750
# Search for all usages and references to check_obs_is_copy
rg -n "check_obs_is_copy" -B 2 -A 2                                          # output length: 658
# Search for environment files and how observations are returned
fd -name "*environment*" -o -name "*env*" | grep -E "\.(py)$" | head -20     # output length: 298
# Read the test file around line 376-378 with more context
sed -n '360,385p' test/test_sanity_env.py                                    # output length: 1026
# Search for environment implementations
find . -name "*.py" -type f | xargs grep -l "def reset\|def step" | grep -v test | head -10   # output length: 165
# Look for how observations are created/returned
rg -n "def reset\|def step" --type py | grep -v test | head -20              # output length: 50
# Search for observation creation in the environment
rg -n "observation\|obs.*=" --type py -i | grep -v test | head -30           # output length: 50
# Read the environment implementation - look for reset and step methods
grep -n "def reset\|def step" src/environment.py                             # output length: 189
# Get the full reset and step implementations
sed -n '1,50p' src/environment.py                                            # output length: 1496
# Look for observation-related code in environment
rg -n "observation\|return.*obs\|self.obs" src/environment.py -A 2 | head -50   # output length: 50
# Check the actual structure - read the environment file
wc -l src/environment.py                                                     # output length: 90
# Read the reset method implementation
sed -n '207,266p' src/environment.py                                         # output length: 2674
# Read the step method implementation
sed -n '296,350p' src/environment.py                                         # output length: 3394
# Check if there's any .copy() or deepcopy usage
rg -n "\.copy\(\)|deepcopy" src/environment.py                               # output length: 217
# Read the step method return statement
sed -n '296,400p' src/environment.py | tail -100                             # output length: 6201
# Search for the full step method - get line count up to next method or end
sed -n '296,$p' src/environment.py | head -120                               # output length: 7332
# Check if step returns self.state directly
rg -n "return.*state" src/environment.py -A 1 -B 1                           # output length: 191
# Look for any comments or git history around line 377 in test file
git log -p --all -S "check_obs_is_copy" -- test/test_sanity_env.py 2>/dev/null | head -100    # output length: 50
```

Repository: FairRootGroup/powersched
Uncomment and fix the observation aliasing bug: observations must not alias `env.state`.

The `check_obs_is_copy` function is commented out because it fails: the environment returns direct references to `self.state` instead of copies (see `reset()` line 265 and `step()` line 529). This allows users to corrupt internal state by modifying the returned observation.

Return a properly isolated copy of state from both `reset()` and `step()` methods; a minimal sketch follows.
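A minimal sketch of the defensive-copy pattern, assuming the state is a dict of NumPy arrays as the review's diffs suggest; `copy_obs` is a hypothetical name, not the project's actual helper.

```python
import copy
import numpy as np

def copy_obs(state):
    # Fast .copy() for arrays; deepcopy for anything nested.
    return {k: (v.copy() if isinstance(v, np.ndarray) else copy.deepcopy(v))
            for k, v in state.items()}

state = {"nodes": np.zeros(4, dtype=np.int32)}
obs = copy_obs(state)
obs["nodes"][0] = 99            # caller mutates its observation...
assert state["nodes"][0] == 0   # ...internal state is unaffected
```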
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 376 - 378, The test was commented out
because observations alias env.state; fix by returning an isolated copy of the
environment state in both reset() and step() so callers cannot mutate internal
state. Locate the reset() and step() methods (they currently return self.state
directly) and change them to return a defensive copy (e.g., use
copy.deepcopy(self.state) or an appropriate shallow copy if state is a simple
dict/ndarray) for the observation value; ensure any other places that currently
expose self.state (observation construction paths) also return copies so
check_obs_is_copy will pass.
Force-pushed adf7f7c to 9a8b7f6 (Compare)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around line 306-307: Rename the ambiguous loop variable `l` in the list
comprehension that builds `labels` to a clearer identifier (e.g., `line` or
`item`) to satisfy Ruff E741; locate the expression where `lines` is created
from `line1`, `line1b`, `line2`, `line2b` and change the comprehension
`[l.get_label() for l in lines]` to use the clearer name so it reads like
`[line.get_label() for line in lines]` (or similar) ensuring callers of `labels`
remain unaffected.
- Around line 119-124: The header f-string currently references deprecated
top-level attributes (env.session, env.current_episode, env.current_step,
env.weights, env.jobs_completed, env.jobs_submitted, env.max_queue_size_reached,
env.baseline_jobs_completed, env.baseline_jobs_submitted,
env.baseline_max_queue_size_reached) and should be switched to read from
env.metrics with safe fallbacks; update the header construction (the variable
named header) to pull values from env.metrics (e.g. env.metrics.get('session'),
.get('current_episode'), .get('current_step'), .get('weights'),
.get('jobs_completed'), .get('jobs_submitted'), .get('max_queue_size_reached'),
.get('baseline_jobs_completed'), .get('baseline_jobs_submitted'),
.get('baseline_max_queue_size_reached')) and use sensible defaults when keys are
missing to avoid AttributeError or stale values, keeping the same formatting and
percentage calculations but sourcing all counters/averages from env.metrics.
♻️ Duplicate comments (7)
src/plotter.py (1)
81-86: Silence unused `episode_costs` to keep Ruff clean.

The parameter is intentionally unused but still triggers ARG001.

🧹 Proposed fix

```diff
 def plot_dashboard(env, num_hours, max_nodes, episode_costs=None, save=True, show=True, suffix=""):
     """
     Per-hour dashboard: price, nodes, queue, reward components, etc.
     NOTE: episode_costs is accepted for backwards compatibility but NOT used here anymore.
     Cumulative savings now lives in plot_cumulative_savings().
     """
+    _ = episode_costs  # backward-compat placeholder
```

src/environment.py (3)
175-177: `reset_timeline_state()` reads `predicted_prices` before prices are reset.

If `Prices.__init__` doesn't populate `predicted_prices`, this raises at line 276. Initialize prices before copying.

🐛 Proposed fix

```diff
-        self.reset_timeline_state()
+        self.prices.reset(start_index=0)
+        self.reset_timeline_state()
         self.metrics.reset_episode_metrics()
```
218-221: Carry-over mode doesn't reset `current_hour`, so episodes won't terminate after the first.

`step()` only truncates when `current_hour == EPISODE_HOURS`. Without resetting, future episodes never end.

🐛 Proposed fix

```diff
         if self.carry_over_state:
+            self.metrics.current_hour = 0  # reset per-episode hour
             self.metrics.reset_episode_metrics()
         else:
             self.metrics.reset_state_metrics()
```
265-265: Return defensive copies of observations to avoid state aliasing.

Returning `self.state` directly allows callers to mutate internal state.

🐛 Proposed fix

```diff
+    def _copy_obs(self):
+        return {k: v.copy() for k, v in self.state.items()}
 ...
-        return self.state, {}
+        return self._copy_obs(), {}
 ...
-        return self.state, step_reward, terminated, truncated, info
+        return self._copy_obs(), step_reward, terminated, truncated, info
```

Also applies to: 529-529
test/test_sanity_env.py (3)
118-137: Prefix unused unpacked variables with `_` to satisfy Ruff RUF059.

🧹 Suggested clean-up

```diff
-        _obs, r, term, trunc, info = env.step(actions[i])
+        _obs, r, term, trunc, _info = env.step(actions[i])
```

Also applies to: 145-152, 281-283, 341-346, 383-384
166-176: Carry-over test should expect `current_hour` to reset per episode.

This aligns with the episode-length termination logic.

🧪 Proposed fix

```diff
-    assert env.metrics.current_hour == snapshot["current_hour"], "carry-over failed: current_hour reset"
+    assert env.metrics.current_hour == 0, "carry-over failed: current_hour should reset per episode"
```
371-378: Re‑enable the copy‑safety check once observations are defensive copies.
Right now it’s commented out, so the aliasing bug can regress silently.
src/plotter.py (Outdated)
```python
header = (
    f"{env.session} | ep:{env.current_episode} step:{env.current_step} | {env.weights}\n"
    f"Cost: €{agent_cost:.0f}, Base: €{base_cost:.0f} (+{base_cost - agent_cost:.0f}, {pct_vs_base:.1f}%), "
    f"Base_Off: €{base_cost_off:.0f} (+{base_cost_off - agent_cost:.0f}, {pct_vs_base_off:.1f}%)\n"
    f"Jobs: {env.jobs_completed}/{env.jobs_submitted} ({completion_rate:.0f}%, wait={avg_wait:.1f}h, Q={env.max_queue_size_reached}) | "
    f"Base: {env.baseline_jobs_completed}/{env.baseline_jobs_submitted} ({baseline_completion_rate:.0f}%, wait={baseline_avg_wait:.1f}h, Q={env.baseline_max_queue_size_reached})"
)
```
Header still reads env.* counters that are now tracked under env.metrics.
This can raise AttributeError or show stale values depending on how metrics are surfaced. Prefer metrics (with safe fallbacks) consistently.
🐛 Proposed fix
```diff
-    header = (
+    jobs_completed = getattr(metrics, "jobs_completed", getattr(env, "jobs_completed", 0))
+    jobs_submitted = getattr(metrics, "jobs_submitted", getattr(env, "jobs_submitted", 0))
+    max_q = getattr(metrics, "max_queue_size_reached", getattr(env, "max_queue_size_reached", 0))
+    baseline_jobs_completed = getattr(metrics, "baseline_jobs_completed", getattr(env, "baseline_jobs_completed", 0))
+    baseline_jobs_submitted = getattr(metrics, "baseline_jobs_submitted", getattr(env, "baseline_jobs_submitted", 0))
+    baseline_max_q = getattr(metrics, "baseline_max_queue_size_reached", getattr(env, "baseline_max_queue_size_reached", 0))
+
+    header = (
         f"{env.session} | ep:{env.current_episode} step:{env.current_step} | {env.weights}\n"
         f"Cost: €{agent_cost:.0f}, Base: €{base_cost:.0f} (+{base_cost - agent_cost:.0f}, {pct_vs_base:.1f}%), "
         f"Base_Off: €{base_cost_off:.0f} (+{base_cost_off - agent_cost:.0f}, {pct_vs_base_off:.1f}%)\n"
-        f"Jobs: {env.jobs_completed}/{env.jobs_submitted} ({completion_rate:.0f}%, wait={avg_wait:.1f}h, Q={env.max_queue_size_reached}) | "
-        f"Base: {env.baseline_jobs_completed}/{env.baseline_jobs_submitted} ({baseline_completion_rate:.0f}%, wait={baseline_avg_wait:.1f}h, Q={env.baseline_max_queue_size_reached})"
+        f"Jobs: {jobs_completed}/{jobs_submitted} ({completion_rate:.0f}%, wait={avg_wait:.1f}h, Q={max_q}) | "
+        f"Base: {baseline_jobs_completed}/{baseline_jobs_submitted} ({baseline_completion_rate:.0f}%, wait={baseline_avg_wait:.1f}h, Q={baseline_max_q})"
+    )
```
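As a side note on the suggested pattern, the nested getattr fallback behaves like this in isolation (the class names here are invented for the demo):

```python
class Metrics:                 # new home for the counters
    jobs_completed = 42

class LegacyEnv:               # may or may not still carry the old attribute
    metrics = Metrics()

env = LegacyEnv()
# Prefer the tracker, fall back to the legacy env attribute, then to 0.
jobs_completed = getattr(env.metrics, "jobs_completed",
                         getattr(env, "jobs_completed", 0))
print(jobs_completed)          # 42
```

Note that the inner getattr is evaluated eagerly even when the outer lookup succeeds, which is why it needs its own safe default of 0.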
In `@src/plotter.py` around lines 119 - 124, The header f-string currently
references deprecated top-level attributes (env.session, env.current_episode,
env.current_step, env.weights, env.jobs_completed, env.jobs_submitted,
env.max_queue_size_reached, env.baseline_jobs_completed,
env.baseline_jobs_submitted, env.baseline_max_queue_size_reached) and should be
switched to read from env.metrics with safe fallbacks; update the header
construction (the variable named header) to pull values from env.metrics (e.g.
env.metrics.get('session'), .get('current_episode'), .get('current_step'),
.get('weights'), .get('jobs_completed'), .get('jobs_submitted'),
.get('max_queue_size_reached'), .get('baseline_jobs_completed'),
.get('baseline_jobs_submitted'), .get('baseline_max_queue_size_reached')) and
use sensible defaults when keys are missing to avoid AttributeError or stale
values, keeping the same formatting and percentage calculations but sourcing all
counters/averages from env.metrics.
```python
lines = line1 + line1b + line2 + line2b
labels = [l.get_label() for l in lines]
```
Rename ambiguous l to avoid Ruff E741.
🧹 Proposed fix
```diff
-    labels = [l.get_label() for l in lines]
+    labels = [line.get_label() for line in lines]
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
lines = line1 + line1b + line2 + line2b
labels = [line.get_label() for line in lines]
```
🧰 Tools
🪛 Ruff (0.14.13)
307-307: Ambiguous variable name: l
(E741)
🤖 Prompt for AI Agents
In `@src/plotter.py` around lines 306-307: rename the ambiguous loop variable `l` in the list comprehension that builds `labels` to a clearer identifier (e.g., `line` or `item`) to satisfy Ruff E741; locate the expression where `lines` is created from `line1`, `line1b`, `line2`, `line2b` and change the comprehension `[l.get_label() for l in lines]` to use the clearer name so it reads like `[line.get_label() for line in lines]` (or similar), ensuring callers of `labels` remain unaffected.
Force-pushed from 9a8b7f6 to a7f9921 (Compare)
… now checks for determinism and carry-over continuity. Add carry-over state handling in the refactored environment by initializing timeline state on construction and skipping resets when enabled, while separating cumulative from episode-scoped metrics. Update MetricsTracker to track episode_* counters, propagate those counters through job assignment and baseline steps, and ensure episode summaries use per-episode values. Extend sanity tests and metrics usage to align with the refactor (metrics.current_hour, carry-over continuity checks).
Force-pushed from 212475f to 5399354 (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/metrics_tracker.py`:
- Around lines 15-36: the reset_cumulative_metrics method resets numeric counters but does not clear the episode_costs history; modify reset_cumulative_metrics (in the class where it is defined) to also clear the cumulative history by resetting self.episode_costs to an empty list (or calling self.episode_costs.clear()), and similarly clear any other cumulative history lists if present (e.g., baseline_episode_costs) to avoid mixing old episode history with fresh counters.
♻️ Duplicate comments (3)
src/environment.py (2)
175-176: Reset order still risks predicted_prices being None.
reset_timeline_state() copies self.prices.predicted_prices, but Prices.reset() isn't called in __init__ before this, so predicted_prices may still be None. Consider initializing prices first.

🐛 Proposed fix

```diff
-        self.reset_timeline_state()
+        self.prices.reset(start_index=0)
+        self.reset_timeline_state()
```
267-267: Return defensive copies of observations.
reset() and step() still return self.state directly, so callers can mutate env internals. Please return copies.

🛡️ Proposed fix

```diff
-        return self.state, {}
+        obs = {k: (v.copy() if hasattr(v, "copy") else v) for k, v in self.state.items()}
+        return obs, {}
@@
-        return self.state, step_reward, terminated, truncated, info
+        obs = {k: (v.copy() if hasattr(v, "copy") else v) for k, v in self.state.items()}
+        return obs, step_reward, terminated, truncated, info
```

Also applies to: 532-532
test/test_sanity_env.py (1)
144-152: Rename unused step outputs to satisfy RUF059.
obs, r, and info aren't used in the loop. Prefix with _ to keep Ruff clean.

🧹 Proposed fix

```diff
-    obs, _ = env.reset(seed=seed)
+    _obs, _ = env.reset(seed=seed)
@@
-        obs, r, term, trunc, info = env.step(action)
+        _obs, _r, term, trunc, _info = env.step(action)
```
```diff
 def reset_cumulative_metrics(self):
     """Reset metrics that persist across episodes."""
-    # Job tracking metrics for agent (cumulative across episodes)
+    # Cost tracking (cumulative across episodes)
     self.total_cost = 0
     self.baseline_cost = 0
     self.baseline_cost_off = 0

+    # Agent job metrics (cumulative across episodes)
     self.jobs_submitted = 0
     self.jobs_completed = 0
     self.total_job_wait_time = 0
     self.max_queue_size_reached = 0
     self.jobs_dropped = 0
     self.jobs_rejected_queue_full = 0

-    # Job tracking metrics for baseline (cumulative across episodes)
+    # Baseline job metrics (cumulative across episodes)
     self.baseline_jobs_submitted = 0
     self.baseline_jobs_completed = 0
     self.baseline_total_job_wait_time = 0
     self.baseline_max_queue_size_reached = 0
     self.baseline_jobs_dropped = 0
     self.baseline_jobs_rejected_queue_full = 0
```
Reset cumulative history list alongside counters.
reset_cumulative_metrics() resets totals but leaves episode_costs intact, which can mix old episode history with fresh counters if this method is reused mid-run. Consider clearing it here.
🧹 Proposed fix
```diff
 def reset_cumulative_metrics(self):
     """Reset metrics that persist across episodes."""
@@
     self.baseline_jobs_rejected_queue_full = 0
+    self.episode_costs = []
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    def reset_cumulative_metrics(self):
        """Reset metrics that persist across episodes."""
        # Cost tracking (cumulative across episodes)
        self.total_cost = 0
        self.baseline_cost = 0
        self.baseline_cost_off = 0

        # Agent job metrics (cumulative across episodes)
        self.jobs_submitted = 0
        self.jobs_completed = 0
        self.total_job_wait_time = 0
        self.max_queue_size_reached = 0
        self.jobs_dropped = 0
        self.jobs_rejected_queue_full = 0

        # Baseline job metrics (cumulative across episodes)
        self.baseline_jobs_submitted = 0
        self.baseline_jobs_completed = 0
        self.baseline_total_job_wait_time = 0
        self.baseline_max_queue_size_reached = 0
        self.baseline_jobs_dropped = 0
        self.baseline_jobs_rejected_queue_full = 0
        self.episode_costs = []
```
🤖 Prompt for AI Agents
In `@src/metrics_tracker.py` around lines 15-36: the reset_cumulative_metrics method resets numeric counters but does not clear the episode_costs history; modify reset_cumulative_metrics (in the class where it is defined) to also clear the cumulative history by resetting self.episode_costs to an empty list (or calling self.episode_costs.clear()), and similarly clear any other cumulative history lists if present (e.g., baseline_episode_costs) to avoid mixing old episode history with fresh counters.
Note: not tested "in production" yet, but continuity checks between consecutive episodes have passed so far.
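For what it's worth, a continuity check of that kind could be sketched as follows; make_env is a hypothetical factory, the dict-of-arrays state follows the diffs above, and the final assertion assumes the per-episode clock reset proposed earlier in this review:

```python
def check_carry_over_continuity(make_env, episodes=3):
    """Run several episodes and assert state survives reset() in carry-over mode."""
    env = make_env(carry_over_state=True)    # hypothetical factory
    env.reset(seed=0)
    for _ in range(episodes):
        terminated = truncated = False
        while not (terminated or truncated):
            _, _, terminated, truncated, _ = env.step(env.action_space.sample())
        snapshot = {k: v.copy() for k, v in env.state.items()}
        env.reset()
        for k, v in env.state.items():       # nodes/queue/prices carried over
            assert (v == snapshot[k]).all(), f"state[{k!r}] was wiped by reset()"
        assert env.metrics.current_hour == 0, "episode clock should restart"
```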