Feature/test nan by paulf81 · Pull Request #223 · NatLabRockies/hercules

paulf81 · 2026-03-03T18:22:23Z

PR #222 proposed to fill powers but per discussion this might not be expected behavior from users. Prefer that user's decide for themselves what do with NaNs. Still, user's may be unaware of NaNs in data and what Hercules should do with them when they are in certain key signals is ambiguous. For example, if one turbine in 100 has a NaN value, the total power for that time step is also NaN.

This PR address the issue by throwing an error:

Wind speeds input to WindFarm are NaN
Wind turbine powers input to WindFarmSCADAPower are NaN

A test for each new condition is added.

Copilot

Pull request overview

This PR changes WindFarm and WindFarmSCADAPower initialization behavior to fail fast when key input signals contain NaN values, avoiding ambiguous downstream NaN propagation in aggregate power calculations.

Changes:

Add NaN validation for wind input files used by WindFarm and raise ValueError when required wind columns contain NaNs.
Add NaN validation for SCADA power input files used by WindFarmSCADAPower and raise ValueError when required power columns contain NaNs.
Add/extend pytest coverage to assert these new error conditions.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
`hercules/plant_components/wind_farm.py`	Adds NaN checks on key wind input columns during initialization.
`hercules/plant_components/wind_farm_scada_power.py`	Adds NaN checks on key SCADA columns during initialization.
`tests/wind_farm_direct_test.py`	Adds a test asserting WindFarm raises when wind input contains NaNs.
`tests/wind_farm_scada_power_test.py`	Adds a test asserting WindFarmSCADAPower raises when power columns contain NaNs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-03T18:25:35Z

hercules/plant_components/wind_farm.py

+        if df_wi[nan_check_cols].isna().any().any():
+            raise ValueError(
+                "wind input file contains NaN values in required columns (time_utc, wd_mean, ws_*)"


The NaN validation only checks ws_* columns, but this class also supports wind input files that provide wind speeds via ws_mean when per-turbine ws_### columns are absent. As written, NaNs in ws_mean will not be caught and can still propagate into the simulation. Consider including ws_mean in nan_check_cols (at least when ws_000 is not present) so the new behavior covers all supported wind-speed inputs.

Suggested change

if df_wi[nan_check_cols].isna().any().any():

raise ValueError(

"wind input file contains NaN values in required columns (time_utc, wd_mean, ws_*)"

# If no per-turbine wind speed columns are present, fall back to validating ws_mean

if not ws_cols and "ws_mean" in df_wi.columns and "ws_mean" not in nan_check_cols:

nan_check_cols.append("ws_mean")

if df_wi[nan_check_cols].isna().any().any():

raise ValueError(

"wind input file contains NaN values in required columns (time_utc, wd_mean, ws_*, ws_mean)"

Copilot · 2026-03-03T18:25:36Z

hercules/plant_components/wind_farm_scada_power.py

+        # Check key columns for NaN values
+        pow_cols = sorted(col for col in df_scada.columns if col.startswith("pow_"))


pow_cols can be empty (e.g., malformed SCADA file with no pow_* columns). In that case initialization later will fail with an IndexError when indexing self.power_columns[0] to infer rated power. Since this block is already validating the SCADA schema, consider explicitly erroring when no pow_* columns are present (and ideally mention which required columns are missing).

Suggested change

# Check key columns for NaN values

pow_cols = sorted(col for col in df_scada.columns if col.startswith("pow_"))

# Identify power columns and check key columns for NaN values

pow_cols = sorted(col for col in df_scada.columns if col.startswith("pow_"))

if not pow_cols:

raise ValueError(

"SCADA file must contain at least one power column with name starting with 'pow_'."

)

Copilot · 2026-03-03T18:25:36Z

tests/wind_farm_direct_test.py

+            WindFarm(test_h_dict)
+    finally:
+        if os.path.exists(temp_wind_file):
+            os.unlink(temp_wind_file)


This new NaN test covers NaNs in per-turbine ws_### columns, but WindFarm also supports the ws_mean-only input format (see existing tests in dynamic/precomputed modes). If the intent is to reject NaNs in wind speeds generally, add a test case where ws_mean is present (and ws_### columns are absent) and contains NaN, asserting the same ValueError is raised.

Suggested change

os.unlink(temp_wind_file)

os.unlink(temp_wind_file)

def test_wind_farm_raises_on_nan_in_wind_input_ws_mean_only():

"""Test that WindFarm raises ValueError when ws_mean-only wind input contains NaN values."""

wind_data = {

"time_utc": [

"2018-05-10 12:31:00",

"2018-05-10 12:31:01",

"2018-05-10 12:31:02",

"2018-05-10 12:31:03",

"2018-05-10 12:31:04",

"2018-05-10 12:31:05",

"2018-05-10 12:31:06",

"2018-05-10 12:31:07",

"2018-05-10 12:31:08",

"2018-05-10 12:31:09",

"2018-05-10 12:31:10",

],

"wd_mean": [180.5, 185.2, 190.8, 175.3, 170.1, 165.7, 160.4, 155.9, 150.2, 145.6, 140.3],

# ws_mean-only input format (no per-turbine ws_### columns), with a NaN value

"ws_mean": [8.2, np.nan, 7.8, 6.5, 10.2, 11.5, 9.8, 8.7, 7.3, 6.9, 8.4],

}

with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:

pd.DataFrame(wind_data).to_csv(f.name, index=False)

temp_wind_file = f.name

try:

test_h_dict = copy.deepcopy(h_dict_wind_direct)

test_h_dict["wind_farm"]["wind_input_filename"] = temp_wind_file

with pytest.raises(ValueError, match="wind input file contains NaN values"):

WindFarm(test_h_dict)

finally:

if os.path.exists(temp_wind_file):

os.unlink(temp_wind_file)

I think it's ok as is

paulf81 added 3 commits March 3, 2026 11:18

test for nan in power

639daf0

test for ws nans

e6db33d

Add tests

3fd9869

paulf81 requested review from Copilot, genevievestarke and misi9170 March 3, 2026 18:22

paulf81 self-assigned this Mar 3, 2026

paulf81 mentioned this pull request Mar 3, 2026

Fill NaN values in WindFarmSCADAPower #222

Closed

Copilot started reviewing on behalf of paulf81 March 3, 2026 18:22 View session

Copilot AI reviewed Mar 3, 2026

View reviewed changes

Add checks

5e39f5a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/test nan#223

Feature/test nan#223
paulf81 wants to merge 4 commits intoNatLabRockies:developfrom
paulf81:feature/test_nan

paulf81 commented Mar 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 3, 2026

Uh oh!

paulf81 Mar 3, 2026

Uh oh!

Copilot AI Mar 3, 2026

Uh oh!

paulf81 Mar 3, 2026

Uh oh!

Copilot AI Mar 3, 2026

Uh oh!

paulf81 Mar 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Check key columns for NaN values
		pow_cols = sorted(col for col in df_scada.columns if col.startswith("pow_"))

-            os.unlink(temp_wind_file)
+            os.unlink(temp_wind_file)
+def test_wind_farm_raises_on_nan_in_wind_input_ws_mean_only():
+    """Test that WindFarm raises ValueError when ws_mean-only wind input contains NaN values."""
+    wind_data = {
+        "time_utc": [
+            "2018-05-10 12:31:00",
+            "2018-05-10 12:31:01",
+            "2018-05-10 12:31:02",
+            "2018-05-10 12:31:03",
+            "2018-05-10 12:31:04",
+            "2018-05-10 12:31:05",
+            "2018-05-10 12:31:06",
+            "2018-05-10 12:31:07",
+            "2018-05-10 12:31:08",
+            "2018-05-10 12:31:09",
+            "2018-05-10 12:31:10",
+        ],
+        "wd_mean": [180.5, 185.2, 190.8, 175.3, 170.1, 165.7, 160.4, 155.9, 150.2, 145.6, 140.3],
+        # ws_mean-only input format (no per-turbine ws_### columns), with a NaN value
+        "ws_mean": [8.2, np.nan, 7.8, 6.5, 10.2, 11.5, 9.8, 8.7, 7.3, 6.9, 8.4],
+    }
+    with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as f:
+        pd.DataFrame(wind_data).to_csv(f.name, index=False)
+        temp_wind_file = f.name
+    try:
+        test_h_dict = copy.deepcopy(h_dict_wind_direct)
+        test_h_dict["wind_farm"]["wind_input_filename"] = temp_wind_file
+        with pytest.raises(ValueError, match="wind input file contains NaN values"):
+            WindFarm(test_h_dict)
+    finally:
+        if os.path.exists(temp_wind_file):
+            os.unlink(temp_wind_file)

Conversation

paulf81 commented Mar 3, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

paulf81 Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

paulf81 Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

paulf81 Mar 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants