26 commits:
- `06fe5fe` initial benchmarking and inference layer (RishikeshRanade, Apr 2, 2026)
- `ba7ed71` adding visualization layer and improving readme (RishikeshRanade, Apr 2, 2026)
- `35adf9d` adding line plot visualization (RishikeshRanade, Apr 2, 2026)
- `480655c` fixing issues with visualization and merging workflows (RishikeshRanade, Apr 2, 2026)
- `d6915d0` refactoring nim evaluation (RishikeshRanade, Apr 3, 2026)
- `37e48ed` adding headers (RishikeshRanade, Apr 3, 2026)
- `08166df` adding caching capability and updating docstrings (RishikeshRanade, Apr 3, 2026)
- `f3ba6d5` refactoring code (RishikeshRanade, Apr 6, 2026)
- `336f3c1` adding distributed calculation and cleaning up (RishikeshRanade, Apr 6, 2026)
- `c84c911` Merge pull request #2 from RishikeshRanade/visualization-layer (RishikeshRanade, Apr 6, 2026)
- `15296e3` renaming example and adding matrix evaluation configs (RishikeshRanade, Apr 6, 2026)
- `fca40cd` Revise README for model evaluation and benchmarking (ram-cherukuri, Apr 9, 2026)
- `e8769ad` Revise README for OOB Benchmarking section (ram-cherukuri, Apr 10, 2026)
- `4d4703b` domain-scoped metrics, aggregate volume visual, and naming cleanup (ktangsali, Apr 10, 2026)
- `362ee52` Revise README for clarity and customization options (ram-cherukuri, Apr 10, 2026)
- `cb8f8ac` Update README for benchmarking workflow sections (ram-cherukuri, Apr 10, 2026)
- `d0c37f4` update api (ktangsali, Apr 10, 2026)
- `0124bee` remove xmgn and fgnet volume, because they don't exist (ktangsali, Apr 10, 2026)
- `eb0be99` add notebooks after validation (ktangsali, Apr 10, 2026)
- `be2d3f6` add last notebook (ktangsali, Apr 10, 2026)
- `a4d49a7` use pnemo functionals for knn (ktangsali, Apr 14, 2026)
- `d9948fe` add deprecation notice (ktangsali, Apr 14, 2026)
- `f172a8a` add files for DrivAerML (ktangsali, Apr 14, 2026)
- `b8a3e6e` cleaning up readme, adding ci tests and contributing details (RishikeshRanade, Apr 14, 2026)
- `2dd3806` add tutorial notebook on adding a dataset adaptor (ktangsali, Apr 17, 2026)
- `101f476` add notebook showing adding of a new model (ktangsali, Apr 21, 2026)
`.cursor/skills/create-dataset-adapter/SKILL.md` (new file, 170 additions)
---
name: create-dataset-adapter
description: >-
Create a new dataset adapter for the PhysicsNeMo CFD benchmarking workflow.
Use when the user wants to add a new CFD dataset, write a DatasetAdapter,
integrate a new mesh format, or benchmark models on custom data.
---

# Create a Dataset Adapter

Guide the user through adding a new CFD dataset to the benchmarking workflow by writing a `DatasetAdapter` subclass.

## Reference files to read first

Before starting, read these files for context:

- `physicsnemo/cfd/evaluation/datasets/adapter_registry.py` — base class and registry
- `physicsnemo/cfd/evaluation/datasets/schema.py` — `CanonicalCase` and `build_predictions_dict`
- `physicsnemo/cfd/evaluation/datasets/adapters/drivaerml.py` — reference adapter implementation
- `workflows/benchmarking_workflow/notebooks/adding_a_new_dataset.ipynb` — end-to-end tutorial

## Step 1: Explore the new dataset

Ask the user for the dataset path, then inspect one file:

```python
import pyvista as pv
mesh = pv.read("<path_to_one_file>")
print(f"Type: {type(mesh).__name__}, Points: {mesh.n_points}, Cells: {mesh.n_cells}")
print(f"Cell arrays: {list(mesh.cell_data.keys())}")
print(f"Point arrays: {list(mesh.point_data.keys())}")
```

Identify these differences from the canonical schema:

| Question | What to look for |
|----------|-----------------|
| File format | `.vtp`, `.vtu`, `.vtk`, or other? Model wrappers expect `.vtp` (surface) or `.vtu` (volume) XML format. |
| Directory layout | Flat directory? Nested `run_<id>/` dirs? How are case IDs derived from filenames? |
| Pressure field name | The canonical key is `pressure`. What is the VTK array name? |
| WSS field name | The canonical key is `shear_stress` (N, 3). Is it a single vector or separate scalar components? |
| Sign conventions | Compare field ranges with DrivAerML. Are normals, WSS, or pressure flipped? |
| Extra arrays | Are there explicit `Normals` or `Area` arrays? DrivAerML has none — remove them if present. |
| STL files | Are separate STL geometry files available? If not, the surface mesh itself is the geometry. |
| Inference domain | Surface (`.vtp`) or volume (`.vtu`)? |

## Step 2: Write the adapter class

Subclass `DatasetAdapter` with these methods:

```python
from pathlib import Path
from physicsnemo.cfd.evaluation.datasets.adapter_registry import DatasetAdapter, register_adapter
from physicsnemo.cfd.evaluation.datasets.schema import CanonicalCase

class MyDatasetAdapter(DatasetAdapter):
    def __init__(self, root: str, **kwargs):
        self._root = Path(root)

    @classmethod
    def inference_domain_from_kwargs(cls, kwargs=None):
        return "surface"  # or "volume"

    def list_cases(self, split=None):
        # Return list of case ID strings
        ...

    def load_case(self, case_id: str) -> CanonicalCase:
        # 1. Read the mesh file
        # 2. Build ground_truth dict with canonical keys:
        #    - "pressure": np.float32 array
        #    - "shear_stress": np.float32 array of shape (N, 3)
        #    For volume: "pressure", "velocity" (N, 3), "turbulent_viscosity"
        # 3. Return CanonicalCase(case_id, mesh_path, mesh_type, ground_truth, inference_domain)
        ...
```
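
If Step 1 revealed a flat directory of `<case_id>.vtp` files, the two abstract methods can be sketched as standalone helpers. The flat layout and the raw array names (`p`, `WSSx/WSSy/WSSz`) are assumptions for illustration, not part of the workflow:

```python
from pathlib import Path

import numpy as np


# Hypothetical helpers for a flat directory of "<case_id>.vtp" files.
# Substitute the array names Step 1 actually revealed in your dataset.
def list_cases_flat(root: Path) -> list[str]:
    """Derive case IDs from the file stems of all .vtp files under root."""
    return sorted(p.stem for p in root.glob("*.vtp"))


def ground_truth_from_arrays(p, wss_x, wss_y, wss_z) -> dict[str, np.ndarray]:
    """Map raw solver arrays onto the canonical keys."""
    return {
        "pressure": np.asarray(p, dtype=np.float32),
        "shear_stress": np.stack([wss_x, wss_y, wss_z], axis=1).astype(np.float32),
    }
```

Inside a real adapter these would be `list_cases` and part of `load_case`, with `self._root` in place of the `root` argument.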

### Common transformations in `load_case`

**Format conversion** (legacy `.vtk` → `.vtp`):
```python
mesh = pv.read(vtk_path).extract_surface()
mesh.save(vtp_path)
```

**Combining separate WSS scalars into a vector:**
```python
wss = np.stack([mesh.cell_data["WSSx"], mesh.cell_data["WSSy"], mesh.cell_data["WSSz"]], axis=1)
```

**Removing explicit Normals/Area** (DrivAerML convention):
```python
for key in ["Normals", "Area"]:
    if key in mesh.cell_data:
        del mesh.cell_data[key]
```

**Creating STL from surface mesh** (when no STL is shipped):
```python
mesh.extract_surface().triangulate().save(stl_path)
```

The STL must be named `drivaer_{int(case_id)}.stl` in the same directory as the VTP for the model wrappers to find it.

### Caching pattern

Do expensive conversions lazily and cache:

```python
def _prepare_case(self, case_id):
    prepared_path = self._root / "_prepared" / f"{case_id}.vtp"
    if not prepared_path.exists():
        prepared_path.parent.mkdir(parents=True, exist_ok=True)
        # ... convert and save to prepared_path
    return str(prepared_path)
```

## Step 3: Register and test

```python
register_adapter("my_dataset", MyDatasetAdapter)

adapter = MyDatasetAdapter(root="/path/to/data")
cases = adapter.list_cases()
case = adapter.load_case(cases[0])
assert case.ground_truth is not None
assert "pressure" in case.ground_truth
```

## Step 4: Run inference and benchmark

Build a config and run:

```python
from physicsnemo.cfd.evaluation.config import Config
from physicsnemo.cfd.evaluation.benchmarks.engine import run_benchmark

config = Config.from_dict({
    "run": {"device": "cuda:0", "output_dir": "results"},
    "model": {"name": "<model_name>", "inference_domain": "<surface|volume>", ...},
    "dataset": {"name": "my_dataset", "root": "/path/to/data", "case_ids": cases[:2]},
    "output": {
        "ground_truth_mesh_field_names": {"pressure": "<vtk_gt_name>", "shear_stress": "<vtk_gt_name>"},
        "mesh_field_names": {"pressure": "<vtk_pred_name>", "shear_stress": "<vtk_pred_name>"},
    },
    "metrics": ["l2_pressure", "l2_shear_stress", "drag", "lift"],
    "reports": {"enabled": False},
})
results = run_benchmark(config)
```

## Step 5: Make permanent (optional)

Save the adapter to `physicsnemo/cfd/evaluation/datasets/adapters/<name>.py` and register in `adapters/__init__.py`:

```python
from physicsnemo.cfd.evaluation.datasets.adapters.<name> import MyDatasetAdapter
register_adapter("my_dataset", MyDatasetAdapter)
```

## Why conventions must match the training data

The field name mappings, sign conventions, and format conversions in the adapter exist because the model checkpoint was trained on a specific dataset (e.g., DrivAerML) with specific conventions. The adapter bridges the gap between the new dataset's conventions and the training data's conventions — not some abstract standard. If a model is retrained directly on the new dataset, the adapter would not need these transformations. When writing an adapter, always ask: "What conventions did the model's training data use?" and map to those.
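
As a toy illustration: if the new dataset stored pressure with the opposite sign and under a different array name than the training data, the mapping inside `load_case` would undo both. The names and the sign flip below are hypothetical; apply only the transformations Step 1 actually revealed:

```python
import numpy as np


# Toy convention-mapping sketch. "static_pressure" and "wall_shear" are
# hypothetical raw array names, and the sign flip is illustrative only.
def to_training_conventions(raw: dict) -> dict:
    return {
        # rename: dataset's "static_pressure" -> canonical "pressure",
        # negated to match the (assumed) training-data sign convention
        "pressure": -np.asarray(raw["static_pressure"], dtype=np.float32),
        "shear_stress": np.asarray(raw["wall_shear"], dtype=np.float32),
    }
```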

## Gotchas

- **DistributedManager**: Model wrappers call `DistributedManager.initialize()`. In notebooks without `torchrun`, set env vars first: `WORLD_SIZE=1`, `RANK=0`, `LOCAL_RANK=0`, `MASTER_ADDR=localhost`, `MASTER_PORT=12355`.
- **STL naming**: DoMINO looks for `drivaer_{tag}.stl`, GeoTransolver looks for `drivaer_{tag}_single_solid.stl` then `*.stl`. Both now fall back to any `*.stl` in the directory.
- **VTP vs VTK**: Model wrappers use VTK XML readers internally. Legacy `.vtk` files must be converted to `.vtp`/`.vtu`.
- **Checkpoint loading**: Some wrappers need `trusted_torch_load_context()` for PyTorch 2.6+ checkpoint compatibility.
- **Domain-scoped metrics**: `l2_pressure` resolves to different implementations for surface vs volume based on `inference_domain`. Use the same metric name for both.
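
For the first gotcha, the single-process environment can be set up at the top of a notebook, before importing any wrapper:

```python
import os

# Single-process defaults so DistributedManager.initialize() succeeds
# without torchrun. Set these before importing any model wrapper.
os.environ["WORLD_SIZE"] = "1"
os.environ["RANK"] = "0"
os.environ["LOCAL_RANK"] = "0"
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "12355"
```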
`.cursor/skills/create-model-wrapper/SKILL.md` (new file, 228 additions)
---
name: create-model-wrapper
description: >-
Create a new model wrapper for the PhysicsNeMo CFD benchmarking workflow.
Use when the user wants to add a new CFD model, write a CFDModel wrapper,
integrate a new neural network architecture, or run a custom model through
the benchmarking pipeline.
---

# Create a Model Wrapper

Guide the user through adding a new CFD model to the benchmarking workflow by writing a `CFDModel` subclass.

## Reference files to read first

Before starting, read these files for context:

- `physicsnemo/cfd/evaluation/inference/model_registry.py` — base class and registry
- `physicsnemo/cfd/evaluation/datasets/schema.py` — `CanonicalCase`, `predictions_dict`, `build_predictions_dict`
- `physicsnemo/cfd/evaluation/inference/wrappers/surface_baseline.py` — simplest concrete wrapper
- `physicsnemo/cfd/evaluation/inference/wrappers/__init__.py` — how wrappers are registered
- `physicsnemo/cfd/evaluation/common/io.py` — mesh loading and normalization stats helpers
- `workflows/benchmarking_workflow/notebooks/adding_a_new_model.ipynb` — end-to-end tutorial

## The `CFDModel` interface

Every wrapper must set two class variables and implement four methods:

| Member | Purpose |
|--------|---------|
| `INFERENCE_DOMAIN` | `"surface"` or `"volume"` — which mesh manifold |
| `OUTPUT_LOCATION` | `"point"` or `"cell"` — where predictions live on the mesh |
| `output_location` (property) | Instance-level access to `OUTPUT_LOCATION` |
| `load(checkpoint_path, stats_path, device, **kwargs)` | Load weights and stats; return `self` |
| `prepare_inputs(case: CanonicalCase)` | Convert canonical case into model-specific tensors/graphs |
| `predict(model_input)` | Run forward pass; return raw output |
| `decode_outputs(raw_output, case)` | Denormalize and map to canonical predictions dict |

The engine calls `load` once, then `prepare_inputs → predict → decode_outputs` per case.
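
Conceptually, that call order can be sketched as a plain loop. This is an illustration of the contract only, not the engine's actual code:

```python
# Conceptual sketch of the engine's call order (not the real engine).
def run_cases(wrapper, adapter, case_ids, checkpoint_path, stats_path, device):
    wrapper.load(checkpoint_path, stats_path, device)  # called once
    predictions = {}
    for case_id in case_ids:                           # then per case:
        case = adapter.load_case(case_id)
        model_input = wrapper.prepare_inputs(case)     # canonical -> model-specific
        raw_output = wrapper.predict(model_input)      # forward pass
        predictions[case_id] = wrapper.decode_outputs(raw_output, case)
    return predictions
```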

## Step 1: Write the wrapper class

```python
import numpy as np
import torch
import pyvista as pv
from typing import Any, ClassVar

from physicsnemo.cfd.evaluation.inference.model_registry import (
    CFDModel, register_model, OutputLocation,
)
from physicsnemo.cfd.evaluation.datasets.schema import (
    CanonicalCase, InferenceDomain, predictions_dict,
)
from physicsnemo.cfd.evaluation.inference.progress import log_inference


class MyModelWrapper(CFDModel):
    INFERENCE_DOMAIN: ClassVar[InferenceDomain] = "surface"  # or "volume"
    OUTPUT_LOCATION: ClassVar[OutputLocation] = "cell"  # or "point"

    def __init__(self) -> None:
        self._model = None
        self._stats = None
        self._device = "cpu"

    @property
    def output_location(self) -> OutputLocation:
        return self.OUTPUT_LOCATION

    def load(self, checkpoint_path, stats_path, device, **kwargs):
        self._device = device
        # Load your model architecture + weights
        # self._model = ...
        # Load normalization stats if needed
        # self._stats = ...
        log_inference("my_model", f"Loaded from {checkpoint_path}")
        return self

    def prepare_inputs(self, case: CanonicalCase):
        mesh = pv.read(case.mesh_path)
        if not isinstance(mesh, pv.PolyData):
            mesh = mesh.extract_surface()
        # Extract coordinates and build model-specific input
        # (tensors, graphs, point clouds, etc.)
        coords = np.array(mesh.cell_centers().points, dtype=np.float32)
        return torch.tensor(coords, device=self._device)

    def predict(self, model_input):
        # Run forward pass through your model
        with torch.no_grad():
            raw_output = self._model(model_input)
        return raw_output

    def decode_outputs(self, raw_output, case):
        # Denormalize if needed, then return canonical dict
        # For surface models:
        return predictions_dict(
            pressure=raw_output["pressure"].cpu().numpy(),
            shear_stress=raw_output["shear_stress"].cpu().numpy(),
        )
        # For volume models, use build_predictions_dict:
        # return build_predictions_dict(
        #     pressure=..., velocity=..., turbulent_viscosity=...
        # )
```

### Key implementation considerations

**Normalization**: Most trained models normalize inputs/outputs. Load stats from `stats_path` in `load()` and denormalize in `decode_outputs()`. See `physicsnemo/cfd/evaluation/common/io.py` for `load_global_stats()` and related helpers.
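
A minimal denormalization helper might look like the following, assuming stats shaped like the `global_stats.json` format shown in Step 2:

```python
import numpy as np


# Minimal denormalization sketch; assumes stats shaped like
# {"mean": {"pressure": [0.0]}, "std_dev": {"pressure": [1.0]}}.
def denormalize(field: np.ndarray, stats: dict, name: str) -> np.ndarray:
    mean = np.asarray(stats["mean"][name], dtype=np.float32)
    std = np.asarray(stats["std_dev"][name], dtype=np.float32)
    return field * std + mean  # broadcasts over (N,) or (N, 3) fields
```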

**Batching**: For large meshes, `prepare_inputs` may need to subsample or batch. Use `kwargs` passed through `load()` (e.g., `batch_resolution`, `geometry_sampling`) to control this.

**Output shape**: `pressure` must be `(N,)` float32. `shear_stress` must be `(N, 3)` float32 for surface. Volume fields: `velocity` is `(N, 3)`, `turbulent_viscosity` is `(N,)`.

**Output location**: If `OUTPUT_LOCATION = "cell"`, return N = `mesh.n_cells` values. If `"point"`, return N = `mesh.n_points` values.

## Step 2: Create checkpoint and stats files

Your model needs a checkpoint file and optionally a `global_stats.json`:

```python
# Checkpoint: save your model's state dict
torch.save(model.state_dict(), "checkpoint.pt")

# Stats: JSON with mean/std_dev for denormalization
# Surface format:
{
    "mean": {"pressure": [0.0], "shear_stress": [0.0, 0.0, 0.0]},
    "std_dev": {"pressure": [1.0], "shear_stress": [1.0, 1.0, 1.0]}
}
# Volume format:
{
    "mean": {"pressure": [0.0], "velocity": [0.0, 0.0, 0.0], "turbulent_viscosity": [0.0]},
    "std_dev": {"pressure": [1.0], "velocity": [1.0, 1.0, 1.0], "turbulent_viscosity": [1.0]}
}
```
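
As a sketch, the surface-format stats can be produced with a plain `json.dump`. The identity values here are placeholders; in practice, compute the mean and standard deviation over your training split:

```python
import json

# Placeholder stats in the surface format; replace the identity values
# with statistics computed over your training split.
stats = {
    "mean": {"pressure": [0.0], "shear_stress": [0.0, 0.0, 0.0]},
    "std_dev": {"pressure": [1.0], "shear_stress": [1.0, 1.0, 1.0]},
}
with open("global_stats.json", "w") as f:
    json.dump(stats, f, indent=2)
```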

## Step 3: Register and test

```python
register_model("my_model", MyModelWrapper)

# Load a case from any registered dataset adapter
from physicsnemo.cfd.evaluation.datasets.adapters.drivaerml import DrivAerMLAdapter
adapter = DrivAerMLAdapter(root="/path/to/data", inference_domain="surface")
case = adapter.load_case(adapter.list_cases()[0])

# Run the full inference pipeline
wrapper = MyModelWrapper()
wrapper.load(checkpoint_path="checkpoint.pt", stats_path="global_stats.json", device="cuda:0")
model_input = wrapper.prepare_inputs(case)
raw_output = wrapper.predict(model_input)
predictions = wrapper.decode_outputs(raw_output, case)

assert "pressure" in predictions
assert predictions["pressure"].shape[0] > 0
```

## Step 4: Run the full benchmark

```python
from physicsnemo.cfd.evaluation.config import Config
from physicsnemo.cfd.evaluation.benchmarks.engine import run_benchmark

config = Config.from_dict({
    "run": {"device": "cuda:0", "output_dir": "results", "metrics_cache": {"enabled": False}},
    "benchmark": {
        "mode": "matrix",
        "models": [{
            "name": "my_model",
            "inference_domain": "surface",
            "checkpoint": "/path/to/checkpoint.pt",
            "stats_path": "/path/to/global_stats.json",
            "kwargs": {},
        }],
        "datasets": [{
            "name": "drivaerml",
            "root": "/path/to/drivaerml/data",
            "case_ids": ["run_1", "run_11"],
            "kwargs": {"align_ground_truth_to_model": True, "inference_domain": "surface"},
        }],
        "reproducibility": {"log_env": False, "save_artifacts": True},
    },
    "output": {"mesh_field_names": {"pressure": "pMeanTrimPred", "shear_stress": "wallShearStressMeanTrimPred"}},
    "metrics": ["l2_pressure", "l2_shear_stress", "l2_pressure_area_weighted", "drag", "lift"],
    "reports": {"enabled": False},
})
results = run_benchmark(config)
```

Results are written to `benchmark_results.json` (a JSON list of dicts, one per model×dataset combo).
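
A small helper makes the list-of-dicts shape explicit. The helper name is illustrative, not part of the workflow API, and the path is assumed to follow the config's `output_dir`:

```python
import json


def load_benchmark_report(path: str) -> list:
    """benchmark_results.json is a plain list[dict], one per model x dataset combo."""
    with open(path) as f:
        report = json.load(f)
    # Guard against the common mistake of expecting {"results": [...]}
    if not isinstance(report, list):
        raise TypeError("expected a JSON list, got " + type(report).__name__)
    return report
```

Iterate directly over the returned list, e.g. `for combo in load_benchmark_report("results/benchmark_results.json"): ...`.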

## Step 5: Visualize predictions

```python
from physicsnemo.cfd.postprocessing_tools.visualization.utils import plot_fields, plot_field_comparisons

# Just the predicted fields (no GT comparison):
plotter = plot_fields(mesh, fields=["pMeanTrimPred"], view="xy", dtype="cell", window_size=[1800, 600])
plotter.screenshot("predicted_pressure.png")
plotter.close()

# Side-by-side with GT (GT | Pred | Error):
plotter = plot_field_comparisons(
    mesh, true_fields=["pMeanTrim"], pred_fields=["pMeanTrimPred"],
    view="xy", dtype="cell", window_size=[1800, 600],
)
plotter.screenshot("comparison.png")
plotter.close()
```

## Step 6: Make permanent (optional)

Save the wrapper to `physicsnemo/cfd/evaluation/inference/wrappers/my_model.py` and register in `wrappers/__init__.py`:

```python
from physicsnemo.cfd.evaluation.inference.wrappers.my_model import MyModelWrapper
register_model("my_model", MyModelWrapper)
```

Then use `model.name: my_model` in any YAML config.

## Gotchas

- **DistributedManager**: Model wrappers may call `DistributedManager.initialize()`. In notebooks without `torchrun`, set env vars first: `WORLD_SIZE=1`, `RANK=0`, `LOCAL_RANK=0`, `MASTER_ADDR=localhost`, `MASTER_PORT=12355`.
- **`weights_only=True`**: Use this flag with `torch.load()` for safe deserialization (PyTorch 2.6+ default).
- **Domain matching**: The engine checks that `model.INFERENCE_DOMAIN` matches the dataset adapter's `inference_domain_from_kwargs()`. Mismatches are skipped in matrix mode or raise in single mode.
- **GT alignment**: When `align_ground_truth_to_model: true` in dataset kwargs, the engine converts GT data to match `OUTPUT_LOCATION` (point ↔ cell). This is automatic — the wrapper just needs correct class vars.
- **Results JSON format**: `benchmark_results.json` is a plain `list[dict]`, not `{"results": [...]}`. Iterate directly: `for combo in report:`.