26 commits:
- `06fe5fe` initial benchmarking and inference layer (RishikeshRanade, Apr 2, 2026)
- `ba7ed71` adding visualization layer and improving readme (RishikeshRanade, Apr 2, 2026)
- `35adf9d` adding line plot visualization (RishikeshRanade, Apr 2, 2026)
- `480655c` fixing issues with visualization and merging workflows (RishikeshRanade, Apr 2, 2026)
- `d6915d0` refactoring nim evaluation (RishikeshRanade, Apr 3, 2026)
- `37e48ed` adding headers (RishikeshRanade, Apr 3, 2026)
- `08166df` adding caching capability and updating docstrings (RishikeshRanade, Apr 3, 2026)
- `f3ba6d5` refactoring code (RishikeshRanade, Apr 6, 2026)
- `336f3c1` adding distributed calculation and cleaning up (RishikeshRanade, Apr 6, 2026)
- `c84c911` Merge pull request #2 from RishikeshRanade/visualization-layer (RishikeshRanade, Apr 6, 2026)
- `15296e3` renaming example and adding matrix evaluation configs (RishikeshRanade, Apr 6, 2026)
- `fca40cd` Revise README for model evaluation and benchmarking (ram-cherukuri, Apr 9, 2026)
- `e8769ad` Revise README for OOB Benchmarking section (ram-cherukuri, Apr 10, 2026)
- `4d4703b` domain-scoped metrics, aggregate volume visual, and naming cleanup (ktangsali, Apr 10, 2026)
- `362ee52` Revise README for clarity and customization options (ram-cherukuri, Apr 10, 2026)
- `cb8f8ac` Update README for benchmarking workflow sections (ram-cherukuri, Apr 10, 2026)
- `d0c37f4` update api (ktangsali, Apr 10, 2026)
- `0124bee` remove xmgn and fgnet volume, because they don't exist (ktangsali, Apr 10, 2026)
- `eb0be99` add notebooks after validation (ktangsali, Apr 10, 2026)
- `be2d3f6` add last notebook (ktangsali, Apr 10, 2026)
- `a4d49a7` use pnemo functionals for knn (ktangsali, Apr 14, 2026)
- `d9948fe` add deprecation notice (ktangsali, Apr 14, 2026)
- `f172a8a` add files for DrivAerML (ktangsali, Apr 14, 2026)
- `b8a3e6e` cleaning up readme, adding ci tests and contributing details (RishikeshRanade, Apr 14, 2026)
- `2dd3806` add tutorial notebook on adding a dataset adaptor (ktangsali, Apr 17, 2026)
- `101f476` add notebook showing adding of a new model (ktangsali, Apr 21, 2026)
`.cursor/skills/create-dataset-adapter/SKILL.md` (new file, 170 additions)
---
name: create-dataset-adapter
description: >-
Create a new dataset adapter for the PhysicsNeMo CFD benchmarking workflow.
Use when the user wants to add a new CFD dataset, write a DatasetAdapter,
integrate a new mesh format, or benchmark models on custom data.
---

# Create a Dataset Adapter

Guide the user through adding a new CFD dataset to the benchmarking workflow by writing a `DatasetAdapter` subclass.

## Reference files to read first

Before starting, read these files for context:

- `physicsnemo/cfd/evaluation/datasets/adapter_registry.py` — base class and registry
- `physicsnemo/cfd/evaluation/datasets/schema.py` — `CanonicalCase` and `build_predictions_dict`
- `physicsnemo/cfd/evaluation/datasets/adapters/drivaerml.py` — reference adapter implementation
- `workflows/benchmarking_workflow/notebooks/adding_a_new_dataset.ipynb` — end-to-end tutorial

## Step 1: Explore the new dataset

Ask the user for the dataset path, then inspect one file:

```python
import pyvista as pv
mesh = pv.read("<path_to_one_file>")
print(f"Type: {type(mesh).__name__}, Points: {mesh.n_points}, Cells: {mesh.n_cells}")
print(f"Cell arrays: {list(mesh.cell_data.keys())}")
print(f"Point arrays: {list(mesh.point_data.keys())}")
```

Identify these differences from the canonical schema:

| Question | What to look for |
|----------|-----------------|
| File format | `.vtp`, `.vtu`, `.vtk`, or other? Model wrappers expect `.vtp` (surface) or `.vtu` (volume) XML format. |
| Directory layout | Flat directory? Nested `run_<id>/` dirs? How are case IDs derived from filenames? |
| Pressure field name | The canonical key is `pressure`. What is the VTK array name? |
| WSS field name | The canonical key is `shear_stress` (N, 3). Is it a single vector or separate scalar components? |
| Sign conventions | Compare field ranges with DrivAerML. Are normals, WSS, or pressure flipped? |
| Extra arrays | Are there explicit `Normals` or `Area` arrays? DrivAerML has none — remove them if present. |
| STL files | Are separate STL geometry files available? If not, the surface mesh itself is the geometry. |
| Inference domain | Surface (`.vtp`) or volume (`.vtu`)? |

## Step 2: Write the adapter class

Subclass `DatasetAdapter` with these methods:

```python
from pathlib import Path
from physicsnemo.cfd.evaluation.datasets.adapter_registry import DatasetAdapter, register_adapter
from physicsnemo.cfd.evaluation.datasets.schema import CanonicalCase

class MyDatasetAdapter(DatasetAdapter):
    def __init__(self, root: str, **kwargs):
        self._root = Path(root)

    @classmethod
    def inference_domain_from_kwargs(cls, kwargs=None):
        return "surface"  # or "volume"

    def list_cases(self, split=None):
        # Return list of case ID strings
        ...

    def load_case(self, case_id: str) -> CanonicalCase:
        # 1. Read the mesh file
        # 2. Build ground_truth dict with canonical keys:
        #    - "pressure": np.float32 array
        #    - "shear_stress": np.float32 array of shape (N, 3)
        #    For volume: "pressure", "velocity" (N, 3), "turbulent_viscosity"
        # 3. Return CanonicalCase(case_id, mesh_path, mesh_type, ground_truth, inference_domain)
        ...
```
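
If Step 1 revealed a flat directory of `<case_id>.vtp` files, the two abstract methods can be sketched as standalone helpers. The flat layout and the raw array names (`p`, `WSSx/WSSy/WSSz`) are assumptions for illustration, not part of the workflow:

```python
from pathlib import Path

import numpy as np


# Hypothetical helpers for a flat directory of "<case_id>.vtp" files.
# Substitute the array names Step 1 actually revealed in your dataset.
def list_cases_flat(root: Path) -> list[str]:
    """Derive case IDs from the file stems of all .vtp files under root."""
    return sorted(p.stem for p in root.glob("*.vtp"))


def ground_truth_from_arrays(p, wss_x, wss_y, wss_z) -> dict[str, np.ndarray]:
    """Map raw solver arrays onto the canonical keys."""
    return {
        "pressure": np.asarray(p, dtype=np.float32),
        "shear_stress": np.stack([wss_x, wss_y, wss_z], axis=1).astype(np.float32),
    }
```

Inside a real adapter these would be `list_cases` and part of `load_case`, with `self._root` in place of the `root` argument.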

### Common transformations in `load_case`

**Format conversion** (legacy `.vtk` → `.vtp`):
```python
mesh = pv.read(vtk_path).extract_surface()
mesh.save(vtp_path)
```

**Combining separate WSS scalars into a vector:**
```python
wss = np.stack([mesh.cell_data["WSSx"], mesh.cell_data["WSSy"], mesh.cell_data["WSSz"]], axis=1)
```

**Removing explicit Normals/Area** (DrivAerML convention):
```python
for key in ["Normals", "Area"]:
    if key in mesh.cell_data:
        del mesh.cell_data[key]
```

**Creating STL from surface mesh** (when no STL is shipped):
```python
mesh.extract_surface().triangulate().save(stl_path)
```

The STL must be named `drivaer_{int(case_id)}.stl` in the same directory as the VTP for the model wrappers to find it.

### Caching pattern

Do expensive conversions lazily and cache:

```python
def _prepare_case(self, case_id):
    prepared_path = self._root / "_prepared" / f"{case_id}.vtp"
    if not prepared_path.exists():
        prepared_path.parent.mkdir(parents=True, exist_ok=True)
        # ... convert and save to prepared_path
    return str(prepared_path)
```

## Step 3: Register and test

```python
register_adapter("my_dataset", MyDatasetAdapter)

adapter = MyDatasetAdapter(root="/path/to/data")
cases = adapter.list_cases()
case = adapter.load_case(cases[0])
assert case.ground_truth is not None
assert "pressure" in case.ground_truth
```

## Step 4: Run inference and benchmark

Build a config and run:

```python
from physicsnemo.cfd.evaluation.config import Config
from physicsnemo.cfd.evaluation.benchmarks.engine import run_benchmark

config = Config.from_dict({
    "run": {"device": "cuda:0", "output_dir": "results"},
    "model": {"name": "<model_name>", "inference_domain": "<surface|volume>", ...},
    "dataset": {"name": "my_dataset", "root": "/path/to/data", "case_ids": cases[:2]},
    "output": {
        "ground_truth_mesh_field_names": {"pressure": "<vtk_gt_name>", "shear_stress": "<vtk_gt_name>"},
        "mesh_field_names": {"pressure": "<vtk_pred_name>", "shear_stress": "<vtk_pred_name>"},
    },
    "metrics": ["l2_pressure", "l2_shear_stress", "drag", "lift"],
    "reports": {"enabled": False},
})
results = run_benchmark(config)
```

## Step 5: Make permanent (optional)

Save the adapter to `physicsnemo/cfd/evaluation/datasets/adapters/<name>.py` and register in `adapters/__init__.py`:

```python
from physicsnemo.cfd.evaluation.datasets.adapters.<name> import MyDatasetAdapter
register_adapter("my_dataset", MyDatasetAdapter)
```

## Why conventions must match the training data

The field name mappings, sign conventions, and format conversions in the adapter exist because the model checkpoint was trained on a specific dataset (e.g., DrivAerML) with specific conventions. The adapter bridges the gap between the new dataset's conventions and the training data's conventions — not some abstract standard. If a model is retrained directly on the new dataset, the adapter would not need these transformations. When writing an adapter, always ask: "What conventions did the model's training data use?" and map to those.
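
As a toy illustration: if the new dataset stored pressure with the opposite sign and under a different array name than the training data, the mapping inside `load_case` would undo both. The names and the sign flip below are hypothetical; apply only the transformations Step 1 actually revealed:

```python
import numpy as np


# Toy convention-mapping sketch. "static_pressure" and "wall_shear" are
# hypothetical raw array names, and the sign flip is illustrative only.
def to_training_conventions(raw: dict) -> dict:
    return {
        # rename: dataset's "static_pressure" -> canonical "pressure",
        # negated to match the (assumed) training-data sign convention
        "pressure": -np.asarray(raw["static_pressure"], dtype=np.float32),
        "shear_stress": np.asarray(raw["wall_shear"], dtype=np.float32),
    }
```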

## Gotchas

- **DistributedManager**: Model wrappers call `DistributedManager.initialize()`. In notebooks without `torchrun`, set env vars first: `WORLD_SIZE=1`, `RANK=0`, `LOCAL_RANK=0`, `MASTER_ADDR=localhost`, `MASTER_PORT=12355`.
- **STL naming**: DoMINO looks for `drivaer_{tag}.stl`, GeoTransolver looks for `drivaer_{tag}_single_solid.stl` then `*.stl`. Both now fall back to any `*.stl` in the directory.
- **VTP vs VTK**: Model wrappers use VTK XML readers internally. Legacy `.vtk` files must be converted to `.vtp`/`.vtu`.
- **Checkpoint loading**: Some wrappers need `trusted_torch_load_context()` for PyTorch 2.6+ checkpoint compatibility.
- **Domain-scoped metrics**: `l2_pressure` resolves to different implementations for surface vs volume based on `inference_domain`. Use the same metric name for both.
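
For the first gotcha, the single-process environment can be set up at the top of a notebook, before importing any wrapper:

```python
import os

# Single-process defaults so DistributedManager.initialize() succeeds
# without torchrun. Set these before importing any model wrapper.
os.environ["WORLD_SIZE"] = "1"
os.environ["RANK"] = "0"
os.environ["LOCAL_RANK"] = "0"
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "12355"
```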
`.cursor/skills/create-model-wrapper/SKILL.md` (new file, 228 additions)
---
name: create-model-wrapper
description: >-
Create a new model wrapper for the PhysicsNeMo CFD benchmarking workflow.
Use when the user wants to add a new CFD model, write a CFDModel wrapper,
integrate a new neural network architecture, or run a custom model through
the benchmarking pipeline.
---

# Create a Model Wrapper

Guide the user through adding a new CFD model to the benchmarking workflow by writing a `CFDModel` subclass.

## Reference files to read first

Before starting, read these files for context:

- `physicsnemo/cfd/evaluation/inference/model_registry.py` — base class and registry
- `physicsnemo/cfd/evaluation/datasets/schema.py` — `CanonicalCase`, `predictions_dict`, `build_predictions_dict`
- `physicsnemo/cfd/evaluation/inference/wrappers/surface_baseline.py` — simplest concrete wrapper
- `physicsnemo/cfd/evaluation/inference/wrappers/__init__.py` — how wrappers are registered
- `physicsnemo/cfd/evaluation/common/io.py` — mesh loading and normalization stats helpers
- `workflows/benchmarking_workflow/notebooks/adding_a_new_model.ipynb` — end-to-end tutorial

## The `CFDModel` interface

Every wrapper must set two class variables and implement four methods:

| Member | Purpose |
|--------|---------|
| `INFERENCE_DOMAIN` | `"surface"` or `"volume"` — which mesh manifold |
| `OUTPUT_LOCATION` | `"point"` or `"cell"` — where predictions live on the mesh |
| `output_location` (property) | Instance-level access to `OUTPUT_LOCATION` |
| `load(checkpoint_path, stats_path, device, **kwargs)` | Load weights and stats; return `self` |
| `prepare_inputs(case: CanonicalCase)` | Convert canonical case into model-specific tensors/graphs |
| `predict(model_input)` | Run forward pass; return raw output |
| `decode_outputs(raw_output, case)` | Denormalize and map to canonical predictions dict |

The engine calls `load` once, then `prepare_inputs → predict → decode_outputs` per case.
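
Conceptually, that call order can be sketched as a plain loop. This is an illustration of the contract only, not the engine's actual code:

```python
# Conceptual sketch of the engine's call order (not the real engine).
def run_cases(wrapper, adapter, case_ids, checkpoint_path, stats_path, device):
    wrapper.load(checkpoint_path, stats_path, device)  # called once
    predictions = {}
    for case_id in case_ids:                           # then per case:
        case = adapter.load_case(case_id)
        model_input = wrapper.prepare_inputs(case)     # canonical -> model-specific
        raw_output = wrapper.predict(model_input)      # forward pass
        predictions[case_id] = wrapper.decode_outputs(raw_output, case)
    return predictions
```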

## Step 1: Write the wrapper class

```python
import numpy as np
import torch
import pyvista as pv
from typing import Any, ClassVar

from physicsnemo.cfd.evaluation.inference.model_registry import (
    CFDModel, register_model, OutputLocation,
)
from physicsnemo.cfd.evaluation.datasets.schema import (
    CanonicalCase, InferenceDomain, predictions_dict,
)
from physicsnemo.cfd.evaluation.inference.progress import log_inference


class MyModelWrapper(CFDModel):
    INFERENCE_DOMAIN: ClassVar[InferenceDomain] = "surface"  # or "volume"
    OUTPUT_LOCATION: ClassVar[OutputLocation] = "cell"  # or "point"

    def __init__(self) -> None:
        self._model = None
        self._stats = None
        self._device = "cpu"

    @property
    def output_location(self) -> OutputLocation:
        return self.OUTPUT_LOCATION

    def load(self, checkpoint_path, stats_path, device, **kwargs):
        self._device = device
        # Load your model architecture + weights
        # self._model = ...
        # Load normalization stats if needed
        # self._stats = ...
        log_inference("my_model", f"Loaded from {checkpoint_path}")
        return self

    def prepare_inputs(self, case: CanonicalCase):
        mesh = pv.read(case.mesh_path)
        if not isinstance(mesh, pv.PolyData):
            mesh = mesh.extract_surface()
        # Extract coordinates and build model-specific input
        # (tensors, graphs, point clouds, etc.)
        coords = np.array(mesh.cell_centers().points, dtype=np.float32)
        return torch.tensor(coords, device=self._device)

    def predict(self, model_input):
        # Run forward pass through your model
        with torch.no_grad():
            raw_output = self._model(model_input)
        return raw_output

    def decode_outputs(self, raw_output, case):
        # Denormalize if needed, then return canonical dict
        # For surface models:
        return predictions_dict(
            pressure=raw_output["pressure"].cpu().numpy(),
            shear_stress=raw_output["shear_stress"].cpu().numpy(),
        )
        # For volume models, use build_predictions_dict:
        # return build_predictions_dict(
        #     pressure=..., velocity=..., turbulent_viscosity=...
        # )
```

### Key implementation considerations

**Normalization**: Most trained models normalize inputs/outputs. Load stats from `stats_path` in `load()` and denormalize in `decode_outputs()`. See `physicsnemo/cfd/evaluation/common/io.py` for `load_global_stats()` and related helpers.
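
A minimal denormalization helper might look like the following, assuming stats shaped like the `global_stats.json` format shown in Step 2:

```python
import numpy as np


# Minimal denormalization sketch; assumes stats shaped like
# {"mean": {"pressure": [0.0]}, "std_dev": {"pressure": [1.0]}}.
def denormalize(field: np.ndarray, stats: dict, name: str) -> np.ndarray:
    mean = np.asarray(stats["mean"][name], dtype=np.float32)
    std = np.asarray(stats["std_dev"][name], dtype=np.float32)
    return field * std + mean  # broadcasts over (N,) or (N, 3) fields
```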

**Batching**: For large meshes, `prepare_inputs` may need to subsample or batch. Use `kwargs` passed through `load()` (e.g., `batch_resolution`, `geometry_sampling`) to control this.

**Output shape**: `pressure` must be `(N,)` float32. `shear_stress` must be `(N, 3)` float32 for surface. Volume fields: `velocity` is `(N, 3)`, `turbulent_viscosity` is `(N,)`.

**Output location**: If `OUTPUT_LOCATION = "cell"`, return N = `mesh.n_cells` values. If `"point"`, return N = `mesh.n_points` values.

## Step 2: Create checkpoint and stats files

Your model needs a checkpoint file and optionally a `global_stats.json`:

```python
# Checkpoint: save your model's state dict
torch.save(model.state_dict(), "checkpoint.pt")

# Stats: JSON with mean/std_dev for denormalization
# Surface format:
{
    "mean": {"pressure": [0.0], "shear_stress": [0.0, 0.0, 0.0]},
    "std_dev": {"pressure": [1.0], "shear_stress": [1.0, 1.0, 1.0]}
}
# Volume format:
{
    "mean": {"pressure": [0.0], "velocity": [0.0, 0.0, 0.0], "turbulent_viscosity": [0.0]},
    "std_dev": {"pressure": [1.0], "velocity": [1.0, 1.0, 1.0], "turbulent_viscosity": [1.0]}
}
```
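
As a sketch, the surface-format stats can be produced with a plain `json.dump`. The identity values here are placeholders; in practice, compute the mean and standard deviation over your training split:

```python
import json

# Placeholder stats in the surface format; replace the identity values
# with statistics computed over your training split.
stats = {
    "mean": {"pressure": [0.0], "shear_stress": [0.0, 0.0, 0.0]},
    "std_dev": {"pressure": [1.0], "shear_stress": [1.0, 1.0, 1.0]},
}
with open("global_stats.json", "w") as f:
    json.dump(stats, f, indent=2)
```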

## Step 3: Register and test

```python
register_model("my_model", MyModelWrapper)

# Load a case from any registered dataset adapter
from physicsnemo.cfd.evaluation.datasets.adapters.drivaerml import DrivAerMLAdapter
adapter = DrivAerMLAdapter(root="/path/to/data", inference_domain="surface")
case = adapter.load_case(adapter.list_cases()[0])

# Run the full inference pipeline
wrapper = MyModelWrapper()
wrapper.load(checkpoint_path="checkpoint.pt", stats_path="global_stats.json", device="cuda:0")
model_input = wrapper.prepare_inputs(case)
raw_output = wrapper.predict(model_input)
predictions = wrapper.decode_outputs(raw_output, case)

assert "pressure" in predictions
assert predictions["pressure"].shape[0] > 0
```

## Step 4: Run the full benchmark

```python
from physicsnemo.cfd.evaluation.config import Config
from physicsnemo.cfd.evaluation.benchmarks.engine import run_benchmark

config = Config.from_dict({
    "run": {"device": "cuda:0", "output_dir": "results", "metrics_cache": {"enabled": False}},
    "benchmark": {
        "mode": "matrix",
        "models": [{
            "name": "my_model",
            "inference_domain": "surface",
            "checkpoint": "/path/to/checkpoint.pt",
            "stats_path": "/path/to/global_stats.json",
            "kwargs": {},
        }],
        "datasets": [{
            "name": "drivaerml",
            "root": "/path/to/drivaerml/data",
            "case_ids": ["run_1", "run_11"],
            "kwargs": {"align_ground_truth_to_model": True, "inference_domain": "surface"},
        }],
        "reproducibility": {"log_env": False, "save_artifacts": True},
    },
    "output": {"mesh_field_names": {"pressure": "pMeanTrimPred", "shear_stress": "wallShearStressMeanTrimPred"}},
    "metrics": ["l2_pressure", "l2_shear_stress", "l2_pressure_area_weighted", "drag", "lift"],
    "reports": {"enabled": False},
})
results = run_benchmark(config)
```

Results are written to `benchmark_results.json` (a JSON list of dicts, one per model×dataset combo).
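
A small helper makes the list-of-dicts shape explicit. The helper name is illustrative, not part of the workflow API, and the path is assumed to follow the config's `output_dir`:

```python
import json


def load_benchmark_report(path: str) -> list:
    """benchmark_results.json is a plain list[dict], one per model x dataset combo."""
    with open(path) as f:
        report = json.load(f)
    # Guard against the common mistake of expecting {"results": [...]}
    if not isinstance(report, list):
        raise TypeError("expected a JSON list, got " + type(report).__name__)
    return report
```

Iterate directly over the returned list, e.g. `for combo in load_benchmark_report("results/benchmark_results.json"): ...`.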

## Step 5: Visualize predictions

```python
from physicsnemo.cfd.postprocessing_tools.visualization.utils import plot_fields, plot_field_comparisons

# Just the predicted fields (no GT comparison):
plotter = plot_fields(mesh, fields=["pMeanTrimPred"], view="xy", dtype="cell", window_size=[1800, 600])
plotter.screenshot("predicted_pressure.png")
plotter.close()

# Side-by-side with GT (GT | Pred | Error):
plotter = plot_field_comparisons(
    mesh, true_fields=["pMeanTrim"], pred_fields=["pMeanTrimPred"],
    view="xy", dtype="cell", window_size=[1800, 600],
)
plotter.screenshot("comparison.png")
plotter.close()
```

## Step 6: Make permanent (optional)

Save the wrapper to `physicsnemo/cfd/evaluation/inference/wrappers/my_model.py` and register in `wrappers/__init__.py`:

```python
from physicsnemo.cfd.evaluation.inference.wrappers.my_model import MyModelWrapper
register_model("my_model", MyModelWrapper)
```

Then use `model.name: my_model` in any YAML config.

## Gotchas

- **DistributedManager**: Model wrappers may call `DistributedManager.initialize()`. In notebooks without `torchrun`, set env vars first: `WORLD_SIZE=1`, `RANK=0`, `LOCAL_RANK=0`, `MASTER_ADDR=localhost`, `MASTER_PORT=12355`.
- **`weights_only=True`**: Use this flag with `torch.load()` for safe deserialization (PyTorch 2.6+ default).
- **Domain matching**: The engine checks that `model.INFERENCE_DOMAIN` matches the dataset adapter's `inference_domain_from_kwargs()`. Mismatches are skipped in matrix mode or raise in single mode.
- **GT alignment**: When `align_ground_truth_to_model: true` in dataset kwargs, the engine converts GT data to match `OUTPUT_LOCATION` (point ↔ cell). This is automatic — the wrapper just needs correct class vars.
- **Results JSON format**: `benchmark_results.json` is a plain `list[dict]`, not `{"results": [...]}`. Iterate directly: `for combo in report:`.