
Best practice for handling unevenly sampled (jagged) observations in Black-Box models? (SimpleBbAsciiFile length constraint & Documentation inquiry) #473

@hurys20

Description


Hello OpenDA developers/community,

I am currently integrating a 2D hydrodynamic and water quality model (CE-QUAL-W2) into OpenDA using the BlackBoxWrapper. Since the model is not natively supported, I am using Python scripts to bridge the model I/O with OpenDA.

Currently, I am using org.openda.blackbox.io.SimpleBbAsciiFile for model results and noosObserver for observations. However, I have hit a severe architectural limitation regarding real-world, unevenly sampled data.

The Physical Scenario (The "Jagged Data" Problem):
In our real-world reservoir monitoring (multiple sites and multiple depths), the sampling frequency is naturally uneven.
For example, within a 60-day simulation window:

Site_A_Depth_0.5m might have 15 observations.

Site_B_Depth_10.0m might have 8 observations.

Site_C_Depth_90.0m might only have 2 observations.

The Technical Bottleneck:
My Python bridge successfully extracts a full, continuous 60-day time series for all simulation points and writes them into model_results.output. However, SimpleBbAsciiFile seems to enforce a strict length contract based on the .noo files.

If the .noo file for Site_B has 8 records, OpenDA throws an error when parsing the 60-day model output:

```
Error preparing algorithm.
Error message: expecting vector of length 8 for time, but length was 60
...
at org.openda.blackbox.io.SimpleBbAsciiFile.initialize
```
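For reference, the matching we would like to happen somewhere (whether inside OpenDA or in our bridge script) is essentially "pick the model values at each site's own observation timestamps". A minimal sketch with synthetic numbers (all day numbers and values below are made up for illustration; this is plain NumPy, not an OpenDA class):

```python
import numpy as np

# Hypothetical data: a continuous 60-day model series for one site,
# and Site_B's 8 observation days (values are illustrative only).
model_days = np.arange(1, 61)                 # 60 daily model outputs
model_temp = 20.0 + 0.05 * model_days         # synthetic temperature series
obs_days = np.array([3, 10, 17, 24, 31, 38, 45, 52])  # 8 sampling days

# Keep only the model values at the observed timestamps, so the series
# written for Site_B has length 8 and satisfies the length contract.
mask = np.isin(model_days, obs_days)
matched_temp = model_temp[mask]

assert matched_temp.shape == (8,)
```

Doing this per site in the Python bridge avoids the error, but it duplicates logic that (we hope) the framework's observation operator could handle generically.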

Workarounds we considered (but are not ideal):

Intersection Method: Only keep the exact dates on which all sites were monitored simultaneously. (This forces all .noo files to be the exact same length, but we would lose a massive amount of valuable field data.)

Scalarization Method: Abandon timeSeries entirely and treat every single observation point at every specific timestamp as an independent, length-1 scalar variable (e.g., SiteA_Day1, SiteA_Day2). This explodes the XML configuration and loses the semantic meaning of a time series.
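To illustrate how lossy the intersection method is with our sampling pattern, here is a toy computation (the per-site day numbers are invented for illustration, matching the 15/8/2 counts above):

```python
# Hypothetical per-site observation dates (day numbers within the
# 60-day window; invented for illustration).
site_a_days = {1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57}  # 15 obs
site_b_days = {5, 13, 21, 29, 37, 45, 53, 57}                            # 8 obs
site_c_days = {29, 57}                                                   # 2 obs

# Keep only the dates on which every site was monitored simultaneously.
common_days = sorted(site_a_days & site_b_days & site_c_days)

# Only the 2 dates shared with the sparsest site survive; every other
# field visit at Site_A and Site_B would be discarded.
```

In other words, the intersection is bounded by the sparsest site, so the deepest, least-sampled stations dictate how much of the richer surface data we can use.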

My Questions:

Advanced I/O Handling: Is there a more advanced generic IO class (e.g., a generic NetCDF wrapper) or a specific Observation Operator configuration in OpenDA that allows us to feed a full, continuous model time series (e.g., length 60) and let OpenDA automatically interpolate or pick the matching timestamps based on the varying lengths of individual .noo files?
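To clarify what we mean by "automatically interpolate or pick the matching timestamps": where observation times do not fall exactly on the model's output grid, linear interpolation in time would be the natural generalization of exact matching. A sketch of that variant (synthetic values; `np.interp` is standard NumPy, shown only to describe the desired behaviour, not an OpenDA API):

```python
import numpy as np

# Continuous 60-day model output for one site (synthetic values).
model_days = np.arange(1.0, 61.0)             # length 60
model_conc = np.linspace(2.0, 8.0, 60)        # synthetic concentration

# Site_B's 8 observation times; note some fall between model outputs.
site_b_obs_days = np.array([3.0, 10.0, 17.5, 24.0, 31.0, 38.5, 45.0, 52.0])

# Interpolate the model series onto the observation times, yielding a
# length-8 predicted vector to compare against the length-8 .noo file.
model_at_obs = np.interp(site_b_obs_days, model_days, model_conc)
```

If OpenDA already provides this time-matching in a generic IO class or observation operator (e.g. via a NetCDF-based wrapper), a pointer to it would resolve our problem directly.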

Best Practices: How do advanced models integrated via the Black-Box approach usually handle varying observation frequencies across different measurement vectors without triggering the length mismatch error?

Documentation & Manuals: Is there a continuously updated reference manual or comprehensive documentation for OpenDA's built-in wrapper classes and algorithms? We often find ourselves unsure about what newer classes might exist, what their underlying constraints/limitations are, how to correctly format their XML paradigms, and how they compare to one another. A detailed guide would greatly help us fully utilize the framework's potential.

Any guidance, documentation links, or examples pointing to a more advanced I/O handler for Black-Box models would be greatly appreciated. Thank you!
