Merged
50 commits
6631bf5
change json to suit suggested config update by codex
Githubcopilot111 Sep 14, 2025
2a4a7ba
Simplify Q timeseries processing (#21)
Chain-Frost Sep 15, 2025
1d2b188
Merge remote-tracking branch 'origin/main' into work-on-qprocessor-th…
Chain-Frost Sep 15, 2025
45e19d9
Add stub TUFLOW H/V processors (#27)
Chain-Frost Sep 16, 2025
a8dbdf1
Refactor POMM abs max derivation (#26)
Chain-Frost Sep 16, 2025
700d867
Document BaseProcessor CSV loaders (#24)
Chain-Frost Sep 16, 2025
41dce03
Remove obsolete TUFLOW output file (#25)
Chain-Frost Sep 16, 2025
d728b8a
Consolidate PO combination logic (#23)
Chain-Frost Sep 16, 2025
3c2acb5
[processors] Refresh CSV ingestion helpers (#28)
Chain-Frost Sep 16, 2025
4ff5a0a
[core] Drive POMM processing from config (#29)
Chain-Frost Sep 16, 2025
25f0715
pandas-stubs
Chain-Frost Sep 17, 2025
7f414e6
might have broken it more
Chain-Frost Sep 17, 2025
f416953
Refactor timeseries processors for Q and V (#32)
Chain-Frost Sep 17, 2025
9346b66
Simplify TUFLOW processing metadata (#33)
Chain-Frost Sep 17, 2025
200b327
Document TUFLOW processor workflow (#34)
Chain-Frost Sep 17, 2025
fd190af
Document TUFLOW processor extension workflow (#35)
Chain-Frost Sep 17, 2025
bddcf59
Document POMM and PO workflows
Chain-Frost Sep 18, 2025
952265d
Merge remote-tracking branch 'origin/main' into work-on-qprocessor-th…
Chain-Frost Oct 6, 2025
2b5b8a1
Merge remote-tracking branch 'origin/main' into work-on-qprocessor-th…
Chain-Frost Oct 6, 2025
4547004
Merge branch 'main' into work-on-qprocessor-that-could-break-stuff
Chain-Frost Nov 2, 2025
e2efb44
2025-11-02 progress
Chain-Frost Nov 2, 2025
0b2cf99
Merge branch 'main' into work-on-qprocessor-that-could-break-stuff
Chain-Frost Nov 8, 2025
fcf86b5
Added configuration guards inside BaseProcessor._load_configuration
Chain-Frost Nov 8, 2025
b2e2bfd
Document processor import flow and config hooks (#36)
Chain-Frost Nov 16, 2025
a966f0f
Document TUFLOW maximum and timeseries processors (#37)
Chain-Frost Nov 16, 2025
9ee1cc7
Merge pull request #38 from Chain-Frost:codex/update-documentation-fo…
Chain-Frost Nov 16, 2025
d5a6bc7
[tuflow] Implement H timeseries processor pipeline (#39)
Chain-Frost Nov 16, 2025
be492aa
Merge branch 'main' into work-on-qprocessor-that-could-break-stuff
Chain-Frost Nov 16, 2025
1ed938f
extra test data - TUFLOW_Example_Model_Dataset
Chain-Frost Nov 22, 2025
abff503
feat: Add TUFLOW processors, results validation and datatypes configu…
Chain-Frost Nov 22, 2025
7024c95
add eof processor
Chain-Frost Nov 22, 2025
14a08a8
Merge branch 'main' into work-on-qprocessor-that-could-break-stuff
Chain-Frost Nov 23, 2025
29e92d5
first part logging tweaks
Chain-Frost Nov 23, 2025
cdb21a2
logging improvements - remove duplicate module
Chain-Frost Nov 23, 2025
ed31cc2
progress on tests, but there is a lot more work to do
Chain-Frost Nov 23, 2025
fd03545
progress on tests, black formatter
Chain-Frost Nov 26, 2025
23e9ff0
more testing progress
Chain-Frost Nov 27, 2025
35f9e47
basic mcp, doesn't really do anything though
Chain-Frost Nov 27, 2025
4d82895
cache dir
Chain-Frost Nov 27, 2025
2f3d809
test tweaks
Chain-Frost Nov 29, 2025
d123797
add 1D gpkg results and tlf extample datasets
Chain-Frost Nov 29, 2025
72f8d95
tidy testing
Chain-Frost Nov 29, 2025
cf38980
processor file path log text shortened
Chain-Frost Nov 29, 2025
7b35449
tidy up tuflow python scripts and some that are ss now
Chain-Frost Nov 30, 2025
aca663c
more test coverage expansion
Chain-Frost Nov 30, 2025
e1a073f
Merge remote-tracking branch 'origin/main' into work-on-qprocessor-th…
Chain-Frost Nov 30, 2025
c534783
more test stuff
Chain-Frost Nov 30, 2025
f80442b
add garbage collector, other tweaks
Chain-Frost Dec 1, 2025
f83fe55
standardisation of wrappers for tuflow functions
Chain-Frost Dec 2, 2025
a34ef5b
resolve test failures, still need to check with actual RORB data
Chain-Frost Dec 2, 2025
The diff you're trying to view is too large. We only load the first 3000 changed files.
36 changes: 36 additions & 0 deletions .github/copilot-instructions.md
@@ -30,6 +30,42 @@
- Example entry-point: see `ryan-scripts/TUFLOW-python/POMM-med-max-aep-dur.py` for a typical wrapper pattern.
- Avoid duplicating logic between scripts—refactor into `functions/` as needed.

## Maximum/ccA processors
- **Pipeline overview**
- *Read*: `MaxDataProcessor.read_maximums_csv` pulls only the configured columns/dtypes for each data type, stops early on empty frames and fails fast when headers diverge from the JSON contract.【F:ryan_library/processors/tuflow/max_data_processor.py†L11-L55】【F:ryan_library/classes/tuflow_results_validation_and_datatypes.json†L59-L132】
- *Reshape*: Each processor reshapes or augments the raw frame before the shared `BaseProcessor` post-processing kicks in:
- `NmxProcessor._extract_and_transform_nmx_data` splits `Node ID`, filters out non-standard suffixes, pivots to `US_h`/`DS_h`, and enforces the expected column set.【F:ryan_library/processors/tuflow/1d_maximums/NmxProcessor.py†L12-L106】
- `CmxProcessor._reshape_cmx_data` normalises the Q/V maxima into long form while `_handle_malformed_data` drops rows with no values so downstream aggregation stays stable.【F:ryan_library/processors/tuflow/1d_maximums/CmxProcessor.py†L12-L97】
- `ChanProcessor.process` derives culvert `Height`, renames legacy fields, and bails out when required geometry columns are missing.【F:ryan_library/processors/tuflow/ChanProcessor.py†L6-L68】
- `ccAProcessor.process` dispatches to `process_dbf` or `process_gpkg` to read shapefile/geopackage sources, renames `Channel` to `Chan ID`, and only continues once a populated frame is available.【F:ryan_library/processors/tuflow/ccAProcessor.py†L12-L121】
- *Validate & finalise*: All processors converge on `BaseProcessor.add_common_columns`, `apply_output_transformations`, and `validate_data` so run-code metadata, dtype casting, and empty-frame checks remain consistent.【F:ryan_library/processors/tuflow/base_processor.py†L261-L391】
- **Edge-case handling conventions**
- Header checks go through `BaseProcessor.check_headers_match`, giving clear logs for missing/extra fields and reordering hints before the processor proceeds.【F:ryan_library/processors/tuflow/base_processor.py†L393-L433】
- `MaxDataProcessor.read_maximums_csv` returns explicit status codes (success, empty data, header mismatch, read failure) so subclass `process` methods can short-circuit safely.【F:ryan_library/processors/tuflow/max_data_processor.py†L14-L55】
- Reshapers filter abnormal inputs: `_extract_and_transform_nmx_data` drops pit suffixes, `_reshape_cmx_data` verifies each required column, and `_handle_malformed_data` strips all-null rows; `ChanProcessor` exits when geometry columns are absent; `ccAProcessor.process_gpkg` ignores geopackages missing the `1d_ccA_L` layer.【F:ryan_library/processors/tuflow/1d_maximums/NmxProcessor.py†L65-L105】【F:ryan_library/processors/tuflow/1d_maximums/CmxProcessor.py†L53-L97】【F:ryan_library/processors/tuflow/ChanProcessor.py†L23-L48】【F:ryan_library/processors/tuflow/ccAProcessor.py†L79-L121】
- **Expected outputs**
- `Cmx`: `Chan ID`, `Time`, `Q`, `V`
- `Nmx`: `Chan ID`, `Time`, `US_h`, `DS_h`
- `Chan`: `Chan ID`, `Length`, `n or Cd`, `pSlope`, `US Invert`, `DS Invert`, `US Obvert`, `Height`, `pBlockage`, `Flags`
- `ccA`: `Chan ID`, `pFull_Max`, `pTime_Full`, `Area_Max`, `Area_Culv`, `Dur_Full`, `Dur_10pFull`, `Sur_CD`, `Dur_Sur`, `pTime_Sur`, `TFirst_Sur`
- Use `output_columns` in `tuflow_results_validation_and_datatypes.json` as the contract for dtype casting and regression tests when extending these processors.【F:ryan_library/classes/tuflow_results_validation_and_datatypes.json†L59-L168】
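
The read, header-check and short-circuit conventions above can be sketched in plain Python. This is a minimal illustration, not the library's code: the `ReadStatus` members, `check_headers`, `read_maximums` and `process` are hypothetical names chosen for the example, and rows are modelled as lists of dicts rather than pandas frames.

```python
from enum import Enum, auto


class ReadStatus(Enum):
    """Hypothetical status codes; the real names in ryan_library may differ."""

    SUCCESS = auto()
    EMPTY_DATA = auto()
    HEADER_MISMATCH = auto()
    READ_FAILURE = auto()


def check_headers(expected: list[str], actual: list[str]) -> tuple[bool, list[str]]:
    """Compare headers and describe missing, extra, or misordered fields."""
    problems: list[str] = []
    missing = [name for name in expected if name not in actual]
    extra = [name for name in actual if name not in expected]
    if missing:
        problems.append(f"missing: {missing}")
    if extra:
        problems.append(f"extra: {extra}")
    if not missing and not extra and expected != actual:
        problems.append(f"reorder to: {expected}")
    return (not problems, problems)


def read_maximums(
    rows: list[dict[str, str]], expected: list[str]
) -> tuple[ReadStatus, list[dict[str, str]]]:
    """Return a status alongside the rows so callers can short-circuit."""
    if not rows:
        return ReadStatus.EMPTY_DATA, []
    ok, _problems = check_headers(expected, list(rows[0].keys()))
    if not ok:
        return ReadStatus.HEADER_MISMATCH, []
    return ReadStatus.SUCCESS, rows


def process(rows: list[dict[str, str]], expected: list[str]) -> bool:
    """Stop before any reshaping unless the read step reported success."""
    status, data = read_maximums(rows, expected)
    if status is not ReadStatus.SUCCESS:
        return False
    # Dataset-specific reshaping and validation would follow here.
    return len(data) > 0
```

Returning a status next to the data keeps each subclass `process` method free to decide how loudly to log a failure, while the read helper stays responsible only for detecting it.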

## Timeseries processors
- **Pipeline overview**
- *Read*: `TimeSeriesProcessor.read_and_process_timeseries_csv` wraps `_read_csv`, `_clean_headers`, and `_reshape_timeseries_df` so every dataset starts from a tidy long-form frame, then `_apply_final_transformations` coerces numeric types.【F:ryan_library/processors/tuflow/timeseries_processor.py†L20-L282】
- *Reshape*: `reshape_h_timeseries` handles upstream/downstream H series; otherwise the melt keeps either `Chan ID` or `Location`. Subclasses such as `QProcessor` and `VProcessor` invoke `_normalise_value_dataframe` to enforce `[Time, identifier, value]` order and strip empty measurements, while `POProcessor._parse_point_output` performs its own transpose-like cleanup of the multi-row header format.【F:ryan_library/processors/tuflow/timeseries_helpers.py†L1-L33】【F:ryan_library/processors/tuflow/1d_timeseries/QProcessor.py†L11-L33】【F:ryan_library/processors/tuflow/1d_timeseries/VProcessor.py†L11-L22】【F:ryan_library/processors/tuflow/timeseries_processor.py†L283-L343】【F:ryan_library/processors/tuflow/POProcessor.py†L11-L131】
- *Validate & finalise*: After the dataset-specific hook runs, the base class adds run-code metadata, applies JSON-driven dtype mappings, and refuses to mark the processor complete if validation fails.【F:ryan_library/processors/tuflow/timeseries_processor.py†L33-L116】【F:ryan_library/processors/tuflow/base_processor.py†L261-L391】
- **Edge-case handling conventions**
- `_clean_headers` drops the placeholder first column, normalises `Time (h)` aliases, and raises when a `Time` column cannot be recovered.【F:ryan_library/processors/tuflow/timeseries_processor.py†L149-L188】
- `_reshape_timeseries_df` swaps between `Chan ID` and `Location`, reuses `reshape_h_timeseries` for two-value H exports, and sets `expected_in_header` so `check_headers_match` can reject malformed melts.【F:ryan_library/processors/tuflow/timeseries_processor.py†L216-L265】
- `_normalise_value_dataframe` enforces a single identifier column, drops empty rows, and returns granular status codes (`FAILURE`, `EMPTY_DATAFRAME`, `HEADER_MISMATCH`) so callers can log precise causes; PO parsing similarly guards against missing header rows, non-numeric times, all-NaN columns, and absent measurement data.【F:ryan_library/processors/tuflow/timeseries_processor.py†L283-L343】【F:ryan_library/processors/tuflow/POProcessor.py†L47-L131】
- Header validation ultimately runs through `BaseProcessor.check_headers_match`, so custom processors should always set `expected_in_header` before returning a frame.【F:ryan_library/processors/tuflow/base_processor.py†L393-L433】
- **Expected outputs**
- `Q`: `Time`, `Chan ID`, `Q`
- `V`: `Time`, `Chan ID`, `V`
- `PO`: `Time`, `Location`, `Type`, `Value`
- Keep the `output_columns` mapping in sync with new processors so dtype coercion continues to succeed for downstream reporting and tests.【F:ryan_library/classes/tuflow_results_validation_and_datatypes.json†L40-L247】
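
The header cleanup and wide-to-long reshape described above might look roughly like the following. It is a dependency-free sketch under stated assumptions: the real processors work on pandas frames, and `TIME_ALIASES`, `clean_headers`, `melt_to_long` and the `C1`/`C2` column names are hypothetical stand-ins.

```python
TIME_ALIASES = {"Time", "Time (h)"}  # assumed alias set for the time column


def clean_headers(header: list[str]) -> list[str]:
    """Drop the placeholder first column and normalise the time column name."""
    cleaned = [name.strip() for name in header[1:]]
    cleaned = ["Time" if name in TIME_ALIASES else name for name in cleaned]
    if "Time" not in cleaned:
        raise ValueError("no Time column could be recovered")
    return cleaned


def melt_to_long(
    header: list[str],
    rows: list[list[str]],
    id_name: str = "Chan ID",
    value_name: str = "Q",
) -> list[dict[str, object]]:
    """Turn one wide row per timestep into [Time, identifier, value] records."""
    columns = clean_headers(header)
    time_idx = columns.index("Time")
    records: list[dict[str, object]] = []
    for row in rows:
        values = row[1:]  # skip the placeholder column, matching clean_headers
        time = float(values[time_idx])
        for idx, name in enumerate(columns):
            if idx == time_idx:
                continue
            raw = values[idx].strip()
            if raw == "":
                continue  # strip empty measurements
            records.append({"Time": time, id_name: name, value_name: float(raw)})
    return records


header = ["", "Time (h)", "C1", "C2"]
rows = [["1", "0.0", "1.5", ""], ["2", "0.5", "2.0", "3.0"]]
long_records = melt_to_long(header, rows)
```

Each record keeps the `Time`, identifier, value order listed under the expected outputs, which is the shape the dtype mapping in `output_columns` is described as relying on.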

## Integration & Dependencies
- External dependencies are managed via `requirements.txt`.
- Some scripts expect specific directory structures or data files (see comments in wrappers for details).
9 changes: 8 additions & 1 deletion .gitignore
@@ -23,9 +23,16 @@ pytest-cache-files-*

# Ignore all .xlsx files in the tests/test_data folder and its subfolders, such as from testing runs
tests/test_data/tuflow/tutorials/**/*.xlsx

tests/test_data/**/*.xlsx
tests/test_data/**/*.py

# Others
*/Thumbs.db
*/~$*.xlsx
**/*/~$*.xlsx
tests/test_data/tuflow/Thumbs.db
test_output.txt


.coverage

15 changes: 15 additions & 0 deletions AGENTS.md
@@ -71,6 +71,12 @@ This file guides AI agents (e.g., ChatGPT Codex) on how to interact with and con
2. **Follow conventions**: Generate code adhering to the project standards (sections 2–5).
3. **Produce PR diffs**: Only modify relevant files; include clear commit messages.

#### Logging (loguru) guidance
- Success/error/exception logs shown to users must use f-strings (or equivalent eager formatting) for clarity.
- Info logs are also user-facing; prefer f-strings or explicit formatting so rendered messages are readable as-is.
- Debug logs should remain lazily formatted (loguru parameter style) to avoid unnecessary work when debug is disabled.
- TODO: Sweep the codebase and align existing log statements with these conventions; ensure logging helpers do not leak internal helper names into user-facing output.
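
The difference between eager and lazy formatting can be seen in a small sketch. To stay dependency-free it uses the standard-library `logging` module as a stand-in; the comments note the loguru calls that would play the same role. The `Expensive` class, logger name and file name are hypothetical.

```python
import logging

logging.basicConfig(format="%(levelname)s %(message)s")
log = logging.getLogger("logging_sketch")
log.setLevel(logging.INFO)  # debug output is disabled for this logger


class Expensive:
    """Counts how many times it is rendered to text."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "expensive summary"


rows = 42
path = "example.csv"  # hypothetical file name

# User-facing levels: eager formatting, so the message reads as written.
# loguru equivalent: logger.info(f"Loaded {rows} rows from {path}")
log.info(f"Loaded {rows} rows from {path}")

# Debug: pass parameters so rendering is skipped while debug is disabled.
# loguru equivalent: logger.debug("Frame summary: {}", summary)
summary = Expensive()
log.debug("Frame summary: %s", summary)
lazy_renders = summary.renders  # the object was never rendered

# An f-string renders the object even though nothing is emitted.
log.debug(f"Frame summary: {summary}")
eager_renders = summary.renders  # the object was rendered once
```

The render counter makes the cost visible: the parameter-style debug call leaves the object untouched, while the f-string pays for formatting before the level check runs.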

---

### 7. Build Workflow
@@ -90,6 +96,15 @@ This file guides AI agents (e.g., ChatGPT Codex) on how to interact with and con
### 8. Environment Notes

* On machines joined to the `bge-resources.com` domain (e.g., where `USERDNSDOMAIN=bge-resources.com` or `USERDOMAIN=BGER`), PowerShell sometimes fails to stream file contents reliably. When working on these systems, prefer running commands through `cmd.exe` (e.g., `cmd.exe /C type path\to\file`) so files load correctly in the Codex CLI.
* The CI/CLI host commonly provides system Python 3.12 with a PEP 668 (externally managed) pip. `python -m venv` may fail unless `python3-venv` is installed. If you need the repo dependencies, install the bundled wheel under `dist/` (e.g., `python3 -m pip install --break-system-packages dist/ryan_functions-*.whl`) so `ryan_library` and loguru/geopandas/fiona are available. If isolation is required, install venv tooling first or use user-level installs.

---

### 9. MCP Setup

* This repository supports the Model Context Protocol (MCP).
* See [MCP_SETUP.md](MCP_SETUP.md) for instructions on how to configure your AI assistant to use the custom tools provided by this repo.
* The server script is located at `ryan_mcp_server.py`.

---

97 changes: 97 additions & 0 deletions MCP_SETUP.md
@@ -0,0 +1,97 @@
# MCP Setup for ryan-tools

This repository includes a Model Context Protocol (MCP) server that allows AI agents (like Claude, Codex, Antigravity) to directly interact with the repository's tools.

## Prerequisites

You need to install the `mcp` Python package:

```bash
pip install mcp
```

## Running the Server

You can run the server directly using Python:

```bash
python ryan_mcp_server.py
```

The server runs over stdio (standard input/output), so it is designed to be called by an MCP client, not run interactively by a human.

## Configuration

### Claude Desktop

To use this with Claude Desktop, add the following to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "ryan-tools": {
      "command": "python",
      "args": [
        "E:\\Library\\Automation\\ryan-tools\\ryan_mcp_server.py"
      ]
    }
  }
}
```

Make sure to update the path to `ryan_mcp_server.py` if it is different on your machine.

### VS Code (Antigravity / Codex)

If you are using the Antigravity or Codex extension in VS Code, you typically need to add the MCP server configuration to your VS Code `settings.json` (User or Workspace) or the extension's specific configuration file.

Add the following configuration:

```json
"mcpServers": {
  "ryan-tools": {
    "command": "python",
    "args": [
      "e:\\Library\\Automation\\ryan-tools\\ryan_mcp_server.py"
    ]
  }
}
```

> [!NOTE]
> Ensure that `python` is in your system PATH, or use the full path to your Python executable (e.g., `e:\\Library\\Automation\\ryan-tools\\.venv\\Scripts\\python.exe`).

### Standalone Antigravity

If you are running Antigravity as a standalone application, you need to edit the `mcp_config.json` file.

**Location**: `C:\Users\Ryan\.gemini\antigravity\mcp_config.json`

Add the server configuration to the JSON object:

```json
{
  "mcpServers": {
    "ryan-tools": {
      "command": "python",
      "args": [
        "e:\\Library\\Automation\\ryan-tools\\ryan_mcp_server.py"
      ]
    }
  }
}
```

If the file already exists, merge the `ryan-tools` entry into the existing `mcpServers` object, leaving any other server entries in place.
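
For anyone who prefers to script the merge rather than edit the file by hand, a minimal sketch follows. The helper name `merge_mcp_server` is hypothetical, and the demonstration writes to a temporary directory with a placeholder script path instead of a real Antigravity config.

```python
import json
import tempfile
from pathlib import Path


def merge_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Add or replace one entry under "mcpServers", keeping other entries."""
    config: dict = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Demonstration against a throwaway file that already lists another server.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "mcp_config.json"
    existing = {"mcpServers": {"other": {"command": "node", "args": []}}}
    cfg.write_text(json.dumps(existing), encoding="utf-8")
    merged = merge_mcp_server(cfg, "ryan-tools", "python", ["path/to/ryan_mcp_server.py"])
    on_disk = json.loads(cfg.read_text(encoding="utf-8"))
```

Because the helper only touches the `ryan-tools` key, servers configured by other tools survive the update. Back up the real config file before pointing a script like this at it.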

### Other Clients

Configure your MCP client to run the command: `python path/to/ryan_mcp_server.py`.

## Available Tools

Currently, the server exposes:

- `search_files`: Fast file search using Python. Finds files matching a pattern in a directory.

More tools can be added by modifying `ryan_mcp_server.py`.
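
As a rough idea of what a tool such as `search_files` could do internally, here is a standard-library sketch. The signature, the `max_results` parameter and the matching rules are assumptions, not the contents of `ryan_mcp_server.py`. With the official Python SDK, a function like this is typically exposed by decorating it with `@mcp.tool()` on a `FastMCP` instance; whether the server script does so is also an assumption.

```python
import fnmatch
import os
import tempfile


def search_files(directory: str, pattern: str, max_results: int = 100) -> list[str]:
    """Return paths under `directory` whose file name matches a glob pattern."""
    matches: list[str] = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(root, name))
                if len(matches) >= max_results:
                    return matches  # cap the output so replies stay small
    return matches


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.csv", "b.txt", os.path.join("sub", "c.csv")):
        with open(os.path.join(tmp, rel), "w", encoding="utf-8") as handle:
            handle.write("x")
    found = sorted(os.path.basename(p) for p in search_files(tmp, "*.csv"))
    capped = search_files(tmp, "*", max_results=1)
```

Capping the number of results is a design choice worth keeping in any real tool, since an MCP client usually forwards the whole return value to the model.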