Summary
Replaced the handcrafted lexicographic 3-tuple (effective_width, log_facc, pathlen) junction scoring with a weighted scalar score learned from 1,967 human-labeled junction decisions.
Methodology
- Training data: 1,967 unique junctions from 906 backwater QC paths, labeled by JHG. At each junction, the human selected the correct upstream branch.
- Feature engineering: Pairwise log1p-difference features comparing two candidate branches (A vs B): log1p(attr_A) - log1p(attr_B) for width, facc, slope, pathlen_hw; integer diff for stream_order.
- Model: Logistic regression on mirrored pairwise features (no regularization). Coefficients directly become routing weights.
- Validation: Leave-junction-out GroupKFold CV: 89.7% junction accuracy vs 88.3% for old 3-tuple.
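The feature construction described above can be sketched as follows. This is a minimal illustration, not the training code from this PR: the dict keys, the function names, and the example values are assumptions. Each labeled junction yields two rows, (chosen vs other) with label 1 and the mirror with label 0, and both rows carry the junction id so a grouped CV split keeps them in the same fold.

```python
import math

# Attributes that get the log1p-difference treatment, per the feature list above.
LOG_ATTRS = ("width", "facc", "slope", "pathlen_hw")


def pairwise_features(a, b):
    """Feature vector for 'branch a vs branch b' (dict keys are assumed names)."""
    feats = [math.log1p(a[k]) - math.log1p(b[k]) for k in LOG_ATTRS]
    feats.append(a["stream_order"] - b["stream_order"])  # plain integer difference
    return feats


def mirrored_rows(junction_id, chosen, other):
    """Two training rows per labeled junction: (chosen, other) -> 1 and its mirror -> 0.

    junction_id is kept as the group key so both mirrors stay in one CV fold.
    """
    return [
        (junction_id, pairwise_features(chosen, other), 1),
        (junction_id, pairwise_features(other, chosen), 0),
    ]


# Made-up example values, for illustration only.
chosen = {"width": 120.0, "facc": 5.0e4, "slope": 0.0002, "pathlen_hw": 3.0e5, "stream_order": 5}
other = {"width": 40.0, "facc": 8.0e3, "slope": 0.0010, "pathlen_hw": 9.0e4, "stream_order": 3}
rows = mirrored_rows(17, chosen, other)
```

Because every feature is a difference of per-branch transforms, a linear model on these features satisfies w·(f(A) - f(B)) = score(A) - score(B), which is why the fitted coefficients can be reused as per-branch routing weights. Mirroring makes the training set symmetric, so an unregularized fit has no preference for whichever branch happens to be listed first.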
Learned Weights
score = 1.972*log1p(ew) + 0.227*log1p(facc) - 0.228*log1p(slope) + 0.234*log1p(pathlen) + 0.288*stream_order
| Feature | Weight | Share | Note |
|---|---|---|---|
| effective_width | +1.972 | 67% | SWOT-preferred (n_obs≥5), else GRWL |
| stream_order | +0.288 | 10% | new, not in old 3-tuple |
| pathlen | +0.234 | 8% | log1p-transformed cumulative path length |
| slope | -0.228 | 8% | new, negative: prefer lower gradient (mainstem) |
| facc | +0.227 | 8% | log1p(flow accumulation) |
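As a rough sketch of how the weights above turn into a branch choice: the snippet below evaluates the scalar score for each candidate and takes the maximum. It is illustrative only and is not the code in graph.py; the dict layout, the reach keys, the `weight_shares` helper, and the example reaches are assumptions. The share calculation (absolute weight divided by the sum of absolute weights) is also an assumption, though it reproduces the Share column.

```python
import math

# Weights copied from the table above; the dict layout and key names are assumed.
ROUTING_WEIGHTS = {
    "effective_width": 1.972,
    "stream_order": 0.288,
    "pathlen": 0.234,
    "slope": -0.228,
    "facc": 0.227,
}


def routing_score(reach):
    """Weighted scalar score for one candidate branch (illustrative sketch)."""
    w = ROUTING_WEIGHTS
    return (
        w["effective_width"] * math.log1p(reach["effective_width"])
        + w["facc"] * math.log1p(reach["facc"])
        + w["slope"] * math.log1p(reach["slope"])
        + w["pathlen"] * math.log1p(reach["pathlen"])
        + w["stream_order"] * reach["stream_order"]  # not log-transformed
    )


def weight_shares(weights):
    """Each weight's share of the total absolute weight (hypothetical helper)."""
    total = sum(abs(v) for v in weights.values())
    return {name: abs(v) / total for name, v in weights.items()}


# Made-up candidate branches at one junction.
mainstem = {"effective_width": 120.0, "facc": 5.0e4, "slope": 0.0002,
            "pathlen": 3.0e5, "stream_order": 5}
tributary = {"effective_width": 40.0, "facc": 8.0e3, "slope": 0.0010,
             "pathlen": 9.0e4, "stream_order": 3}

best = max([tributary, mainstem], key=routing_score)
shares = weight_shares(ROUTING_WEIGHTS)
```

Unlike the old lexicographic tuple, where width alone decided every case that was not an exact tie, a scalar score lets a large advantage in the minor features outweigh a small width deficit.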
Two new signals vs old 3-tuple
- slope (negative weight): mainstem channels have lower gradients than tributaries. This is a fundamental geomorphic pattern the old ranking ignored entirely.
- stream_order (positive weight): higher-order reaches are preferred. It accounts for about 10% of the total weight.
Impact
Pipeline rerun on all 6 regions:
- Junction accuracy: 88.3% → 89.7% on 1,967 labeled junctions
- Net improvement: +29 junctions. Of the 263 junctions where the two methods disagree, the new scoring is correct on 146 and the old 3-tuple on 117.
- Routing changes: ~61K best_headwater, ~85K best_outlet changed across 248K reaches
- Mainstem overlap: 99.9% agreement with old algorithm on major rivers (Mississippi, Amazon, Nile, Danube, Mekong, Murray)
- All post-save gates passed (V001, V005, V007, V008, T001, T002)
- 0 monotonicity violations in hydro_dist_out
Files
- Weights: src/sword_v17c_pipeline/stages/graph.py (ROUTING_WEIGHTS, routing_score())
- Used by: stages/distances.py (best_headwater/outlet), stages/mainstem.py (mainstem walk + main neighbors)
- Training labels: data/backwater_junction_labels_v003.parquet
- GBM comparison model: data/routing_gbm_v3.joblib
GBM comparison
Also trained a GBM (gradient boosting) on the same data: 90.0% CV accuracy vs 89.7% for LogReg. The 0.3 percentage point difference doesn't justify a model file dependency. The LogReg weights are hardcoded, so there is no runtime model dependency and the score stays fully interpretable.
Error analysis
158 errors (8%) on labeled junctions fall into 3 categories:
- CAT1 (42%): Human picked the branch that is narrower and has less facc (deltas, tidal reaches); unlearnable from reach attributes
- CAT2 (40%): Width misleading, facc correct (lakes, wide side channels)
- CAT3 (18%): Facc misleading, width correct (already handled)
The ~90% ceiling is structural: CAT1 errors require river name continuity or geographic knowledge that no reach-level attribute can provide.