Skip to content

fix(loader): fallback field mapping for sonnet comparison schema#15

Merged
nvandessel merged 6 commits into
mainfrom
fix/sonnet-comparison-schema
Apr 2, 2026
Merged

fix(loader): fallback field mapping for sonnet comparison schema#15
nvandessel merged 6 commits into
mainfrom
fix/sonnet-comparison-schema

Conversation

@nvandessel
Copy link
Copy Markdown
Owner

Summary

  • Add fallback field mapping in JsonlLoader._parse_entry(): responsesonnet_response, parsedhaiku_parsed
  • Sonnet comparison data (682 records) now loads correctly, yielding 2,076 total SFT pairs
  • 4 new tests covering fallback behavior and preference for canonical fields

Test plan

  • 4 new tests for sonnet comparison schema fallback (all pass)
  • All 78 existing tests still pass
  • Ruff linting clean
  • Verified on real data: 1,394 haiku + 682 sonnet = 2,076 SFT pairs

🤖 Generated with Claude Code

The sonnet comparison data uses `sonnet_response` and `haiku_parsed`
instead of `response` and `parsed`. Add fallback mapping in
_parse_entry() so these records are loaded correctly, yielding
682 additional SFT training pairs (2,076 total).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 1, 2026

Greptile Summary

This PR adds fallback field mapping in JsonlLoader._parse_entry() to handle a sonnet comparison JSONL schema that stores responses and parsed output under sonnet_response / haiku_parsed instead of the canonical response / parsed keys. The fix enables 682 previously unloadable records to flow through the full pipeline (load → clean → format), increasing total SFT pairs from ~1,394 to ~2,076. A new Kaggle training notebook is also introduced.

Key changes:

  • response falls back to sonnet_response and parsed falls back to haiku_parsed using consistent is not None guards — the concern raised in the previous review thread about or vs is not None asymmetry has been resolved; both fields now use the more precise is not None check.
  • The cross-model pairing (sonnet_response + haiku_parsed) is semantically safe: parsed is not consumed by SftFormatter or DecisionCleaner — only response is used as the assistant training target, so the fields not being from the same model has no downstream effect.
  • 4 new tests cover absent-key fallback, explicit-preference, and full pipeline scenarios.
  • The notebook final summary cell references report without a fallback guard — a minor style issue if cells are run out of order.

Confidence Score: 5/5

Safe to merge — the loader fix is correct, consistently guarded, and well-tested; no P0/P1 findings.

All remaining findings are P2 (notebook style). The core loader change uses consistent is not None guards for all fallback fields, the parsed cross-model asymmetry is harmless since parsed is not consumed downstream in the training pipeline, and the 4 new tests provide solid coverage of the fallback paths.

notebooks/train-hippofloop.ipynb — minor guard missing for report in summary cell.

Important Files Changed

Filename Overview
src/hippofloop/data/loader.py Adds is not None fallback for responsesonnet_response and parsedhaiku_parsed; consistent guard strategy across all three fallback fields.
tests/data/test_loader.py 4 new tests covering fallback activation, canonical-field preference, and end-to-end pipeline; adequate coverage for the added logic.
notebooks/train-hippofloop.ipynb New end-to-end training notebook for Kaggle; minor issue where the final summary cell references report without a guard, which would NameError if the evaluation cell is skipped.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Raw JSONL line] --> B{response key present\nand not None?}
    B -- Yes --> C[use response]
    B -- No --> D{sonnet_response\npresent?}
    D -- Yes --> E[use sonnet_response]
    D -- No --> F[use empty string]
    C --> G[DecisionEntry.response]
    E --> G
    F --> G

    A --> H{parsed key present\nand not None?}
    H -- Yes --> I[use parsed]
    H -- No --> J{haiku_parsed\npresent?}
    J -- Yes --> K[use haiku_parsed]
    J -- No --> L[use None]
    I --> M[DecisionEntry.parsed]
    K --> M
    L --> M

    G --> N[DecisionCleaner\nfilters on response]
    N --> O[SftFormatter\nuses response as\nassistant message]
    M --> P[metadata only\nnot used in training]
Loading

Reviews (4): Last reviewed commit: "refactor: eliminate double dict lookups ..." | Re-trigger Greptile

Comment thread src/hippofloop/data/loader.py Outdated
nvandessel and others added 5 commits April 1, 2026 04:20
Self-contained notebook that runs the full pipeline on a Kaggle T4:
load JSONL → clean → format → train (QLoRA) → evaluate → export GGUF.

Uses hippofloop's data pipeline and Unsloth for efficient 4-bit training
with batch_size=1, grad_accum=16, max_seq_length=8192.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
- Use rng.sample() instead of ordered slice for eval subset
- Initialize best_loss = None before training, guard with is not None

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract response, parsed, and time into locals before the
constructor call. Fixes redundant hash+lookup per record and
aligns all three fallback fields to the same is-not-None idiom.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@nvandessel nvandessel merged commit cc31cd7 into main Apr 2, 2026
5 checks passed
@nvandessel nvandessel deleted the fix/sonnet-comparison-schema branch April 2, 2026 04:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant