Non-English paraphrase scenario (quantify English-preprocessor dependency) #6

@tomas-samek

Summary

The current paraphrase test (Scenario 4) runs on English only. Before making general claims, we need to know how much of its 10/10 grade is English-specific and how much is architectural.

What to do

  • Port Scenario 4's 5-query rubric to at least one other language. Czech is a natural choice, since we already have Czech fixtures from the cross-language concept binding test.
  • Use the SAME coverage logic (no per-language stopword lists added).
  • Record the grade. If the grade drops significantly, that quantifies the English-preprocessor dependency flagged in M1's coverage issue.
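
A minimal sketch of what the ported test could look like, assuming a token-overlap coverage signal. Everything here is hypothetical and only illustrates the shape of the comparison: `tokens`, `coverage`, `passes`, `grade_delta`, the 0.75 threshold, and the toy query pairs are not taken from `tests/honest_agent_paraphrase.rs`. How the 10-point grade is derived from the 5 queries is not stated in this issue, so the sketch simply counts passing queries. The real test should call the project's existing coverage logic unchanged.

```rust
use std::collections::HashSet;

// Hypothetical tokenizer: lowercase and split on non-alphanumeric characters.
// Deliberately no stopword list and no stemming, for any language.
fn tokens(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

// Hypothetical coverage signal: share of query tokens found in the candidate.
fn coverage(query: &str, candidate: &str) -> f64 {
    let q = tokens(query);
    if q.is_empty() {
        return 0.0;
    }
    let c = tokens(candidate);
    let hits = q.iter().filter(|t| c.contains(*t)).count();
    hits as f64 / q.len() as f64
}

// Number of (query, candidate) pairs whose coverage reaches the threshold.
fn passes(cases: &[(&str, &str)], threshold: f64) -> usize {
    cases
        .iter()
        .filter(|(q, c)| coverage(q, c) >= threshold)
        .count()
}

// English-minus-other delta; positive means English scored higher.
fn grade_delta(english: usize, other: usize) -> i64 {
    english as i64 - other as i64
}

#[test]
fn paraphrase_delta_en_vs_cs() {
    // Toy pairs, not real fixtures. The Czech verb is inflected
    // ("obnovit" vs "obnovím"), so exact token overlap misses it.
    let en = [("reset password", "How to reset your password")];
    let cs = [("obnovit heslo", "Jak obnovím heslo")];

    let threshold = 0.75; // assumed cut-off, same for both languages
    let delta = grade_delta(passes(&en, threshold), passes(&cs, threshold));

    // Record the delta instead of asserting parity; the delta is the result.
    println!("English-vs-Czech pass delta: {delta}");
}
```

The design point is that both languages go through the same `coverage` function with the same threshold, so any gap between the two counts comes from the inputs and not from per-language tuning.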

Acceptance

  • tests/honest_agent_paraphrase_cs.rs (or similar) exists.
  • Results logged in docs/ with the English-vs-non-English grade delta.

Links

  • tests/honest_agent_paraphrase.rs
  • M1 issue: "Coverage signal has a hidden English prior"
