test(fixtures): expand rewrite-quality corpus to 10 KO + 10 EN (enable A/B measurement) #501

Merged: devswha merged 1 commit into main from bot/rewrite-quality-corpus on Jun 15, 2026

devswha (Owner) commented Jun 15, 2026

Summary

Expands the rewrite-quality (live) fixture corpus from 1 → 11 fixtures per language so the quality:rewrite-ab and quality:live harnesses can measure rewrite-pipeline A/B (single vs ouroboros) at a meaningful sample size. No version bump — test data only, no schema/behavior change.

Adds 10 KO + 10 EN synthetic, redistributable (repo-ok) AI-sounding fixtures across registers: blog, academic-summary, product-doc, chat-update, technical-how-to, marketing, news, email, instructional, social. Each carries meaning anchors (numbers/entities that must survive a rewrite) for MPS/fidelity grading.
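To make the "meaning anchors" idea concrete, here is a minimal sketch of an anchor check. The `Fixture` field names and both helper functions are hypothetical, not taken from the repo. It also assumes anchors are matched as verbatim substrings; the actual MPS/fidelity grader may normalize text or match differently.

```typescript
// Hypothetical fixture shape; the real field names in the repo may differ.
interface Fixture {
  id: string;
  lang: "ko" | "en";
  register: string; // e.g. "blog", "email", "news"
  text: string; // the AI-sounding source text
  anchors: string[]; // numbers/entities that must survive a rewrite
}

// Return the anchors that do not appear verbatim in the given text.
// Assumption: plain substring matching, with no normalization.
function missingAnchors(anchors: string[], text: string): string[] {
  return anchors.filter((anchor) => text.indexOf(anchor) === -1);
}

// Fraction of a fixture's anchors preserved by a rewrite (1 means all survived).
function anchorSurvival(fixture: Fixture, rewritten: string): number {
  if (fixture.anchors.length === 0) return 1;
  const missing = missingAnchors(fixture.anchors, rewritten);
  return 1 - missing.length / fixture.anchors.length;
}
```

Calling `missingAnchors(fixture.anchors, fixture.text)` on the source text covers the "anchors present in their text" condition. Calling `anchorSurvival(fixture, rewritten)` on a pipeline's output gives a rough fidelity signal per fixture.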

Why

The earlier measurement attempt found the real blocker: tests/fixtures/live-quality/ko/ had only 1 fixture, so single-vs-ouroboros (and the multi-agent --strict question) couldn't be measured (n=1). This unblocks the measurement phase.

Verification

  • `npm test`: 797 pass / 0 fail (live-quality fixture-shape test green)
  • `npm run release:check`: OK for 5.4.0 (unchanged)
  • `npm run check:no-private-assets`: OK (332 packed; all fixtures repo-ok)
  • harness loads ko 11 / en 11; all 20 new fixtures have anchors present in their text

Next phase: run quality:rewrite-ab --configs single,ouroboros across the corpus with a backend, then decide from the data whether to keep or cut the multi-agent surface.
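As a rough illustration of why n=1 blocked the comparison, the sketch below summarizes paired per-fixture scores for the two configs. The `PairedScore` shape, the `summarizePaired` helper, and the idea of one numeric score per config are assumptions made for illustration. None of this is the actual output format of quality:rewrite-ab.

```typescript
// Hypothetical per-fixture result: one score per pipeline config on the same fixture.
interface PairedScore {
  fixtureId: string;
  single: number;
  ouroboros: number;
}

interface PairedSummary {
  n: number;
  meanDiff: number; // mean of (ouroboros - single); NaN when n is 0
  stdDev: number; // sample standard deviation of the differences; NaN when n < 2
  ouroborosWins: number;
  singleWins: number;
}

// Summarize the paired differences across the corpus.
function summarizePaired(scores: PairedScore[]): PairedSummary {
  const diffs = scores.map((s) => s.ouroboros - s.single);
  const n = diffs.length;
  const meanDiff = n === 0 ? NaN : diffs.reduce((a, b) => a + b, 0) / n;
  // With fewer than two fixtures there is no spread to estimate.
  const variance =
    n < 2 ? NaN : diffs.reduce((acc, d) => acc + (d - meanDiff) ** 2, 0) / (n - 1);
  return {
    n,
    meanDiff,
    stdDev: Math.sqrt(variance),
    ouroborosWins: diffs.filter((d) => d > 0).length,
    singleWins: diffs.filter((d) => d < 0).length,
  };
}
```

With a single fixture, `stdDev` is `NaN`, so there is no way to tell a real difference from noise. With 11 fixtures per language, the spread of the differences can at least be estimated.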

The live-quality corpus had only 1 fixture per language, too small to measure
rewrite-pipeline A/B (single vs ouroboros) meaningfully. Adds 10 KO + 10 EN
synthetic, redistributable (repo-ok) AI-sounding fixtures across registers
(blog, academic-summary, product-doc, chat-update, technical-how-to, marketing,
news, email, instructional, social), each with meaning anchors (numbers/entities
that must survive a rewrite) for MPS/fidelity grading. No version bump: test
data only, no schema/behavior change. Enables `npm run quality:rewrite-ab` and
`quality:live` to run at n>=10 per language.

devswha merged commit e4e43d1 into main on Jun 15, 2026
8 checks passed
devswha deleted the bot/rewrite-quality-corpus branch on Jun 15, 2026 at 11:33