43 changes: 43 additions & 0 deletions openspec/changes/add-libreoffice-accept-reject-oracle/proposal.md
@@ -0,0 +1,43 @@
# Change: Add a LibreOffice accept/reject oracle voter to the Lean↔TS differential harness

## Why

The Lean↔TS helper differential (`add-lean-ts-helper-differential-harness`) validates that the genuine Lean
model and the production TS engine *agree*, but it has no **independent ground truth** — both could be wrong
the same way. The paragraph-collapse cases the harness pins (G3/G4/G5, closed by
`broaden-lean-accept-keep-empty-paragraphs`, `make-reject-paragraph-collapse-mark-based`, and
`make-accept-paragraph-collapse-mark-based`) rest on a claim about how a real word processor behaves: an
**untracked paragraph mark is kept** (as an empty `<w:p>`) on accept/reject, while a **PPR-INS/PPR-DEL mark is
dropped**. That claim was confirmed once, manually, against LibreOffice and recorded in memory; it was never a
committed, reproducible check.

LibreOffice is the native engine for the `.uno:AcceptAllTrackedChanges` / `.uno:RejectAllTrackedChanges`
dispatches, so its paragraph-structure output is authoritative for the mark-based rule. Wiring it in as a
**third voter** makes the accept/reject claims oracle-backed, not just Lean↔TS self-consistent, and turns the
throwaway `.tmp` oracle script into a committed, on-demand developer test.

## What Changes

- Add a committed helper `packages/docx-core/src/integration/libreoffice-oracle.ts`: `resolveSoffice()`
(binary discovery, `SAFE_DOCX_SOFFICE_BIN` override), `packMinimalDocx` / `extractDocumentXml`,
`runLibreOfficeOracle` (drives LibreOffice headless via an injected Basic macro in a throwaway profile —
pyuno is blocked on macOS by Launch Constraints — batching all jobs in one launch), and `paragraphShape`
(the structural projection).
- Add a gated oracle voter to `lean-differential-helpers.test.ts` (`[LEAN-HELP-09..11]`) asserting LibreOffice
agrees with the TS engine on **paragraph structure** for the pinned fixtures: kept-not-dropped on G3/G4/G5,
full empty-collapse structure on the clean single-level G4/G5, and a PPR-marked **drop** control.
- The comparison is structural (paragraph count + which paragraphs collapsed to empty), NOT the full token
projection: LibreOffice rewrites styles, and on the contrived nested G3 fixture (`w:ins` wrapping `w:del`)
it keeps the inserted-then-deleted text on accept where Lean/TS collapse to empty. The paragraph *count*
still agrees (the kept-not-dropped claim); that content divergence is **pinned** in `[LEAN-HELP-09]`.
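
The structural projection described above can be sketched roughly as follows. This is a minimal illustration, not the committed `paragraphShape`: the name `paragraphShapeSketch` and the regex-based parsing are assumptions made here for brevity, and it does not handle nested paragraphs (for example inside text boxes). It maps each `<w:p>` to `true` when the paragraph carries non-empty `<w:t>` text and to `false` when it collapsed to empty.

```typescript
// Hypothetical sketch of a structural projection; the real `paragraphShape` may differ.
// Returns one boolean per paragraph: true when it carries non-empty <w:t> text.
function paragraphShapeSketch(documentXml: string): boolean[] {
  // Matches a self-closing <w:p/> or a full <w:p ...>...</w:p>.
  // The `(?:\s[^>]*)?` part keeps <w:pPr> from being mistaken for a paragraph.
  const paragraph = /<w:p(?:\s[^>]*)?\/>|<w:p(?:\s[^>]*)?>([\s\S]*?)<\/w:p>/g;
  // Visible text lives in <w:t>; deleted text uses <w:delText> and is not counted.
  const visibleText = /<w:t(?:\s[^>]*)?>[^<]+<\/w:t>/;
  const shape: boolean[] = [];
  let m: RegExpExecArray | null;
  while ((m = paragraph.exec(documentXml)) !== null) {
    const inner = m[1] ?? ''; // undefined for the self-closing form
    shape.push(visibleText.test(inner));
  }
  return shape;
}
```

Under this projection, an empty paragraph followed by a survivor reads as `[false, true]`, and a dropped paragraph simply removes an entry, which is why comparing the arrays checks both the paragraph count and which paragraphs collapsed.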

## Impact

- Affected specs: `docx-comparison` (ADDED: one requirement + `[LEAN-HELP-09..11]`).
- Affected code: new `packages/docx-core/src/integration/libreoffice-oracle.ts`;
`packages/docx-core/src/integration/lean-differential-helpers.test.ts` (oracle describe block + imports).
- **No production-engine change**; this strengthens the differential's evidence only.
- **Local-only**: gated on a LibreOffice binary via `resolveSoffice()`. CI does not install LibreOffice, so the
voter skips cleanly there (exactly like `odf-core`'s LibreOffice round-trip test); it runs for any developer
who has LibreOffice installed. The mechanism (Basic-macro injection, `macro:///` invocation after a
profile-init convert) follows the `reference_libreoffice_macos_oracle` recipe.
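
A rough sketch of the gating lookup, assuming the override takes precedence over a fixed candidate list. The candidate paths, the `resolveSofficeSketch` name, the injectable `env`/`exists` parameters, and the choice to return `null` when the override points at a missing file are all assumptions for illustration; only the `SAFE_DOCX_SOFFICE_BIN` variable name comes from this change.

```typescript
import { existsSync } from 'node:fs';

// Hypothetical candidate locations; the real helper's search list is not shown in this change.
const CANDIDATE_PATHS: string[] = [
  '/Applications/LibreOffice.app/Contents/MacOS/soffice',
  '/usr/bin/soffice',
  '/usr/local/bin/soffice',
];

// Returns the path of a LibreOffice binary, or null when none is found.
// `env` and `exists` are injectable so the lookup can be exercised without touching the machine.
function resolveSofficeSketch(
  env: Record<string, string | undefined> = process.env,
  exists: (path: string) => boolean = existsSync,
): string | null {
  const override = env['SAFE_DOCX_SOFFICE_BIN'];
  if (override) {
    // Assumption: an explicit override is honored exclusively, with no fallback to the list.
    return exists(override) ? override : null;
  }
  for (const candidate of CANDIDATE_PATHS) {
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

A `null` result is what would let a test file choose `describe.skip` over `describe`, so machines without LibreOffice skip the voter instead of failing.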
@@ -0,0 +1,43 @@
## ADDED Requirements

### Requirement: The differential harness validates accept/reject paragraph collapse against a LibreOffice oracle

The Lean↔TS helper differential SHALL validate its pinned accept/reject paragraph-collapse cases against
**LibreOffice** as an independent reference implementation, so the paragraph-collapse claims are oracle-backed
ground truth rather than only Lean↔TS self-consistency. The harness SHALL drive LibreOffice headless through
the native `.uno:AcceptAllTrackedChanges` / `.uno:RejectAllTrackedChanges` dispatches (via an injected Basic
macro, since pyuno is blocked on macOS), batching all pinned cases through a single launch.

The oracle comparison SHALL be **structural** — the number of body paragraphs and which paragraphs collapsed
to empty (carry no visible text) — NOT the full revision/formatting token projection, because LibreOffice
rewrites styles and run properties, and on a contrived nested revision (`w:ins` wrapping `w:del`) it interprets
the change differently from the Lean/TS model (it keeps the inserted-then-deleted text on accept where Lean/TS
collapse to empty). The harness SHALL assert the claim the oracle is authoritative for — that an UNTRACKED
paragraph mark is kept and a `PPR-INS`/`PPR-DEL` mark is dropped — and SHALL pin, rather than hide, the
nested-revision content divergence so a change in LibreOffice's behavior is detected.
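
As an illustration of the mark-based rule stated above, the following sketch classifies a paragraph's mark by looking for `w:ins` / `w:del` inside `w:pPr/w:rPr`, then decides whether the paragraph is dropped. The names `paragraphMarkKind` and `markDropsParagraph` are hypothetical, the regex parsing is a simplification, and the rule is only the one this requirement describes for the pinned fixtures, not a full model of either engine.

```typescript
type MarkKind = 'PPR-INS' | 'PPR-DEL' | 'untracked';
type Op = 'accept' | 'reject';

// Hypothetical helper: inspects only <w:pPr><w:rPr>...</w:rPr></w:pPr> of one paragraph.
function paragraphMarkKind(paragraphXml: string): MarkKind {
  const pPr = /<w:pPr(?:\s[^>]*)?>([\s\S]*?)<\/w:pPr>/.exec(paragraphXml);
  if (!pPr) return 'untracked';
  const rPr = /<w:rPr(?:\s[^>]*)?>([\s\S]*?)<\/w:rPr>/.exec(pPr[1] ?? '');
  if (!rPr) return 'untracked';
  const inner = rPr[1] ?? '';
  // `<w:del` followed by whitespace, `/`, or `>` only, so <w:delText> never matches.
  if (/<w:ins(?:\s[^>]*)?\/?>/.test(inner)) return 'PPR-INS';
  if (/<w:del(?:\s[^>]*)?\/?>/.test(inner)) return 'PPR-DEL';
  return 'untracked';
}

// The rule as stated: a PPR-INS mark is dropped on reject, a PPR-DEL mark on accept;
// an untracked mark is always kept (possibly as an empty paragraph).
function markDropsParagraph(kind: MarkKind, op: Op): boolean {
  return (kind === 'PPR-INS' && op === 'reject') || (kind === 'PPR-DEL' && op === 'accept');
}
```

For example, a paragraph whose runs are wrapped in `w:ins` but whose `w:pPr` carries no revision classifies as `untracked`, so rejecting it would leave an empty paragraph rather than removing it.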

The oracle voter SHALL be gated on the presence of a LibreOffice binary (`resolveSoffice()`, with a
`SAFE_DOCX_SOFFICE_BIN` override) and SHALL skip cleanly with a clear message when it is absent — CI does not
install LibreOffice, so the voter is a local developer check. It SHALL ALSO skip cleanly when LibreOffice is
present but cannot launch (for example a sandboxed shell on macOS, where `soffice` aborts before doing any
work): the harness catches the launch failure, logs why, and no-ops the assertions rather than failing, since
the oracle is best-effort ground truth — it runs fully only where a working LibreOffice can be driven. This
requirement adds reference-implementation evidence only; it introduces no production-engine change.

#### Scenario: [LEAN-HELP-09] LibreOffice keeps an untracked-mark paragraph (kept-not-dropped), matching the TS engine

- **GIVEN** the pinned G3 (accept), G4 (reject), and G5 (accept) fixtures, each an untracked-mark paragraph whose body collapses to empty, followed by a surviving paragraph
- **WHEN** each is run through LibreOffice and through the TS engine
- **THEN** LibreOffice and the TS engine keep the same number of paragraphs (the untracked-mark paragraph is kept, not dropped); and the contrived nested-revision G3 content divergence (LibreOffice keeps the inserted-then-deleted text on accept while the TS engine collapses to empty) is asserted explicitly as a characterized difference

#### Scenario: [LEAN-HELP-10] LibreOffice and the TS engine agree on full paragraph structure for the clean single-level fixtures

- **GIVEN** the clean single-level fixtures — G4 (an `ins`-only paragraph, reject) and G5 (a `del`-only paragraph, accept)
- **WHEN** each is run through LibreOffice and through the TS engine
- **THEN** the resulting paragraph structure is identical in both — the collapsed paragraph is kept as an empty `<w:p>` and the survivor remains — confirming the mark-based collapse against the reference implementation

#### Scenario: [LEAN-HELP-11] LibreOffice drops a PPR-marked paragraph, matching the TS engine

- **GIVEN** a paragraph whose mark is `PPR-INS` (reject side) and a paragraph whose mark is `PPR-DEL` (accept side), each followed by a surviving paragraph
- **WHEN** each is run through LibreOffice and through the TS engine
- **THEN** both LibreOffice and the TS engine remove the marked paragraph, leaving only the survivor — confirming the other direction of the mark-based rule against the reference implementation
30 changes: 30 additions & 0 deletions openspec/changes/add-libreoffice-accept-reject-oracle/tasks.md
@@ -0,0 +1,30 @@
## 1. Oracle helper (committed)

- [x] 1.1 Add `packages/docx-core/src/integration/libreoffice-oracle.ts`: `resolveSoffice()`,
`packMinimalDocx` / `extractDocumentXml` (reuse `primitives/zip.ts`), `runLibreOfficeOracle`
(macro-injection driver, one launch per batch), `paragraphShape` (structural projection).
- [x] 1.2 Follow the `reference_libreoffice_macos_oracle` recipe: write `registrymodifications.xcu`
(MacroSecurityLevel 0), init the profile via a throwaway `--convert-to`, THEN overwrite
`Module1.xba`, THEN invoke `macro:///Standard.Module1.RunOracle`; verify via a marker file.
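
The ordering in 1.2 matters, so here is a sketch that only builds the step plan without launching anything. Everything beyond the macro URL and the file names mentioned above is an assumption: the `planOracleLaunch` name, the `user/...` profile layout, the `oracle.done` marker name, the `pdf` conversion target, and the use of `--headless`, `--norestore`, and `-env:UserInstallation` may not match what the committed helper does.

```typescript
type OracleStep =
  | { kind: 'write'; path: string; note: string }
  | { kind: 'run'; argv: string[]; note: string }
  | { kind: 'verify'; path: string; note: string };

// Hypothetical planner: returns the recipe steps in the order they would have to run.
function planOracleLaunch(
  soffice: string,
  profileDir: string, // absolute path of a throwaway profile directory
  seedDoc: string,    // any small document used for the profile-init convert
  outDir: string,
): OracleStep[] {
  const common = ['--headless', '--norestore', `-env:UserInstallation=file://${profileDir}`];
  return [
    { kind: 'write', path: `${profileDir}/user/registrymodifications.xcu`, note: 'set MacroSecurityLevel 0' },
    {
      kind: 'run',
      argv: [soffice, ...common, '--convert-to', 'pdf', '--outdir', outDir, seedDoc],
      note: 'throwaway convert so LibreOffice initializes the profile',
    },
    // Written only AFTER the init run, per the recipe's ordering.
    { kind: 'write', path: `${profileDir}/user/basic/Standard/Module1.xba`, note: 'inject the Basic macro' },
    { kind: 'run', argv: [soffice, ...common, 'macro:///Standard.Module1.RunOracle'], note: 'run the whole batch in one launch' },
    { kind: 'verify', path: `${outDir}/oracle.done`, note: 'marker file proves the macro actually ran' },
  ];
}
```

Keeping the plan as data makes the write-init-overwrite-invoke order easy to assert in isolation, separately from whether LibreOffice can launch on a given machine.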

## 2. Oracle voter (gated)

- [x] 2.1 Add a `describeOracle = resolveSoffice() ? describe : describe.skip` block to
`lean-differential-helpers.test.ts`; one `beforeAll` drives the whole batch through LibreOffice.
- [x] 2.2 `[LEAN-HELP-09]` kept-not-dropped (G3/G4/G5 paragraph count matches TS); pin the G3 nested-revision
content divergence (LibreOffice keeps the text) rather than hide it.
- [x] 2.3 `[LEAN-HELP-10]` full structural agreement on the clean single-level fixtures (G4 reject, G5 accept).
- [x] 2.4 `[LEAN-HELP-11]` PPR-marked drop control (PPR-INS reject, PPR-DEL accept) — LibreOffice drops, matching TS.

## 3. Verification

- [x] 3.1 `npm test -w @usejunior/docx-core -- lean-differential-helpers` green with the oracle voter running
against a real LibreOffice (11 tests); `tsc --noEmit` clean.
- [x] 3.2 Full `@usejunior/docx-core` suite green (1350 passed / 3 skipped); voter skips cleanly when soffice is absent.

## 4. Specs / docs

- [x] 4.1 Add the `docx-comparison` ADDED requirement + scenarios `[LEAN-HELP-09..11]`.
- [x] 4.2 `verification/ROADMAP.md`: record the oracle voter landed (accept/reject is now oracle-backed for the
pinned cases); note it is a local-only check.
- [ ] 4.3 Ship: peer-review (codex + agy), open PR, `/automerge-smoke`. Update memory (committed oracle helper).
139 changes: 138 additions & 1 deletion packages/docx-core/src/integration/lean-differential-helpers.test.ts
@@ -60,11 +60,17 @@ import { spawnSync } from 'node:child_process';
import { existsSync } from 'node:fs';
import { dirname, join } from 'node:path';
import fc from 'fast-check';
import { describe, expect } from 'vitest';
import { beforeAll, describe, expect } from 'vitest';
import { acceptAllChanges, rejectAllChanges } from '../baselines/atomizer/trackChangesAcceptorAst.js';
import { validateFieldStructure } from '../baselines/atomizer/pipeline.js';
import { parseDocumentXml } from '../baselines/atomizer/xmlToWmlElement.js';
import { testAllure, type AllureBddContext } from '../testing/allure-test.js';
import {
  resolveSoffice,
  runLibreOfficeOracle,
  paragraphShape,
  type OracleJob,
} from './libreoffice-oracle.js';

// Named const (not an inline literal) so `scripts/validate_allure_test_labels.mjs` can
// map the `.openspec([LEAN-HELP-*])` tags deterministically to a feature.
@@ -703,3 +709,134 @@ describeMaybe('Lean Differential Harness - Tier 2 helper extensional equivalence
},
);
});

// ---------------------------------------------------------------------------
// LibreOffice accept/reject oracle voter (PR-B).
//
// LibreOffice is the native engine for the `.uno:AcceptAllTrackedChanges` /
// `.uno:RejectAllTrackedChanges` dispatches, so its output is authoritative ground truth for the
// mark-based paragraph-collapse rule the G3/G4/G5 cases are about: an UNTRACKED paragraph mark is
// kept (as an empty <w:p>) on accept/reject, while a PPR-INS / PPR-DEL mark drops the whole
// paragraph. This makes the accept/reject claims oracle-backed, not just Lean↔TS self-consistent.
//
// The comparison is deliberately STRUCTURAL — paragraph count, plus which paragraphs collapsed to
// empty — not the full token projection. LibreOffice rewrites styles/run-properties, and on the
// contrived nested G3 fixture (a `w:ins` wrapping a `w:del`) it interprets the revision differently
// from Lean/TS: on accept it KEEPS the inserted-then-deleted text, where Lean/TS collapse to empty.
// The paragraph *count* still agrees (the kept-not-dropped claim — what the oracle settles); that
// content divergence is pinned in [LEAN-HELP-09], not hidden. The clean single-level fixtures
// (G4 reject, G5 accept) agree on the full structure ([LEAN-HELP-10]).
//
// Gated on a LibreOffice binary; CI does not install one, so this is a local developer check (it
// skips cleanly, like odf-core's LibreOffice round-trip test). Set SAFE_DOCX_SOFFICE_BIN to point
// at a binary in a non-standard location.
const soffice = resolveSoffice();
const describeOracle = soffice ? describe : describe.skip;
if (!soffice) {
  // eslint-disable-next-line no-console
  console.warn(
    '[lean-differential-helpers] oracle SKIP: no LibreOffice (soffice) binary found. ' +
      'Install LibreOffice or set SAFE_DOCX_SOFFICE_BIN to run the accept/reject oracle voter.',
  );
}

describeOracle('LibreOffice accept/reject oracle — paragraph-collapse ground truth', () => {
  const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
  const rawDoc = (inner: string): string =>
    `<?xml version="1.0"?><w:document xmlns:w="${W_NS}"><w:body>${inner}</w:body></w:document>`;
  const mark = (id: number): string => `w:id="${id}" w:author="oracle" w:date="2024-01-01T00:00:00Z"`;
  const KEEP = '<w:p><w:r><w:t>keep</w:t></w:r></w:p>';

  // The three differential fixtures (rendered exactly as the TS engine receives them) plus two
  // PPR-marked drop controls — the other direction of the mark-based rule.
  const PPR_INS_REJECT = rawDoc(
    `<w:p><w:pPr><w:rPr><w:ins ${mark(50)}/></w:rPr></w:pPr>` +
      `<w:ins ${mark(51)}><w:r><w:t>inserted line</w:t></w:r></w:ins></w:p>` + KEEP,
  );
  const PPR_DEL_ACCEPT = rawDoc(
    `<w:p><w:pPr><w:rPr><w:del ${mark(52)}/></w:rPr></w:pPr><w:r><w:t>deleted line</w:t></w:r></w:p>` + KEEP,
  );
  const CASES: { name: string; op: 'accept' | 'reject'; xml: string }[] = [
    { name: 'G3', op: 'accept', xml: renderDocToXml(G3_DOC) },
    { name: 'G4', op: 'reject', xml: renderDocToXml(G4_DOC) },
    { name: 'G5', op: 'accept', xml: renderDocToXml(G5_DOC) },
    { name: 'PPR_INS_REJECT', op: 'reject', xml: PPR_INS_REJECT },
    { name: 'PPR_DEL_ACCEPT', op: 'accept', xml: PPR_DEL_ACCEPT },
  ];

  // ONE headless LibreOffice launch drives the whole batch; project both engines to paragraph shape.
  const tsShape: Record<string, boolean[]> = {};
  const loShape: Record<string, boolean[]> = {};
  // `resolveSoffice()` only checks that a binary EXISTS, not that it can LAUNCH. In some
  // environments LibreOffice is installed but cannot start — most notably a sandboxed shell on
  // macOS, where `soffice` aborts (SIGABRT) before doing any work. When that happens the oracle
  // can't produce ground truth, so we record a skip reason and the assertions no-op cleanly rather
  // than fail. The oracle is a best-effort local check; a real terminal with a working LibreOffice
  // runs it fully.
  let oracleSkip = '';
  beforeAll(async () => {
    try {
      const out = await runLibreOfficeOracle(
        CASES.map((c): OracleJob => ({ op: c.op, documentXml: c.xml })),
        soffice,
      );
      CASES.forEach((c, i) => {
        tsShape[c.name] = paragraphShape(c.op === 'accept' ? acceptAllChanges(c.xml) : rejectAllChanges(c.xml));
        loShape[c.name] = paragraphShape(out[i]!);
      });
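    } catch (err) {
      // A thrown value is not guaranteed to be an Error; fall back to String() so building the
      // skip reason can never throw inside the catch block itself.
      const reason = err instanceof Error ? err.message.split('\n')[0] : String(err);
      oracleSkip = `LibreOffice present but could not run in this environment — skipping oracle assertions. (${reason})`;
      // eslint-disable-next-line no-console
      console.warn('[lean-differential-helpers] ' + oracleSkip);
    }
  }, 120_000);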

  test.openspec('[LEAN-HELP-09] LibreOffice keeps an untracked-mark paragraph (kept-not-dropped), matching the TS engine on G3/G4/G5')(
    'every pinned untracked-mark fixture survives as two paragraphs in both LibreOffice and the TS engine',
    async ({ then }: AllureBddContext) => {
      await then('LibreOffice and TS agree on paragraph count (the paragraph is kept, not dropped)', async () => {
        if (oracleSkip) return;
        for (const name of ['G3', 'G4', 'G5']) {
          expect(loShape[name]!.length, `${name}: LibreOffice paragraph count`).toBe(2);
          expect(tsShape[name]!.length, `${name}: TS paragraph count`).toBe(2);
        }
        // Pinned divergence (characterized, not hidden): on the contrived nested G3 fixture
        // (ins wrapping del), LibreOffice KEEPS the inserted-then-deleted text on accept while
        // Lean/TS collapse to empty. Only the kept-not-dropped count is oracle-asserted; the
        // content difference is recorded here so a change in LibreOffice's behavior is noticed.
        expect(loShape['G3'], 'G3: LibreOffice keeps the nested-revision text').toEqual([true, true]);
        expect(tsShape['G3'], 'G3: TS collapses the nested revision to empty').toEqual([false, true]);
      });
    },
  );

  test.openspec('[LEAN-HELP-10] LibreOffice and the TS engine agree on full paragraph structure for the clean single-level fixtures (G4 reject, G5 accept)')(
    'the collapsed paragraph is kept empty in both LibreOffice and the TS engine',
    async ({ then }: AllureBddContext) => {
      await then('LibreOffice structure equals the TS structure: an empty first paragraph then the survivor', async () => {
        if (oracleSkip) return;
        for (const name of ['G4', 'G5']) {
          expect(loShape[name], `${name}: LibreOffice paragraph shape`).toEqual([false, true]);
          expect(tsShape[name], `${name}: TS paragraph shape`).toEqual([false, true]);
          expect(loShape[name]).toEqual(tsShape[name]);
        }
      });
    },
  );

  test.openspec('[LEAN-HELP-11] LibreOffice drops a PPR-marked paragraph (mark-based rule), matching the TS engine')(
    'a PPR-INS paragraph on reject and a PPR-DEL paragraph on accept are removed by both LibreOffice and the TS engine',
    async ({ then }: AllureBddContext) => {
      await then('only the survivor paragraph (with text) remains in both engines', async () => {
        if (oracleSkip) return;
        for (const name of ['PPR_INS_REJECT', 'PPR_DEL_ACCEPT']) {
          // Exactly one paragraph survives, and it is the text-bearing "keep" paragraph — not an
          // empty leftover. Asserting the full shape [true] (not just length 1) prevents a vacuous
          // pass where both engines happened to leave a single empty paragraph.
          expect(loShape[name], `${name}: LibreOffice shape after drop`).toEqual([true]);
          expect(tsShape[name], `${name}: TS shape after drop`).toEqual([true]);
        }
      });
    },
  );
});