Add unit test for duplicate-paragraph collision/salt-loop in insertParagraphBookmarks #282

@stevenobiajulu

Problem

insertParagraphBookmarks derives each _bk_* identifier from sha12(paragraph text + previous-paragraph text + next-paragraph text + ancestor signature) via buildParagraphSeed and deriveDeterministicJrParaName in packages/docx-core/src/primitives/bookmarks.ts. When two paragraphs have identical text and identical context (same neighbors, same ancestor signature), their seeds collide. The implementation resolves the collision with a salt-loop at lines ~62-79:

while (attempt < 10_000) {
  const salt = attempt === 0 ? '' : `|salt:${attempt}`;
  const candidate = `_bk_${sha12(`${seed}${salt}`)}`;
  if (!params.usedNames.has(candidate)) {
    params.usedNames.add(candidate);
    return candidate;
  }
  attempt += 1;
}

So: the first colliding paragraph gets the unsalted hash, the second gets salt:1, the third salt:2, etc. Determinism across reopens is preserved because DOM-order iteration over <w:p> is stable.
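The ordering described above can be checked in isolation with a minimal sketch. It assumes sha12 is the first 12 hex characters of a SHA-256 digest and that the loop throws once attempts are exhausted. Neither detail is confirmed by this issue, and deriveName and the seed string are illustrative stand-ins, not the names or formats used in bookmarks.ts.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the repo's sha12: first 12 hex chars of SHA-256.
// The real hash in bookmarks.ts may differ.
function sha12(input: string): string {
  return createHash("sha256").update(input).digest("hex").slice(0, 12);
}

// Mirrors the quoted loop: try the unsalted seed first, then |salt:1, |salt:2, ...
function deriveName(seed: string, usedNames: Set<string>): string {
  for (let attempt = 0; attempt < 10_000; attempt += 1) {
    const salt = attempt === 0 ? "" : `|salt:${attempt}`;
    const candidate = `_bk_${sha12(`${seed}${salt}`)}`;
    if (!usedNames.has(candidate)) {
      usedNames.add(candidate);
      return candidate;
    }
  }
  // Assumed fallback; the issue does not show what happens after the loop.
  throw new Error("could not derive a unique bookmark name");
}

// Three paragraphs with the same (illustrative) seed, visited in document order.
const used = new Set<string>();
const seed = "same text|same prev|same next|same ancestors";
const first = deriveName(seed, used); // unsalted hash
const second = deriveName(seed, used); // hash of seed + "|salt:1"
const third = deriveName(seed, used); // hash of seed + "|salt:2"

console.log({ first, second, third });
```

Because the salt depends only on how many earlier paragraphs already claimed a candidate, the assignment is a pure function of the seeds and the visiting order.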

This load-bearing salt-loop has no unit test. The existing tests in packages/docx-core/test-primitives/bookmarks.test.ts and paragraph_id_stability.traceability.test.ts cover (a) IDs match the _bk_[0-9a-f]{12} pattern, (b) retrieval roundtrips, (c) stability across reopens of the SAME document. None of them constructs a document where two paragraphs would collide on seed.

Acceptance criteria

Add one or two BDD-style traceability tests (preferred location: paragraph_id_stability.traceability.test.ts, alongside the existing stability scenario) that cover the following:

  1. Two paragraphs colliding on seed get distinct _bk_* IDs. Build a 2-paragraph document where both <w:p> elements have identical inline text AND no prev/next neighbors (or the same neighbor context). After insertParagraphBookmarks, assert that:

    • both paragraphs have a _bk_* ID matching the canonical regex,
    • the two IDs are NOT equal to each other.
  2. The collision resolution is stable across reopens. Open the same XML body twice (two independent parseXml(...)), apply insertParagraphBookmarks to each, and assert that:

    • the second open's two IDs equal the first open's two IDs paragraph-by-paragraph (byte-identical),
    • i.e., the first paragraph in document order gets the unsalted hash both times, the second gets salt:1 both times.
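The two criteria can be prototyped against a stand-in before wiring them to makeDoc and insertParagraphBookmarks. In the sketch below, sha12 and assignIds are hypothetical models, not the repo's functions. It also assumes each independent open starts with an empty used-name set, which the issue does not confirm. The real tests would replace assignIds with a parse plus insertParagraphBookmarks and read the IDs back from each paragraph.

```typescript
import { strict as assert } from "node:assert";
import { createHash } from "node:crypto";

// Hypothetical stand-in hash: first 12 hex chars of SHA-256 (the repo's sha12 may differ).
const sha12 = (s: string): string =>
  createHash("sha256").update(s).digest("hex").slice(0, 12);

// Hypothetical model of one "open": assign IDs to paragraph seeds in document order.
// Assumption: every open starts with an empty used-name set.
function assignIds(seedsInDocOrder: string[]): string[] {
  const usedNames = new Set<string>();
  return seedsInDocOrder.map((seed) => {
    for (let attempt = 0; attempt < 10_000; attempt += 1) {
      const salted = attempt === 0 ? seed : `${seed}|salt:${attempt}`;
      const candidate = `_bk_${sha12(salted)}`;
      if (!usedNames.has(candidate)) {
        usedNames.add(candidate);
        return candidate;
      }
    }
    throw new Error("no unique name found"); // assumed fallback
  });
}

// Given: two paragraphs whose seeds are identical (illustrative seed format).
const seeds = ["Same text||", "Same text||"];

// When: the document is "opened" twice, independently.
const firstOpen = assignIds(seeds);
const secondOpen = assignIds(seeds);

// Then (criterion 1): both IDs are canonical and distinct.
for (const id of firstOpen) assert.match(id, /^_bk_[0-9a-f]{12}$/);
assert.notEqual(firstOpen[0], firstOpen[1]);

// Then (criterion 2): IDs match paragraph-by-paragraph across opens.
assert.deepEqual(secondOpen, firstOpen);
```

Comparing the full arrays in order, rather than comparing sets, is what pins the "first gets unsalted, second gets salt:1" property: a swap between the two paragraphs would fail the check.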

Use the existing makeDoc(bodyXml) helper. Follow the existing BDD/given/when/then style. Use test.openspec('<scenario name>') so the scenario lands on a traceability lane. Naming suggestions:

  • 'insertParagraphBookmarks resolves seed collisions with a deterministic salt'
  • 'Collision resolution is stable across independent reopens'

Files to read first

  • packages/docx-core/src/primitives/bookmarks.ts (lines 32-90 for the seed/derivation; lines 186-235 for the public insertParagraphBookmarks).
  • packages/docx-core/test-primitives/paragraph_id_stability.traceability.test.ts (template for BDD style + how test.openspec is used).
  • packages/docx-core/test-primitives/bookmarks.test.ts (template for the simpler makeDoc-based fixtures).
  • packages/docx-core/test-primitives/helpers/allure-test.ts (the BDD test factory).

Out of scope

  • Changing the salt-loop algorithm itself. The test characterizes existing behavior; if there's a desire to change "salt suffix" to "content nonce" or similar, that's a separate proposal.
  • Property-based / fuzzing variants. One BDD scenario per case is enough for the traceability lane.
  • Cross-package coverage. Stay inside packages/docx-core.
