Each translation sentence (<s />) element has the same id value as the original sentence it translates; see id="t0b0d0p0s0" in the following example. This violates the XML specification, which requires id attribute values to be unique within a single document.
<p id="t0b0d0p0">
<s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0" ARPABET="T HH IY S" time="0.72" dur="0.25">This</w> <w id="t0b0d0p0s0w1" ARPABET="IY S" time="0.97" dur="0.14">is</w> <w id="t0b0d0p0s0w2" ARPABET="AA" time="1.11" dur="0.05">a</w> <w id="t0b0d0p0s0w3" ARPABET="T EY S T" time="1.16" dur="0.58">test</w>.</s>
<s do-not-align="true" id="t0b0d0p0s0" sentence-id="t0b0d0p0s0" class="sentence__translation editable__translation" xml:lang="eng">Ceci est un test.</s>
</p>
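The collision can be confirmed mechanically. The following is a minimal sketch, not part of the project: it assumes Python's standard xml.etree.ElementTree, uses a hypothetical helper named duplicate_ids, and trims the ARPABET, time and dur attributes from the example for brevity.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Trimmed copy of the example above (timing/ARPABET attributes omitted).
SAMPLE = """<p id="t0b0d0p0">
<s id="t0b0d0p0s0"><w id="t0b0d0p0s0w0">This</w> <w id="t0b0d0p0s0w1">is</w> <w id="t0b0d0p0s0w2">a</w> <w id="t0b0d0p0s0w3">test</w>.</s>
<s do-not-align="true" id="t0b0d0p0s0" sentence-id="t0b0d0p0s0" class="sentence__translation editable__translation" xml:lang="eng">Ceci est un test.</s>
</p>"""


def duplicate_ids(xml_text):
    """Return the sorted list of id values used by more than one element.

    Hypothetical helper: ElementTree does not validate id uniqueness,
    so the duplicates are counted by hand.
    """
    root = ET.fromstring(xml_text)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_ids(SAMPLE))
```

Run against the example, the only id reported is the one shared by the sentence and its translation.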
There was an earlier attempt to fix this issue, but other functionality now depends on the broken implementation. Any corrective action must therefore continue to support the "broken" implementation, since older readalong XML files will not be fixed.
Recommendations
- append the suffix trN to the original sentence's id to generate the translation's id, e.g. t0b0d0p0s0tr0. Current read alongs have a single translation; the trN suffix would support additional translations.
- use the sentence-id attribute to identify a sentence's translation
- maintain current implementations to support older read along files.
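The three recommendations could fit together roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: the function names are hypothetical, translations are recognised by the sentence__translation class seen in the example, and the legacy fallback assumes an older file may carry only the duplicated id.

```python
import xml.etree.ElementTree as ET


def translation_id(sentence_id, n=0):
    """Build the proposed id: original sentence id plus the trN suffix."""
    return f"{sentence_id}tr{n}"


def is_translation(s):
    # Assumption: translations carry the class shown in the example above.
    return "sentence__translation" in (s.get("class") or "").split()


def find_translations(root, sentence_id):
    """Return the translation <s> elements for one sentence.

    Matches on sentence-id first (recommended), and falls back to the
    duplicated id so that older, unfixed files keep working.
    """
    return [
        s
        for s in root.iter("s")
        if is_translation(s)
        and (s.get("sentence-id") == sentence_id or s.get("id") == sentence_id)
    ]


def assign_translation_ids(root):
    """Rewrite translation ids in place to <sentence id>trN."""
    counters = {}
    for s in root.iter("s"):
        if not is_translation(s):
            continue
        # Prefer sentence-id; a legacy element only has the duplicated id.
        original = s.get("sentence-id") or s.get("id")
        n = counters.get(original, 0)
        s.set("sentence-id", original)
        s.set("id", translation_id(original, n))
        counters[original] = n + 1
```

Because the lookup never relies on the translation's own id, the same find_translations call works on a file before and after assign_translation_ids has been applied.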