back to ansht's blogs
0694/10routine

EPUB asides marked with literal asterisks break audio-text alignment

context

Debugging why an audiobook/ebook sync app (Storyteller, etc.) fails to highlight certain passages while the narrator reads them

thoughts

Many published EPUBs mark inline asides and footnotes only visually, e.g. <p class="class_s3m">*The famed Walled Cities...</p>, which is a literal * character plus an italic CSS class, rather than with semantic markup like <aside epub:type="footnote"> or <a epub:type="noteref">. The two render identically but are a chasm apart semantically.

Audio-text alignment tools (Storyteller's n-gram + Levenshtein aligner, MediaOverlay/SMIL pipelines) and screen readers can only reorder content at the granularity of the markup signal. epub:type="footnote" triggers inlining of the footnote text into the parent paragraph during alignment, so audio order matches text order. Without it, the aligner treats the asterisked paragraph as an ordinary sibling and cannot reorder it. When the narrator reads the aside inline (which they almost always do for short asides), those audio chunks either misalign onto similar nearby sentences or fail to match entirely. The visible symptom is 'the highlight skips X words between the reference and the aside, then realigns after.'

Most EPUB readers display both forms the same way in regular reading, so the markup quality issue stays invisible until you try audio sync.
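A toy sketch of the ordering problem, not Storyteller's actual aligner: it builds the word sequence for a paragraph plus its aside in document order (aside as a sibling paragraph) and in narration order (aside read at the marker), then reports where they first diverge. The function names and the sample sentence are made up for illustration; only the "*The famed Walled Cities" opening comes from the example above.

```python
def reading_order(paragraph, marker, aside, inline):
    """Return the word sequence for a paragraph and its aside.

    inline=True splices the aside in at the marker (how a narrator
    typically reads a short aside); inline=False places it after the
    paragraph (document order when the aside is a sibling <p>).
    """
    before, _, after = paragraph.partition(marker)
    parts = [before, aside, after] if inline else [before, after, aside]
    return " ".join(parts).split()


def first_divergence(a, b):
    """Index of the first position where two word lists differ, or None."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))


# Hypothetical sample text, invented for this sketch.
paragraph = "They reached the Walled Cities* before the first snow fell."
aside = "The famed Walled Cities lie far to the north."

text_order = reading_order(paragraph, "*", aside, inline=False)
audio_order = reading_order(paragraph, "*", aside, inline=True)

print("text :", " ".join(text_order))
print("audio:", " ".join(audio_order))
print("first divergence at word", first_divergence(text_order, audio_order))
```

Both sequences contain the same words, but from the marker onward they are out of order relative to each other, which is the region where a sequential matcher would be expected to lose the highlight.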

next time

When debugging 'highlight skips a passage during readaloud,' first extract the chapter's XHTML and grep for epub:type="footnote" or epub:type="noteref" near the affected passage. If both are absent and the text uses a literal * or a similar visual marker, this is an EPUB markup limitation, not a bug in the reader: alignment can't fix what isn't there semantically.
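A minimal sketch of that check, assuming the EPUB is a plain zip of XHTML files and that visually marked asides start a paragraph with *, † or ‡ (that marker list is a guess; adjust it for the book). It uses only the Python standard library, and the names AsideScanner, scan and scan_epub are hypothetical helpers, not part of any reader's API.

```python
import zipfile
from html.parser import HTMLParser

SEMANTIC_TYPES = {"footnote", "noteref"}   # epub:type values to look for
VISUAL_MARKERS = ("*", "†", "‡")           # assumed visual-only markers


class AsideScanner(HTMLParser):
    """Collects semantic note markup and paragraphs that open with a marker."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.semantic = []   # (line, tag, epub:type value)
        self.visual = []     # (line, text preview)
        self._in_p = False
        self._p_line = 0
        self._p_text = []

    def handle_starttag(self, tag, attrs):
        # epub:type may hold several space-separated values.
        for value in (dict(attrs).get("epub:type") or "").split():
            if value in SEMANTIC_TYPES:
                self.semantic.append((self.getpos()[0], tag, value))
        if tag == "p":
            self._in_p = True
            self._p_line = self.getpos()[0]
            self._p_text = []

    def handle_data(self, data):
        if self._in_p:
            self._p_text.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            text = "".join(self._p_text).strip()
            if text.startswith(VISUAL_MARKERS):
                self.visual.append((self._p_line, text[:60]))
            self._in_p = False


def scan(xhtml):
    """Return (semantic_hits, visual_hits) for one XHTML document string."""
    scanner = AsideScanner()
    scanner.feed(xhtml)
    scanner.close()
    return scanner.semantic, scanner.visual


def scan_epub(path):
    """Scan every (X)HTML member of an EPUB; returns {member: (semantic, visual)}."""
    results = {}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                text = zf.read(name).decode("utf-8", errors="replace")
                results[name] = scan(text)
    return results
```

Calling scan_epub on the book and looking at the chapter in question gives a quick verdict: an empty semantic list alongside a non-empty visual list near the skipped passage points to the markup limitation described above rather than an aligner bug.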
