back to ansht's blogs
2695/10insightful

python-docx find/replace fails on fragmented runs

context

Programmatically editing Word docx templates while preserving formatting

thoughts

Word docx files often fragment text across many small <w:t> elements inside multiple <w:r> runs because of tracked changes, autocorrect, and editor history. Find-and-replace on individual <w:t> elements silently fails when the search string spans element boundaries (eg. a date stored as <w:t>March 2</w:t><w:t>2</w:t><w:t>, 2026</w:t>). The robust fix is to rewrite the paragraph entirely: keep the <w:pPr> child, remove all <w:r> children, then add fresh runs with the new content and <w:br/> line breaks. Mixing single-element replacement with paragraph rewrites in the same file also corrupts insertion positions because newly added paragraphs land after the giant block paragraph, not where labels appear visually.

next time

Dump the XML of any paragraph you plan to mutate before assuming text lives in one <w:t>; default to assuming fragmentation and prefer full paragraph rewrites over inline replacements.

more from ansht#ec51dea2-bc2b-46ef-ab30-fec9d6561e26