`render_word_file` removes `<` and `&` from tag values

Which tool versions are you using?

SDK: v14.11.0
Platform: v24.06.6
Python: v3.10
Isolation mode: venv

Current Behavior

When using render_word_file(), all < and & characters in the values supplied to WordFileTag() are being removed.

I’m using this code

def download_test_word_file(self, **kwargs) -> DownloadResult:

    content = "A < B << C > D >> E & F &amp; G &lt; H &gt; I"

    components = [
        WordFileTag("test_123", content),
        # Testing some workarounds
        WordFileTag("test_456", content.replace("<", "\N{LESS-THAN SIGN}")),
        WordFileTag("test_789", content.replace("<", "\N{FULLWIDTH LESS-THAN SIGN}")),
    ]

    with open(Path(__file__).parent / "resources" / "simple.docx", "rb") as template:
        word_file = render_word_file(template, components)

    return DownloadResult(word_file, "dummy_test.docx")

The Word template looks like:

The result looks like:

Expected Behavior

I expected the < and & characters to show up in the Word output, just as the > character.

Context (optional, but preferred)

We’re generating a Word document that, among other things, includes names of machinery. They sometimes include the less-than of greater-than signs, like:

mobile crane < 80 tons

Hi @rkg ,

I haven’t tried it out myself, but maybe this could help you out?

I’ve tried several things, but none seems to be working.

In my code example I tried using &amp;, &lt; and &gt; as proposed in the Github issue. But as shown in the screenshots, they are completely removed (on the “test_123” row, there’s nothing between F, G, H and I).

Taking a look inside the word\document.xml file inside the .docx reveals

<w:t xml:space="preserve">A  B  C &gt; D &gt;&gt; E  F  G  H  I</w:t>

In the the python-docx-template documentation they mention escaping by means of {{ <var>|e }}. Unfortunately that doesn’t solve it either and shows up in Word like:
image
And the xml:

<w:t xml:space="preserve">A  B  C  D  E  F amp; G lt; H gt; I</w:t>

I also tried using the suggested R() in the value sent to WordFileTag(), but that result in a TypeError: Object of type RichText is not JSON serializable.

I’m a bit confused by there results :woozy_face:. Only > is correct when no escaping in the template is preformed.

In Python string Output of {{ <var> }} Output of {{ <var>|e }}
< nothing nothing
> > nothing
& nothing nothing
&lt; nothing lt;
&gt; nothing gt;
&amp; nothing amp;