
I have an OOXML fragment (from a Word .docx file) that looks like this:

<w:tr>
    <w:tc>
        <w:p>
            <w:r>
                <w:t>~TABLE_xxx~</w:t>
            </w:r>
        </w:p>
    </w:tc>
</w:tr>
<w:tr>
    <w:tc>
        <w:p>
            <w:r>
                <w:t>~TABLE_</w:t>
            </w:r>
            <w:r w:rsidRPr="00FB4DC5">
                <w:t>xxx</w:t>
            </w:r>
            <w:r>
                <w:t>~</w:t>
            </w:r>
         </w:p>
     </w:tc>
</w:tr>

I want to find all elements whose descendants' text contains "~TABLE_xxx~".

I have tried the following:

//w:tr[descendant::text()[contains(., "~TABLE_xxx~")]]

However, this only matches the first <w:tr> in my document. My guess is that because the second one has its text split across several <w:r> elements (Word "runs" of text), I don't get a match.

What is the way around that?

kjhughes
Arnaud B

2 Answers

0

Is searching for just 'xxx' not an option?

//w:tr[descendant::text()[contains(., "xxx")]]
  • Nope, I simplified here, but "xxx" here might be found elsewhere in the doc, and it shouldn't match. – Arnaud B Apr 26 '18 at 00:58
0

Testing text nodes is the wrong approach here, especially with OOXML, which frequently splits strings across several w:r runs. Test string-values instead.

This XPath,

//w:tr[contains(.,"~TABLE_xxx~")]

will select all w:tr elements whose string-value contains the targeted string.
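To see the difference between the two tests, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the XPath itself: ElementTree's limited XPath subset has no contains() function, so the string-value is emulated by joining itertext(). The w:tbl wrapper and the helper names rows_by_text_node and rows_by_string_value are hypothetical, introduced only for this example. The rows are written without indentation on purpose, because any whitespace between the runs would become part of the string-value and break the match.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace bound to the "w" prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Row 1 holds the marker in a single run; row 2 splits it across three runs.
ROW_1 = "<w:tr><w:tc><w:p><w:r><w:t>~TABLE_xxx~</w:t></w:r></w:p></w:tc></w:tr>"
ROW_2 = (
    "<w:tr><w:tc><w:p>"
    "<w:r><w:t>~TABLE_</w:t></w:r>"
    '<w:r w:rsidRPr="00FB4DC5"><w:t>xxx</w:t></w:r>'
    "<w:r><w:t>~</w:t></w:r>"
    "</w:p></w:tc></w:tr>"
)
# Hypothetical wrapper element so the fragment has a single root.
XML = f'<w:tbl xmlns:w="{W}">{ROW_1}{ROW_2}</w:tbl>'

root = ET.fromstring(XML)
rows = root.findall(f"{{{W}}}tr")


def rows_by_text_node(rows, needle):
    """Like descendant::text()[contains(., needle)]: each text chunk is tested alone."""
    return [tr for tr in rows if any(needle in chunk for chunk in tr.itertext())]


def rows_by_string_value(rows, needle):
    """Like contains(., needle): all descendant text is concatenated first."""
    return [tr for tr in rows if needle in "".join(tr.itertext())]


print(len(rows_by_text_node(rows, "~TABLE_xxx~")))     # 1
print(len(rows_by_string_value(rows, "~TABLE_xxx~")))  # 2
```

With a full XPath 1.0 engine, the expression above can be evaluated directly once the w prefix is bound to the namespace URI shown in the sketch.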

See also: Testing text() nodes vs string values in XPath

kjhughes