I have a document with many lines like this:
<tr><td width="10%">doc_no_320F0321</td><td width="5%">116</td><td> bla bla bla 1976, bla bla point (2) bla bla bla. </td><td> bla bla bla 1976, bla bla point (1) bla bla bla. </td></tr>
(Beautified, it would look like this:
<tr>
<td width="10%">doc_no_320F0321</td>
<td width="5%">116</td>
<td> bla bla bla 1976, bla bla point (2) bla bla bla. </td>
<td> bla bla bla 1976, bla bla point (1) bla bla bla. </td>
</tr>
)
What I need to do is check whether the digits in the third and fourth <td> are the same, ignoring all other characters.
To do this, I'm trying to highlight them with <mark> so that they are easier to see. I'm running this sed replace:
sed -i -r 's|(<td>.*?)([[:digit:]]+)(.*?<\/td>)|\1<mark>\2<\/mark>\3|g'
But it only surrounds the last digit in each row.
Can someone help me surround EVERY run of digits in the 3rd and 4th <td> cells?
Thanks.
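
For what it's worth, the likely reason only the last digit gets wrapped is that sed's POSIX extended regular expressions have no lazy quantifiers, so `.*?` behaves greedily and the first group runs up to the last digit in the row. Below is a minimal sketch of one possible workaround, assuming the 3rd and 4th cells are the only attribute-less `<td>` cells and contain no nested tags. `mark_digits` is a hypothetical helper name, not part of sed. It loops with `t`, wrapping one still-unmarked run of digits per pass until none remain.

```shell
# Hypothetical helper: wraps every run of digits found inside plain <td> cells
# (cells written exactly as "<td>", so "<td width=...>" cells are left alone).
# Assumption: those cells contain text only, no nested tags.
mark_digits() {
  sed -E \
    -e ':a' \
    -e 's#(<td>(([^<]|<mark>[0-9]+</mark>)*[^<0-9])?)([0-9]+)#\1<mark>\4</mark>#' \
    -e 'ta'
  # :a  defines a label
  # s   wraps one unmarked digit run; the prefix may step over text and
  #     over complete <mark>...</mark> elements, but never past </td>
  # ta  jumps back to :a for as long as the substitution keeps succeeding
}

# Usage: feed one sample row through the helper
printf '%s\n' '<tr><td width="10%">doc_no_320F0321</td><td width="5%">116</td><td> bla bla bla 1976, bla bla point (2) bla bla bla. </td><td> bla bla bla 1976, bla bla point (1) bla bla bla. </td></tr>' | mark_digits
```

Because the line is re-scanned after each substitution, already wrapped `<mark>` elements are skipped as whole units, so no digit is wrapped twice. To edit a file in place, the same three `-e` expressions could be passed to `sed -i -E` followed by the file name.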