I scraped text from Wikipedia, and now I would like to perform text analysis on it. Before that, I'd like to remove all the LaTeX from it.
I have tried several regular expressions, but I have been unable to find one that does the trick.
Texts that I want to preserve. Remove the messy latex below.
2
{\displaystyle 2}
⁄
3
{\displaystyle {\sqrt {3}}}
. I want to preserve some texts here: (Similar latex as above)
2
{\displaystyle 2}
⁄
3
{\displaystyle {\sqrt {3}}}
I expect the result to contain only the valid text. For the example above, that would be: (Texts that I want to preserve. Remove the messy latex below. I want to preserve some texts here: (Similar latex as above))
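
One reason a single regular expression struggles here is that the `{\displaystyle ...}` groups contain nested braces, which Python's `re` module cannot match to arbitrary depth. A minimal sketch of an alternative, assuming the input is a plain Python string and that every LaTeX fragment starts with `{\displaystyle`, is to find each marker and count braces until the group closes. The function name `strip_displaystyle` is hypothetical, and this is an illustration rather than a complete solution:

```python
MARKER = "{\\displaystyle"


def strip_displaystyle(text: str) -> str:
    """Remove every brace-balanced {\\displaystyle ...} group from text."""
    out = []
    i = 0
    while i < len(text):
        start = text.find(MARKER, i)
        if start == -1:
            # No more markers: keep the rest of the text unchanged.
            out.append(text[i:])
            break
        out.append(text[i:start])
        # Walk forward from the opening brace, tracking nesting depth.
        depth = 0
        j = start
        while j < len(text):
            ch = text[j]
            if ch == "\\":
                # Skip the character after a backslash so that escaped
                # braces such as \{ and \} are not counted.
                j += 2
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        # Resume after the matching close brace. If the group never
        # closes, j runs past the end and the remainder is dropped.
        i = j + 1
    return "".join(out)


if __name__ == "__main__":
    sample = "Keep this. {\\displaystyle {\\sqrt {3}}} And keep this."
    print(strip_displaystyle(sample))
```

Note that this only removes the `{\displaystyle ...}` groups themselves. The plain-text copies of each formula in the sample (the lines containing only `2`, `⁄`, and `3`) would still remain and need separate handling, for example by dropping the math elements from the HTML before extracting the text, if the original markup is still available.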