I was reading through some code for pre-processing text data and came across these two regexes. I am struggling to figure out what they mean.

ReviewText = ReviewText.str.replace('(<a).*(>).*(</a>)', '')   
ReviewText = ReviewText.str.replace('(\xa0)', ' ')
user266290

1 Answer

Well, it looks like they are manipulating HTML with regular expressions. People generally frown on that, but since you are using this code rather than developing it, we'll ignore that issue for now.

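The first line would take something like:

<a href="https://www.w3schools.com">Visit W3Schools.com!</a>

and replace it with an empty string, removing the whole link, tag and link text included. Because .* is greedy, the match runs from the first <a to the last </a> in each string, so anything sitting between two links is removed too.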
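To see what the first pattern does, here is a minimal sketch using Python's re.sub instead of pandas. The sample strings and variable names are made up for illustration. With regex matching enabled, Series.str.replace should apply the pattern to each value in much the same way; recent pandas versions need regex=True passed explicitly for that, so treat that part as an assumption to check against your version.

```python
import re

# Same pattern as the first line in the question.
LINK_PATTERN = '(<a).*(>).*(</a>)'

# Hypothetical sample text built around the link shown above.
one_link = 'Great site: <a href="https://www.w3schools.com">Visit W3Schools.com!</a> Thanks.'
one_link_result = re.sub(LINK_PATTERN, '', one_link)
print(repr(one_link_result))  # 'Great site:  Thanks.'

# With two links in one string, the greedy .* spans from the first "<a"
# to the last "</a>", so the text between the links is removed as well.
two_links = '<a href="x">one</a> keep me <a href="y">two</a>'
two_links_result = re.sub(LINK_PATTERN, '', two_links)
print(repr(two_links_result))  # ''

# A non-greedy variant (not in the original code) removes each link separately.
non_greedy_result = re.sub('<a.*?>.*?</a>', '', two_links)
print(repr(non_greedy_result))  # ' keep me '
```

The parentheses in the original pattern only create capture groups. Since the replacement is an empty string, they do not change the result.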

The second one replaces every \xa0 character with an ordinary space. \xa0 is the non-breaking space (U+00A0), which is what &nbsp; in HTML turns into.
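A quick way to check the second replacement, again as a sketch with re.sub and a made-up sample string rather than the real review data:

```python
import re

# Hypothetical sample; \xa0 is the non-breaking space (U+00A0).
raw = 'Loved\xa0it,\xa0would\xa0buy\xa0again'

# Same pattern as the second line in the question.
cleaned = re.sub('(\xa0)', ' ', raw)
print(cleaned)  # Loved it, would buy again

# No regex features are really used here, so a plain string
# replacement gives the same result.
plain = raw.replace('\xa0', ' ')
print(plain == cleaned)  # True
```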

As the person above stated, you need both the regexp and some input to see what it actually does. Once you have both, I recommend experimenting with them in a regexp checker such as https://pythex.org/ (or an equivalent).

Frank Merrow