
Below is my regular expression for matching URLs in text that contains regular prose, URLs, and email addresses. The problem is that it also picks up the domain part of email addresses. http://rubular.com/r/imoL2yQyrO

/(?:(?=[\s`!()\[\]{};:'".,<>?"'])|\b)((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9\-]+[.][a-z]{1,4}\/|[a-z0-9\-]+[.](?:[a-zA-Z]{2,4}))(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))*(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?"']|\b))/

Is there a way to filter out the entire email address?

My text

Neque porro quisquam est qui dolorem ipsum quia dolor sit amet, consectetur, adipisci vel http://someurl.com eque porro quisquam est qui dolorem ipsum quia dolor sit amet xyz@abc.com

Matches

http://someurl.com, abc.com

It should not match abc.com in xyz@abc.com

krunal shah

1 Answer


You can post-process each match and check it for an @ sign.

if ExtractedURLfromREGEX.include?('@')
  # String#index returns nil (not -1) when nothing is found, so use include? instead
  # do stuff with emails
end
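In the question's example the regex returns only `abc.com`, which does not contain the `@` itself. One possible variant of the post-processing idea is to look at the character just before each match in the original text. The sketch below is a rough illustration only: `URL_PATTERN` is a simplified stand-in for the question's regex, and `urls_without_email_domains` is a hypothetical helper name, not something from the original post.

```ruby
# Simplified stand-in for the question's pattern (an assumption, not the original regex).
URL_PATTERN = %r{(?:https?://|www\.)[^\s<>]+|[a-z0-9\-]+\.[a-z]{2,4}\b}i

# Hypothetical helper: returns the matches of `pattern` in `text`, skipping any
# match immediately preceded by "@" (the domain part of an email address).
def urls_without_email_domains(text, pattern = URL_PATTERN)
  urls = []
  text.scan(pattern) do
    match = Regexp.last_match
    start = match.begin(0)
    preceding = start.zero? ? nil : text[start - 1]
    urls << match[0] unless preceding == '@'
  end
  urls
end

text = 'adipisci vel http://someurl.com eque porro sit amet xyz@abc.com'
p urls_without_email_domains(text)
# => ["http://someurl.com"]
```

This only filters out the domain side of an address. With this simplified pattern, a dotted local part such as `john.doe` in `john.doe@abc.com` would still be matched, so a similar check on the character after the match may be needed.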
raam86