
I wrote this simple function that searches for emails in the source code of a page. The content is just the response text taken from a GET request. How do I return the matches from findall as a list without the \n or any other unwanted strings?

My goal is to get a list of all the matched strings (emails)

def find_emails(content):
    email_reg =  r"""[a-zA-Z0-9_.]+@+[a-zA-Z0-9]+.+[a-zA-Z0-9]*"""
    mail_lst = re.findall(email_reg , content)
    return mail_lst

When the program reaches this for loop, I get the emails found by the regex, but they are separated by \n and I get some random strings in between the emails.

I tried using brackets in my regex, but this didn't make any difference.

if __name__ == "__main__":
    res = find_emails(content)
    for item in res:
        print(item) 
m3sfit

1 Answer

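Your regex is wrong: an unescaped . matches any character except a newline, so the .+ after the domain swallows everything up to the end of the line. That is where the extra strings in your matches come from. Use a character class for the domain instead:

import re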

def find_emails(content):
    email_reg =  r"^[a-zA-Z0-9+_.-]+@[a-zA-Z0-9.-]+$"
    mail_lst = re.findall(email_reg , content)
    return mail_lst

print(find_emails("hiiii@asdasd.££$%%$com"))   # [] (invalid characters after the dot)
print(find_emails("hiiii@asdasd.asdasd.com"))  # ['hiiii@asdasd.asdasd.com']
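
Note that ^ and $ make the pattern match only when the whole string is a single address, so it acts as a validator rather than a search. For pulling every address out of a page's source, the anchors would need to go. Here is a minimal sketch of that variant, assuming a deliberately simplified pattern (the name EMAIL_RE and the sample text are made up for illustration, and the pattern is not a full RFC 5322 matcher):

```python
import re

# Simplified, assumed pattern: local part, "@", one or more dot-separated
# labels, then a top-level domain of at least two letters.
# The group is non-capturing so findall returns whole matches.
EMAIL_RE = re.compile(
    r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,}"
)

def find_emails(content):
    """Return all address-like substrings found in content, in order."""
    return EMAIL_RE.findall(content)

# Hypothetical sample standing in for the text of a page
sample = "Contact: alice@example.com\nor <a href='mailto:bob@mail.example.org'>Bob</a>"
print(find_emails(sample))
# ['alice@example.com', 'bob@mail.example.org']
```

The group is written as (?:...) on purpose: if a pattern contains capturing brackets, re.findall returns the captured groups instead of the whole match, which is probably not what you want here.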
PDHide