(from Python check for valid email address?)
I don't completely understand
[^@]+@[^@]+\.[^@]+
Can someone explain this in detail?
It looks for 1+ non-`@` characters, followed by an `@`, followed by 1+ non-`@` characters, followed by a `.`, followed by 1+ non-`@` characters.
`[]`s denote a character class, and the `^` negates the character class. `+` matches one or more of the preceding element (here, the character class). Finally, the `.` is escaped as `\.` because an unescaped `.` is a metacharacter meaning "any character" (by default, any character except a newline).
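To see the pieces working together, here is a minimal sketch that runs the pattern against a few strings. The helper name `looks_like_email` and the sample strings are made up for illustration, and using `re.fullmatch` (rather than `re.match`) is my own choice so that the whole string has to fit the pattern.

```python
import re

# The pattern from the question, compiled once.
PATTERN = re.compile(r"[^@]+@[^@]+\.[^@]+")


def looks_like_email(text: str) -> bool:
    """Hypothetical helper: True if the entire string fits the pattern."""
    # fullmatch requires the pattern to cover the string from start to end.
    return PATTERN.fullmatch(text) is not None


# Illustrative inputs (not from the original question).
samples = ["user@example.com", "user@example", "a@b@c.com", "@example.com"]
results = {s: looks_like_email(s) for s in samples}

for sample, ok in results.items():
    print(f"{sample!r}: {ok}")
```

Only the first sample passes: the second has no dot after the `@`, the third contains a second `@`, and the fourth has nothing before the `@`.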
Because the pattern is this permissive, it isn't the best method for checking emails: real addresses are subject to many more restrictions. For example, it would accept a 10,000-character address, or an address like `!@#.com`.
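A quick way to convince yourself of how permissive the pattern is: the sketch below feeds it a few strings that nobody would call valid addresses. The variable names and the third sample are my own, chosen only to illustrate the point.

```python
import re

PATTERN = re.compile(r"[^@]+@[^@]+\.[^@]+")

# Illustrative inputs: all of them fit the pattern.
long_address = "a" * 10_000 + "@example.com"   # absurdly long local part
odd_address = "!@#.com"                        # punctuation on both sides of the @
with_spaces = "not an email@ nowhere . at all" # spaces are non-@ characters too

accepted = [
    PATTERN.fullmatch(candidate) is not None
    for candidate in (long_address, odd_address, with_spaces)
]
print(accepted)
```

Every entry in `accepted` comes out `True`, because the only things the pattern actually checks are a single `@` and at least one dot somewhere after it.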
Get used to using a tool like Regex101 for testing expressions and getting good descriptions.
`[^@]+` checks for anything that is not the `@` symbol, one or more times.
`@` searches for the `@` symbol, clearly.
`\.` searches for the `.` character (it must be escaped, since an unescaped `.` matches any character).
So it looks for any non-empty string not containing `@`, followed by `@`, followed by any non-empty string not containing `@`, followed by `.`, followed by any non-empty string not containing `@`.
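If it helps, the same breakdown can be written into the regex itself with Python's `re.VERBOSE` flag, which lets you put comments next to each piece. This is only a sketch: the named groups (`before`, `middle`, `after`) are my addition so you can see which part of the input each piece consumed, and they are not in the original pattern.

```python
import re

# Same pattern, spread out with comments. The named groups are added
# for illustration only; they do not change what the pattern accepts.
ANNOTATED = re.compile(
    r"""
    (?P<before>[^@]+)   # one or more characters that are not "@"
    @                   # a literal "@"
    (?P<middle>[^@]+)   # one or more characters that are not "@"
    \.                  # a literal dot
    (?P<after>[^@]+)    # one or more characters that are not "@"
    """,
    re.VERBOSE,
)

match = ANNOTATED.fullmatch("user@mail.example.com")
parts = match.groupdict() if match else {}
print(parts)
```

Note that the middle piece is greedy, so for an input with several dots after the `@` it grabs as much as it can and the `\.` ends up matching the last dot.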
A proper validator for the RFC822 address specification (section "6. ADDRESS SPECIFICATION" on page 27) is a bit more complex than a small regex.
To do this properly, a grammar would be needed (like the one described in said RFC), but a regex works too. Such a regex can be found in the Email::Valid module, more exactly right here. I haven't tried that regex in Python (but it works fine in Perl).
AFAIK that's the de facto way of checking whether an e-mail address is RFC 822-valid. Also see this SO post for more details.
But to answer your question now, the regex `[^@]+@[^@]+\.[^@]+` reads as "one or more non-`@` characters, then an `@`, then one or more non-`@` characters, then a dot, then one or more non-`@` characters".