I found this regex online but I am struggling to understand it. It is this:

(?=^.{6,10}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*()_+}{":;'?/>.<,])(?!.*\s).*$

http://regexlib.com/Search.aspx?k=password&c=-1&m=5&ps=100

The description is:

This regular expression match can be used for validating strong password. It expects at least 1 small-case letter, 1 Capital letter, 1 digit, 1 special character and the length should be between 6-10 characters. The sequence of the characters is not important. This expression follows the above 4 norms specified by Microsoft for a strong password.

I see there are the following groups. I have read that ?= means look ahead.

  • (?=^.{6,10}$) Does this mean it looks ahead to check that there are 6-10 characters?
  • (?=.*\d) Does this mean it looks ahead for 0 or more characters followed by a digit (so at least one digit)? Could this have been written as (?=\d+), meaning there should be at least 1 digit?
  • (?=.*[a-z]) A pattern to match a-z. Again, could this have been written as (?=[a-z]+)?
  • (?=.*[A-Z]) A pattern to match A-Z. Again, could this have been written as (?=[A-Z]+)?
  • (?=.*[!@#$%^&*()_+}{":;'?/>.<,]) Is .* not required here as well?
  • (?!.*\s).*$ - what does this mean?
halfer
Manu Chadha
  • You can use a utility like regex101, which gives you somewhat of a visual representation of the regular expression and a textual explanation in the sidebar: [Link](https://regex101.com/r/cFR4bA/1) – esqew Aug 23 '18 at 17:50
  • Note that the `&amp`, `&quote`, `&gt` and `&lt` should all be a single character (respectively `&`, `"`, `>` and `<`) – Aaron Aug 23 '18 at 18:02
  • Aaron - sorry, i didnt understand. – Manu Chadha Aug 23 '18 at 18:11
  • I suppose you meant `(?=^.{6,10}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*()_+}{":'?&gt.<,])(?!.*\s).*$` – Manu Chadha Aug 24 '18 at 06:15
  • Why was the question downvoted? – Manu Chadha Aug 24 '18 at 06:25
  • I have not voted, and in general there is not much value in enquiring about why a particular question was downvoted, since the voters will have already gone. If they wished to give you feedback and forego their anonymity, they would already have done so. – halfer Aug 24 '18 at 08:12
  • There's a few items of feedback I can guess at though. Firstly, your subject titles are mostly written in a stylistic lower-case form, which may irritate readers, who expect the first letter in a sentence to be an upper case letter for reasons of readability and common English practice. Secondly, this question contained the phrase "Could someone break it down for me", which may have sounded to voters like a request for free work. – halfer Aug 24 '18 at 08:15
  • Thirdly, to my ears the question emphasised the reader doing the legwork, rather than the author: "Need help in understanding" and "Could someone" in particular - they are both requesting the effort of other people. I appreciate this is subject to cultural bias, but words mean what the reader thinks they mean, not what the writer meant. It is better to say "How can I understand X" or "How can I do X" or "How to do X". They have the benefit of having less of a pleading tone also. I have [an opinion piece about this here](https://meta.stackoverflow.com/q/366264/472495). – halfer Aug 24 '18 at 08:20

1 Answer

You're correct about the first group. Because it's anchored at the beginning and end with ^ and $, it has to match the entire input. So it requires that there be 6-10 characters.
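As a minimal sketch of the length check on its own, assuming Python's `re` module (the question does not name a regex flavor, and `length_check` is a name chosen here for illustration):

```python
import re

# Only the first lookahead: the input must be 6-10 characters long.
length_check = re.compile(r"(?=^.{6,10}$)")

for candidate in ["abc", "abcdef", "abcdefghij", "abcdefghijk"]:
    matched = length_check.match(candidate) is not None
    print(f"{candidate!r}: {matched}")

# A lookahead consumes no characters, so a successful match ends at index 0.
print(length_check.match("abcdef").end())  # 0
```

Strings of 6 and 10 characters pass, while 3 and 11 characters fail. Note that `.` does not match a newline by default in Python, so multi-line input would behave differently.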

The next four lookaheads need .* at the beginning so that they match the required type of character anywhere in the input. If you just wrote (?=\d+), it would have to match the digits at the current position in the input, which is the beginning. Prefixing each of them with .* allows the different types (digit, lowercase, uppercase, punctuation) to appear in any order. You don't need to put + after the character class because matching a single occurrence is enough.
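The difference between the two forms can be checked directly. This is a small sketch assuming Python's `re` module; the names `anywhere` and `at_start` are hypothetical and only used for this illustration:

```python
import re

anywhere = re.compile(r"(?=.*\d)")  # a digit somewhere ahead of the current position
at_start = re.compile(r"(?=\d+)")   # a digit exactly at the current position

# match() tries the pattern at the beginning of the string only.
print(anywhere.match("abc1") is not None)  # True: the digit is found after "abc"
print(at_start.match("abc1") is not None)  # False: the string starts with "a"
print(at_start.match("1abc") is not None)  # True: the string starts with a digit
```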

(?!.*\s) is a negative lookahead. \s matches any whitespace character (spaces, tabs, newlines), so this means that the string must not contain any whitespace.
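A quick way to see the negative lookahead reject input, again sketched with Python's `re` module (`no_whitespace` is an illustrative name, not part of the original regex):

```python
import re

# Succeeds only if no whitespace character can be found ahead.
no_whitespace = re.compile(r"(?!.*\s)")

print(no_whitespace.match("abcdef") is not None)    # True: no whitespace at all
print(no_whitespace.match("abc def") is not None)   # False: contains a space
print(no_whitespace.match("abc\tdef") is not None)  # False: a tab is whitespace too
```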

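Finally, .*$ just matches the entire input. Lookaheads do not consume any characters, so this is the only part of the pattern that actually consumes text; it is there so that there is something to match after all the lookaheads.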
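Putting the pieces together, here is a hedged sketch of the full pattern from the question run through Python's `re` module. The helper `is_strong` and the sample passwords are made up for illustration; other regex flavors may differ in small details such as how `$` treats a trailing newline.

```python
import re

# The regex from the question, unchanged. A raw triple-quoted string is used
# because the character class contains both " and '.
PASSWORD_RE = re.compile(
    r"""(?=^.{6,10}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%^&*()_+}{":;'?/>.<,])(?!.*\s).*$"""
)


def is_strong(candidate: str) -> bool:
    """Return True if the candidate satisfies every lookahead in the pattern."""
    return PASSWORD_RE.match(candidate) is not None


samples = [
    "Abc123!",       # meets every rule
    "abc123!",       # no uppercase letter
    "ABC123!",       # no lowercase letter
    "Abcdef!",       # no digit
    "Abc1234",       # no special character
    "Ab1!",          # too short
    "Abc123!Abc12",  # too long
    "Abc 123!",      # contains whitespace
]
for sample in samples:
    print(f"{sample!r}: {is_strong(sample)}")

# The trailing .*$ is what consumes the text, so the match is the whole input.
print(PASSWORD_RE.match("Abc123!").group(0))  # Abc123!
```

Only the first sample passes; each of the others fails exactly one of the lookaheads.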

Barmar