The following regex is supposed to act as a sentence tokenizer pattern, but I'm having some trouble deciphering what exactly it's doing:
(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<![A-Z]\.)(?<=\.|\?|\!)\s
I understand that it's using positive and negative lookbehinds, as the accepted answer of this post explains (they give the example of a negative lookbehind like this: (?<!B)A). But what is considered A in the above regex?
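For context, here is a minimal sketch of how the pattern can be exercised in Python. The function name `split_sentences` and the sample text are my own assumptions for illustration, not part of the original source. As far as I can tell, the only element that actually consumes a character is the trailing `\s`; everything before it is a zero-width lookbehind, so the split happens on that single whitespace character.

```python
import re

# The pattern from the question, unchanged.
SENTENCE_BOUNDARY = re.compile(
    r"(?<!\w\.\w.)"       # not preceded by something like "i.e." or "e.g."
    r"(?<![A-Z][a-z]\.)"  # not preceded by a title such as "Mr." or "Dr."
    r"(?<![A-Z]\.)"       # not preceded by a single-letter initial such as "J."
    r"(?<=\.|\?|\!)"      # must be preceded by ".", "?" or "!"
    r"\s"                 # the one whitespace character that is matched
)

def split_sentences(text):
    """Split text wherever the boundary pattern matches (hypothetical helper)."""
    return SENTENCE_BOUNDARY.split(text)

sample = "Mr. Smith paid 1.5 million, i.e. a lot. Did he mind? No."
for sentence in split_sentences(sample):
    print(sentence)
```

With this sample, the whitespace after "Mr." and after "i.e." is left alone because a negative lookbehind rejects it, while the whitespace after "lot." and "mind?" is treated as a boundary.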