I am writing a Python regex to match URLs. The one I have so far is reasonably complex and works fairly well. I am now trying to make the label part of the hostname more accurate, based on what I read on Wikipedia.
In the code snippet below I have a truth table to test the label part of the regex. The truth table is based on the 'Restrictions on hostnames' section of the Wikipedia page on hostnames: results holds the expected outcome for each entry in strings, and each string is tested with a trailing '.' appended.
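To make the expected results concrete, here is a minimal sketch of the rule as I understand it, written without regex: a label is 1 to 63 characters long, uses only ASCII letters, digits and hyphens, and neither starts nor ends with a hyphen. The names `is_valid_label` and `ALLOWED` are hypothetical and exist only for this illustration; they are not part of my regex code.

```python
import string

# Characters allowed in a label (assumption: ASCII letters, digits, hyphen).
ALLOWED = set(string.ascii_letters + string.digits + "-")


def is_valid_label(label: str) -> bool:
    """Hypothetical helper restating the hostname label rule without regex."""
    # A label must be between 1 and 63 characters long.
    if not 1 <= len(label) <= 63:
        return False
    # A label must not start or end with a hyphen.
    if label[0] == "-" or label[-1] == "-":
        return False
    # Every character must be a letter, a digit, or a hyphen.
    return all(ch in ALLOWED for ch in label)


strings = ['a', '-', 'aa', '--', '-a', 'a-', 'aaa', 'aa-',
           'a-a', 'a--', '-aa', '-a-', '--a', '---']
expected = [is_valid_label(s) for s in strings]
print(expected)
```

The list this produces should agree with the results list in my snippet, which is what I expect the regex to reproduce.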
I just can't get case 1 (strings[0], the single-character label 'a') to work without breaking other cases, and I cannot figure out why it fails. I have used debugging tools, but the failing string is too short for me to get any useful information out of them.
Please help me out and give me a fix to the one I have written so that I know what I am missing.
If you are wondering why I am not using a third-party library to match URLs: I am learning regular expressions.
import re
F = False
T = True
results = [T,F,T,F,F,F,T,F,T,F,F,F,F,F]
strings = ['a','-','aa','--','-a','a-','aaa','aa-','a-a','a--','-aa','-a-','--a','---']
x = list(range(len(strings)))
regex_test = r'(?P<Label>(?P<Label_start>[a-zA-Z0-9])((?P<Label_mid>[a-zA-Z0-9-]{1,61})?)(?=(?P<Label_end>[a-zA-Z0-9](?=[./])))(?P=Label_end)?)\.'
if len(strings) == len(results):
    for n in x:
        if results[n] == bool(re.match(pattern = regex_test, string = strings[n] + '.')):
            print("Works.")
            if results[n] == True:
                print(re.match(pattern = regex_test, string = strings[n] + '.').groupdict())
        else:
            print("Bug for: " + strings[n])
            #print(str(re.match(pattern = regex_test, string = strings[n] + '.',flags=re.DEBUG)))