I'm playing with Python and would like to solve the following problem with a regex:
I want to parse the HTML of a website. I have the page in a string and go through it line by line in a loop:
for line in html.splitlines():
    #print line
    matchObj = re.match( r'<h1(.*)>', line, re.M|re.I)
    if matchObj:
        print matchObj.group()
I would like to match every line that looks like this:
<h1 class="hidden offscreen" tabindex="0"> anyContent </h1>
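To make the goal concrete, here is a minimal sketch of the kind of match I am after. It assumes the whole `<h1>` element sits on a single line, and the names `H1_LINE`, `find_h1_lines`, and `sample` are made up for illustration. It uses `re.search` instead of `re.match`, because `re.match` only tests the start of the string and would miss an indented tag.

```python
import re

# Hypothetical pattern: an opening <h1 ...> tag, any content, then </h1>,
# all on one line. The non-greedy (.*?) stops at the first closing tag.
H1_LINE = re.compile(r'<h1\b[^>]*>(.*?)</h1>', re.IGNORECASE)


def find_h1_lines(html):
    """Return the stripped content of every single-line <h1> element."""
    contents = []
    for line in html.splitlines():
        # search() scans the whole line; match() only tests its start.
        found = H1_LINE.search(line)
        if found:
            contents.append(found.group(1).strip())
    return contents


# Made-up sample page, only for demonstration.
sample = '''<html>
  <h1 class="hidden offscreen" tabindex="0"> anyContent </h1>
  <p>not a heading</p>
</html>'''

print(find_h1_lines(sample))
```

This sketch would not find a heading whose opening and closing tags are on different lines.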