
I have the string

"<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>" 

and I am trying to get "latitude=32.7766642" and "longitude=-96.79698789999998"

I thought this would work:

re.findall('(latitude|longitude)=-?\d+.\d+', req)

That is: either latitude or longitude, followed by an equals sign, an optional minus sign, one or more digits, a period, and one or more digits. But this returns

['latitude', 'longitude']

I've tried online regex testers, and they correctly extract "latitude=32.7766642" and "longitude=-96.79698789999998", but Python's re library doesn't. Why is this the case?

4 Answers

You are capturing only the labels in a group. Capture the values in a second group as well:

print(re.findall(r'(latitude|longitude)=(-?\d+\.\d+)', req))

This returns a list of tuples:

[('latitude', '32.7766642'), ('longitude', '-96.79698789999998')]

Full example:

import re
req = "<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>"
# Raw string, with the dot escaped so it matches only a literal period
print(re.findall(r'(latitude|longitude)=(-?\d+\.\d+)', req))
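If the coordinates are needed as numbers rather than strings, the (label, value) tuples convert directly into a dictionary. This is only a minimal sketch on the same request string; the names `pairs` and `coords` are illustrative and not part of the original code:

```python
import re

req = "<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>"

# Two capture groups, so findall returns (label, value) tuples.
pairs = re.findall(r'(latitude|longitude)=(-?\d+\.\d+)', req)

# Build a dict of floats keyed by label (coords is a hypothetical name).
coords = {label: float(value) for label, value in pairs}
print(coords)
```

Note that if a label appeared twice in the string, the later value would overwrite the earlier one in the dictionary.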
CodeSamurai777

Using the pattern 'latitude=-?\d+\.\d+|longitude=-?\d+\.\d+' with findall returns exactly the list you want, because the pattern contains no capture groups:

import re

req = "<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>"

print(re.findall(r'latitude=-?\d+\.\d+|longitude=-?\d+\.\d+', req))
# ['latitude=32.7766642', 'longitude=-96.79698789999998']
Austin

The problem with your regex is that Python treats the parentheses as a capture group, not merely as grouping for the alternation. When a pattern contains a capture group, re.findall returns only what that group matched instead of the whole match. What you want is to match the full expression while grouping the keyword (latitude or longitude) without capturing it.

From the Python re module documentation,

(?:...) A non-capturing version of regular parentheses.

And this is what you want. So your code should look like this:

re.findall(r'((?:latitude|longitude)=-?\d+\.\d+)', req)

Note that the outer parentheses capture the entire expression, while the inner (?:...) groups the alternation without capturing it, as described in the docs. On my system, this gives the result you want:

['latitude=32.7766642', 'longitude=-96.79698789999998']
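The difference comes down to what re.findall returns when the pattern does or does not contain a capture group. A small sketch comparing the two forms on the same string (the variable names `capturing` and `non_capturing` are illustrative, and the dot is escaped here so it matches only a literal period):

```python
import re

req = "<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>"

# One capturing group: findall returns only what the group matched.
capturing = re.findall(r'(latitude|longitude)=-?\d+\.\d+', req)

# Non-capturing group: there is no group to report, so findall
# returns the whole match.
non_capturing = re.findall(r'(?:latitude|longitude)=-?\d+\.\d+', req)

print(capturing)      # ['latitude', 'longitude']
print(non_capturing)  # ['latitude=32.7766642', 'longitude=-96.79698789999998']
```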
Rahul
  • You do not need the outer parentheses: `re.findall('(?:latitude|longitude)=-?\d+.\d+', req)` should be enough. – colidyre Sep 09 '18 at 10:23

You can change the regex as mentioned in the other answers. But you can also keep the capture group and use re.finditer() together with the match object's .group() method to get the behaviour you want:

[x.group() for x in re.finditer(r'(latitude|longitude)=-?\d+\.\d+', req)]

This gives you finer control over what is returned. .group() returns one or more subgroups of the match; called with no argument, or with 0, it returns the entire match.
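To illustrate that control, here is a hedged sketch that pulls the whole match and the individual parts from each match object. The named groups `name` and `value`, and the variable `results`, are my own choices for illustration and do not appear in the question:

```python
import re

req = "<Request 'http://127.0.0.1:5000/findrisk?latitude=32.7766642&longitude=-96.79698789999998' [GET]>"

# Named groups (hypothetical names) make the subgroups easier to address.
pattern = re.compile(r'(?P<name>latitude|longitude)=(?P<value>-?\d+\.\d+)')

# group(0) is the entire match; group('name') and group('value') are subgroups.
results = [(m.group(0), m.group('name'), m.group('value'))
           for m in pattern.finditer(req)]

for whole, name, value in results:
    print(whole, name, value)
```

Each match object keeps both the full match and its groups, so nothing has to be decided up front in the pattern.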

colidyre