How to check part of a string against a regex pattern in Python

Question

I want to check whether a string contains some part that matches a given regex pattern. My regex checks for the presence of an IP address and looks like this:

regex = '''^(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)$'''

I want to check whether a string like this contains an IP address:

url_string = "http://110.234.52.124/paypal.ca/index.html"

Since it contains an IP address, I want to detect it. How can I do that?

Also, note that your regex matches the full string; you need to replace the `^` and `$` with `\b` and add the `r` prefix before the string literal. – Wiktor Stribiżew, Oct 27 '20 at 19:53
Thanks, it worked! I had been trying that but missed the \b part. – DhruvStan7, Oct 27 '20 at 20:04
So, the only problem turned out to be anchors. You just need to check the basic regex concepts then. — Wiktor Stribiżew, Oct 27 '20 at 22:04

Wiktor Stribiżew · Answer 1 · 2020-10-27T20:06:14.647

There are at least two issues with the regex:

It contains whitespace that is used only for formatting, so the re.X (re.VERBOSE) flag is required for it to work.
The ^ and $ anchors require a full string match. You probably want word boundaries, \b, instead.
If you add word boundaries, a regular string literal requires doubled backslashes (\\b); with an r prefix, which makes it a raw string literal, you can write just \b.
If there are other dot-separated number strings that are not IPs and need filtering out, use (?<!\d)(?<!\d\.) at the start instead of ^ and (?!\.?\d) at the end instead of $.
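To illustrate the last point, here is a minimal sketch comparing word boundaries with the lookaround variant. The variable names and the `version_like` test string are made up for this example; only the octet pattern and the URL come from the question.

```python
import re

# Octet pattern from the question, made non-capturing; names are illustrative only.
octet = r'(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)'
with_word_boundaries = re.compile(fr'\b{octet}(?:\.{octet}){{3}}\b')
with_lookarounds = re.compile(fr'(?<!\d)(?<!\d\.){octet}(?:\.{octet}){{3}}(?!\.?\d)')

# Hypothetical input: five dot-separated numbers, which is not an IP address.
version_like = "release 1.2.3.4.5 notes"
url_string = "http://110.234.52.124/paypal.ca/index.html"

# \b is satisfied between "4" and ".", so the "1.2.3.4" prefix is accepted.
print(bool(with_word_boundaries.search(version_like)))  # => True
# The lookarounds refuse a match that is preceded or followed by ".digit".
print(bool(with_lookarounds.search(version_like)))      # => False
# A genuine address surrounded by slashes is still found.
print(bool(with_lookarounds.search(url_string)))        # => True
```

Which variant is appropriate depends on whether longer dot-separated number runs can occur in the input.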

You can use

import re

regex = r'''\b(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\.( 
            25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\b'''
url_string = "http://110.234.52.124/paypal.ca/index.html"
print( bool(re.search(regex, url_string, re.X)) )
# => True


However, you can define the octet pattern as a variable and build the full pattern dynamically, which removes the need for re.X and for the long repeated pattern:

import re
o = r'(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)'
regex = fr'\b{o}(?:\.{o}){{3}}\b'
# OR regex = fr'(?<!\d)(?<!\d\.){o}(?:\.{o}){{3}}(?!\.?\d)'
url_string = "http://110.234.52.124/paypal.ca/index.html"
print( bool(re.search(regex, url_string)) )
# => True

Note the double braces in {{3}}: in an f-string, literal braces are written as double braces.
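As a small sketch of the brace doubling, the snippet below checks that the built pattern ends in a plain {3} quantifier and then pulls out the matched address instead of only testing for its presence. Using `match.group()` here is an addition for illustration, not part of the answer above.

```python
import re

o = r'(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)'
regex = fr'\b{o}(?:\.{o}){{3}}\b'

# The doubled braces collapse to a single pair in the resulting string.
print(regex.endswith(r'){3}\b'))  # => True

url_string = "http://110.234.52.124/paypal.ca/index.html"
match = re.search(regex, url_string)
if match:
    # group() with no argument returns the whole matched substring.
    print(match.group())  # => 110.234.52.124
```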

Yes, the issue was point no. 2; I was missing the \b part. Now it works, thanks! – DhruvStan7, Oct 27 '20 at 20:05

score 1 · Accepted Answer · answered Oct 27 '20 at 20:10


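import re

# Adjacent raw string literals are joined implicitly, so no stray
# whitespace from line continuations ends up inside the pattern.
regex = (r"(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\."
         r"(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\."
         r"(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)\."
         r"(25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)")

# re.search scans the whole string for the first match.
result = re.search(regex, "http://110.234.52.124/paypal.com")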

You just need to remove ^ and $ and call re.search. If result is None, no match was found.
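The None check can be wrapped in a small helper. This is only a sketch: `contains_ip` is a hypothetical name, and the \b boundaries are borrowed from the other answer rather than from this one.

```python
import re

# Hypothetical helper for illustration; the names are not from the thread.
OCTET = r'(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)'
IP_PATTERN = re.compile(fr'\b{OCTET}(?:\.{OCTET}){{3}}\b')

def contains_ip(text):
    """Return True if an IPv4-looking substring occurs anywhere in text."""
    # search() returns a match object or None, so test against None explicitly.
    return IP_PATTERN.search(text) is not None

print(contains_ip("http://110.234.52.124/paypal.com"))  # => True
print(contains_ip("http://paypal.com/index.html"))      # => False
```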

answered Oct 27 '20 at 20:10

amir Reza Seddighin


@DhruvStan7 But it [matches](https://ideone.com/Yhx4Bp) strings like `http://paypal.com/11111111110.234.52.12456678`, too. – Wiktor Stribiżew Oct 27 '20 at 20:20
Yeah, that would be fine, since I want to find an occurrence of an IP address anywhere in the string. – DhruvStan7 Oct 27 '20 at 20:51
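The behaviour discussed in these comments can be reproduced with a short sketch. The string is the one from the comment above; the variable names are illustrative, and the bounded variant is included only for contrast.

```python
import re

octet = r'(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9][0-9]?)'
unanchored = re.compile(fr'{octet}(?:\.{octet}){{3}}')
bounded = re.compile(fr'\b{octet}(?:\.{octet}){{3}}\b')

noisy = "http://paypal.com/11111111110.234.52.12456678"

# Without boundaries, the pattern picks an address out of the longer digit runs.
print(unanchored.search(noisy).group())  # => 110.234.52.124
# With \b on both sides, the same string yields no match at all.
print(bounded.search(noisy))             # => None
```

Whether the unanchored result counts as a hit or a false positive depends on the use case, as the exchange above shows.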
