My text file "reg1.txt" looks like this:

Python trainings going on well We are connecting to server having IP 192.168.101.124 for Python hands-on My email id is john1@xyz.com use this email for official purpose. Python server IP is 101.201.17.155 used at Cityone campus PYThon server IP is 101.201.101.5 used at Citytwo campus My friend email id is peter1@xyz.com use this email for official purpose. My manager email id is cooldude@xyz.com use this email for official purpose. The PYTHON server IP is 173.101.255.15 used at Citythree campus The Testing server IP is 95.101.175.101 used at Citythree campus

The task is to find all the IP addresses in the file. My code is:

import re
import os
f1=open("reg1.txt","r")
for line in f1:
    rx=re.search("(\d{1,3}.){3}\d{1,3}",line)
    print(rx)
f1.close()

f2=open("reg1.txt","r")
for line in f2:
    rx=re.search("(\d{1,3}.){3}\d{1,3}",line)
    if rx:
        print(rx.groups())
f2.close()

My console shows results:

<re.Match object; span=(38, 53), match='192.168.101.124'>
None
<re.Match object; span=(34, 48), match='101.201.17.155'>
<re.Match object; span=(20, 33), match='101.201.101.5'>
None
None
<re.Match object; span=(24, 38), match='173.101.255.15'>
<re.Match object; span=(25, 39), match='95.101.175.101'>
('101.',)
('17.',)
('101.',)
('255.',)
('175.',)

Why does the code print only the third octet of the match when the match object shows the full span of the IP address?

How can I print the whole IP address?

2 Answers

Use print(rx.group()) instead of print(rx.groups()).

Match.groups(default=None) Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern.

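In your case the pattern has only one capturing group, (\d{1,3}.), and because that group is repeated three times it keeps only the text of its last repetition. That is why groups() returns tuples such as ('101.',). Note also that the unescaped . matches any character; write \. to match a literal dot.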

https://docs.python.org/3/library/re.html#re.Match.groups
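
To make the difference concrete, here is a minimal sketch using one line from the question's file. The only change from the question's pattern is the escaped dot, which is my assumption about what was intended.

```python
import re

line = "We are connecting to server having IP 192.168.101.124 for Python hands-on"

# Pattern from the question, with the dot escaped so it matches only a literal "."
m = re.search(r"(\d{1,3}\.){3}\d{1,3}", line)

whole_match = m.group()   # same as m.group(0): the entire matched text
captured = m.groups()     # one entry per capturing group; a repeated group keeps its last repetition

print(whole_match)  # 192.168.101.124
print(captured)     # ('101.',)
```

group() with no argument always returns the full match, regardless of how many capturing groups the pattern contains.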

Vishal Singh
  • `groups` Return a tuple containing all the subgroups of the match, from 1 up to however many groups are in the pattern. – Vishal Singh Jul 01 '20 at 12:46

You may read the file into a variable and run a single call to re.findall:

import re

rx = r"(?<!\d)(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}(?!\d)"
with open("reg1.txt","r") as f1:
    contents = f1.read()            # Read the file into contents variable
    print(re.findall(rx, contents)) # Extract all IPs

Instead of assigning to contents, you may pass f1.read() directly to re.findall.

The pattern is taken from my previous answer. I just added digit boundaries to it: (?<!\d) (no digit allowed immediately to the left) and (?!\d) (no digit allowed immediately to the right). You may consider using \b, word boundaries, instead.
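
As a rough illustration of what the boundaries change, the sketch below compares the pattern without boundaries, with digit boundaries, and with \b. The sample strings and variable names are made up for the demonstration and are not from the question's file.

```python
import re

# One octet, 0-255, as in the pattern above
octet = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
plain = octet + r"(?:\." + octet + r"){3}"      # no boundaries at all
digit_bounded = r"(?<!\d)" + plain + r"(?!\d)"  # digit boundaries
word_bounded = r"\b" + plain + r"\b"            # word boundaries

# Hypothetical input: one clean IP, one IP-like run embedded in longer digit strings
text = "good 101.201.17.155, bad 9101.201.17.1555"
plain_hits = re.findall(plain, text)
digit_hits = re.findall(digit_bounded, text)
print(plain_hits)  # ['101.201.17.155', '101.201.17.155']
print(digit_hits)  # ['101.201.17.155']

# "_" is a word character, so \b finds no boundary between "_" and "1"
label = "host_101.201.17.155"
digit_label_hits = re.findall(digit_bounded, label)
word_label_hits = re.findall(word_bounded, label)
print(digit_label_hits)  # ['101.201.17.155']
print(word_label_hits)   # []
```

Which boundary is preferable depends on the input: digit boundaries only reject adjacent digits, while \b also rejects adjacent letters and underscores.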

Wiktor Stribiżew
  • This is helpful! I was particularly concerned about what's wrong with groups(). Thanks! – SUDIPTA SAMAL Jul 01 '20 at 12:43
  • @SUDIPTASAMAL There is nothing wrong. `re.search` fetches the first match only; you need multiple matches, and those can be retrieved with `re.findall` or `re.finditer`. `.groups()` only fetches the *captured* substrings from the match object. – Wiktor Stribiżew Jul 01 '20 at 12:49
  • I normally use findall. I was experimenting with search when this came up, so I posted it. findall has a bigger scope and more functionality. Thanks for your input. – SUDIPTA SAMAL Jul 01 '20 at 12:51
  • @SUDIPTASAMAL A [non-capturing group](https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-in-regular-expressions). – Wiktor Stribiżew Jul 01 '20 at 17:24
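
The points from these comments can be sketched as follows, assuming the question's pattern with the dot escaped and two lines from the sample file joined into one string. With a capturing group, re.findall returns only the group's text; a non-capturing group or re.finditer gives the whole match.

```python
import re

text = (
    "Python server IP is 101.201.17.155 used at Cityone campus\n"
    "PYThon server IP is 101.201.101.5 used at Citytwo campus"
)

capturing = r"(\d{1,3}\.){3}\d{1,3}"        # (...) captures the last repetition
non_capturing = r"(?:\d{1,3}\.){3}\d{1,3}"  # (?:...) groups without capturing

# findall returns the captured group when the pattern has exactly one
with_group = re.findall(capturing, text)
# With no capturing groups, findall returns the full matches
without_group = re.findall(non_capturing, text)
# finditer yields match objects, so group() gives the full match either way
via_finditer = [m.group() for m in re.finditer(capturing, text)]

print(with_group)     # ['17.', '101.']
print(without_group)  # ['101.201.17.155', '101.201.101.5']
print(via_finditer)   # ['101.201.17.155', '101.201.101.5']
```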