Your code does not work because, aside from the regex issues pointed out in other answers, the website you provided displays the port number of each IP by executing JavaScript embedded in the HTML.
To capture each IP and its associated port number, you first need to execute the JavaScript so that the port numbers actually appear in the HTML response (you can follow the guidelines here: Web-scraping JavaScript page with Python). Then you need to extract this information from the JavaScript-rendered HTML response.
By inspecting the HTML response, I found that each port number is preceded by ":</font>" and followed by "<".
A working code snippet can be found below. I took the liberty of slightly modifying your IP regex, because only certain IP addresses are associated with a port number (the other IPs belong to the hostname column and should be discarded). The IPs of interest are those immediately followed by the string "<script".
import dryscrape
import re
url = 'http://spys.one/free-proxy-list/FR/'
#get html with javascript
session = dryscrape.Session()
session.visit(url)
response = session.body()
#capture ip:
IP = re.findall(r'[0-9]+(?:\.[0-9]+){3}(?=<script)',response)
#capture port:
port = re.findall(r'(?<=:</font>)(.*?)(?=\<)',response)
#join IP with ports
IP_with_ports = []
for i in range(len(IP)):
    IP_with_ports.append(IP[i] + ":" + port[i])
print(IP_with_ports)
OUTPUT: ['178.32.213.128:80', '151.80.207.148:80', '134.119.223.242:80', '37.59.0.139:17459', ..., '37.59.0.139:17658']
Do note that the code above only works for the website you provided, as each website has its own logic for displaying data.
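If you want to check the two patterns without starting a dryscrape session, a minimal sketch like the following can be run offline. The sample_html fragment is hand-written for illustration and is only an assumption about what the rendered markup looks like (an IP followed by "<script", then ":</font>" right before the port); it is not copied from the site. The port pattern is also tightened slightly to digits only, which is my choice rather than part of the snippet above.

```python
import re

# Hypothetical fragment imitating the JavaScript-rendered markup described above.
# The middle cell mimics the hostname column, whose IP must be ignored.
sample_html = (
    '<td><font class="spy14">178.32.213.128<script type="text/javascript">'
    'document.write("...")</script><font class="spy2">:</font>80</font></td>'
    '<td>ns3000000.ip-37-59-0.eu 37.59.0.1</td>'
    '<td><font class="spy14">37.59.0.139<script type="text/javascript">'
    'document.write("...")</script><font class="spy2">:</font>17459</font></td>'
)

# An IP counts only if it is immediately followed by "<script".
ip_pattern = re.compile(r'[0-9]+(?:\.[0-9]+){3}(?=<script)')
# A port is the run of digits between ":</font>" and the next "<".
port_pattern = re.compile(r'(?<=:</font>)[0-9]+(?=<)')

ips = ip_pattern.findall(sample_html)
ports = port_pattern.findall(sample_html)

# zip pairs the n-th IP with the n-th port and stops at the shorter list.
ip_with_ports = [f"{ip}:{port}" for ip, port in zip(ips, ports)]
print(ip_with_ports)
```

Pairing with zip rather than indexing means a mismatch between the number of IPs and ports will not raise an IndexError, although it will silently drop the unmatched entries, so comparing len(ips) and len(ports) first is a reasonable sanity check.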