Say I want to extract the hostname and the port number from a string like this:
stackoverflow.com:443
That is pretty easy. I could do something like this:
(?<host>.*):(?<port>\d*)
I am not worried about protocol schemes, valid host names/IP addresses, or TCP/UDP ports; none of that is important to my question.
However, I also need to support one twist that takes this beyond my knowledge of regular expressions - the host name without the port:
stackoverflow.com
I want to use a single regular expression for this, and I want to use named capture groups such that the host group will always exist in a positive match, while the port group exists if and only if we have a colon followed by a number of digits.
I tried using a positive lookbehind, based on my feeble understanding of it:
(?<host>.*)(?<=:)(?<port>\d*)
This comes close, but the colon (:) is included at the end of the host capture. So I tried to change the host to include anything but the colon like this:
(?<host>[^:]*)(?<=:)(?<port>\d*)
That gives me an empty host capture.
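For anyone who wants to reproduce these results, here is a minimal sketch in Python. It is an assumption on my part that Python is a fair stand-in: it spells named groups as (?P<name>...) rather than (?<name>...), and the names ATTEMPTS and try_pattern are made up for illustration. It appears to show why the lookbehind does not help: a lookbehind is zero-width and consumes nothing, so the colon still has to be matched by some other part of the pattern. With .* the host swallows it; with [^:]* nothing can consume it, so the only possible match starts after the colon with an empty host.

```python
import re

# Python spells named groups (?P<name>...); the (?<name>...) form used in
# the question is the .NET/PCRE-style spelling of the same idea.
ATTEMPTS = {
    "plain colon": r"(?P<host>.*):(?P<port>\d*)",
    "lookbehind, greedy host": r"(?P<host>.*)(?<=:)(?P<port>\d*)",
    "lookbehind, host without colon": r"(?P<host>[^:]*)(?<=:)(?P<port>\d*)",
}

def try_pattern(pattern, text):
    """Return (host, port) for the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return (m.group("host"), m.group("port")) if m else None

for label, pattern in ATTEMPTS.items():
    # Print what each attempt captures for the input that includes a port.
    print(label, try_pattern(pattern, "stackoverflow.com:443"))
```

The first pattern captures both parts when the port is present but does not match stackoverflow.com at all, since the colon is mandatory.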
Any suggestions on how to accomplish this, i.e. make the colon and the port number optional, but if they are there, include the port number capture and make the colon "vanish"?
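One common way to get this behaviour is to wrap the colon and the port together in an optional non-capturing group, so the colon is consumed but never captured. The sketch below is in Python and is only an illustration, not necessarily what any particular answer proposes; HOST_PORT and split_host_port are hypothetical names, and the anchors and the + quantifiers are my own assumptions about what counts as a valid input. In the (?<name>...) spelling the same pattern would read ^(?<host>[^:]+)(?::(?<port>\d+))?$

```python
import re

# (?: ... )? makes ":" plus the digits optional as a unit.
# The colon sits outside both named groups, so it is matched but not captured.
HOST_PORT = re.compile(r"^(?P<host>[^:]+)(?::(?P<port>\d+))?$")

def split_host_port(text):
    """Return (host, port); port is None when no ':digits' suffix is present."""
    m = HOST_PORT.match(text)
    if m is None:
        return None
    return m.group("host"), m.group("port")

print(split_host_port("stackoverflow.com:443"))
print(split_host_port("stackoverflow.com"))
```

In Python an optional group that did not participate in the match comes back as None; other regex engines report this differently, so how to test for a missing port depends on the engine.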
Edit: All four answers I have received work well for me, but pay attention to the comments on some of them. I accepted sln's answer because of the nice layout and explanation of the regexp structure. Thanks to all who replied!