
Regex noob here, I need help! I would like to create a broad regex that matches phone numbers (any international format) from text and returns only the numbers. I have other functions that will further validate the number based on country, so I am not worried about that for this regex. I mainly just need to strip the numbers out of text, but each number separately. Also, number delimiters can include +-.() or a single space.

Example

This is some text with +1 (234) 222-9898 a phone number in it and a random number 12.  Also here is a +44 0800 655 5059 UK number

I would like the regex to return only

['12342229898', '12', '4408006555059']

I appreciate the help, let me know if you need more requirements!

Yevgen Gorbunkov
Devon Norris
  • Looks like you are looking to create a regex, but do not know where to get started. Please check [Reference - What does this regex mean](https://stackoverflow.com/questions/22937618) resource, it has plenty of hints. Also, refer to [Learning Regular Expressions](https://stackoverflow.com/questions/4736) post for some basic regex info. Once you get some expression ready and still have issues with the solution, please edit the question with the latest details and we'll be glad to help you fix the problem. – Wiktor Stribiżew Mar 19 '20 at 22:08
  • @WiktorStribiżew I realize I should probably learn how to construct regex, and I do understand some basics. That being said, I was simply looking for a quick solution, not a link to a "how to do regex" resource. I will eventually one day take the time to learn how to write it myself – Devon Norris Mar 20 '20 at 17:08

3 Answers


Match the numbers together with their spaces and delimiters, then clean up the unwanted characters programmatically after the fact. An example expression:

/(?:\+\d+)?(?:[-+. ]?(?:\(\d+\)|\d+))+/g

Edit: Corrected the quantifier. Sorry, should have tested the expression first. ^^
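The cleanup step mentioned above could look roughly like this. It is only a sketch: the helper name extractDigitGroups is made up for illustration, and stripping everything matched by \D is just one way to do the cleanup.

```javascript
// Pattern taken from the answer above; the constant and function names are hypothetical.
const PHONE_LIKE = /(?:\+\d+)?(?:[-+. ]?(?:\(\d+\)|\d+))+/g;

function extractDigitGroups(text) {
  // String.prototype.match returns null when nothing matches,
  // so fall back to an empty array before mapping.
  const rawMatches = text.match(PHONE_LIKE) || [];
  // Remove every non-digit character (delimiters, spaces, plus signs).
  return rawMatches.map(raw => raw.replace(/\D/g, ''));
}

const sample = 'This is some text with +1 (234) 222-9898 a phone number in it and a random number 12.  Also here is a +44 0800 655 5059 UK number';
console.log(extractDigitGroups(sample));
// [ '12342229898', '12', '4408006555059' ]
```

A raw match can start with a delimiter (for example " 12" with its leading space), which is why the cleanup runs on every match rather than only on the ones that look like phone numbers.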

oriberu
  • Hey nice, I just tried this and it works...Is there any way to modify it to not return empty strings with the javascript match function? Or do I just need to filter those out separately? – Devon Norris Mar 19 '20 at 21:58
  • @DevonNorris : it is rather weird saying it works whereas it returns bunch of empty strings and digit groups with all the braces/pluses/spaces/etc – Yevgen Gorbunkov Mar 19 '20 at 22:00
  • @DevonNorris It would be considerably easier to run cleanup after the fact, if you decide to use regex for this task. – oriberu Mar 19 '20 at 22:02
  • @DevonNorris I didn't know what you meant with empty strings; turns out I should have tested the expression first. Wrong quantifier, sorry. – oriberu Mar 19 '20 at 22:13
  • @oriberu ah yes that works! Sorry for not clarifying – Devon Norris Mar 19 '20 at 22:19
  • @YevgenGorbunkov This answer worked for what I needed. Also I probably didn't explain the entire context of what I am trying to do. Your answer worked as well, but failed a few edge cases that I had. I did upvote your answer though. Not sure if you are the one who downvoted my question, but if you are just here for getting Stack Overflow points, that's not cool – Devon Norris Mar 20 '20 at 15:37
  • @YevgenGorbunkov To be fair, the answer was upvoted/selected more than an hour after I fixed my mistake and I stated from the beginning that the approach shown here was to fetch the numbers as is and clean them up programmatically after. Also, since you brought it up, I upvoted your answer right after you made it. – oriberu Mar 20 '20 at 16:06
  • @YevgenGorbunkov He refactored it to not return empty strings, it actually works perfectly and is very helpful for what I need. Sorry if I didn't clarify something to you, but just calling other answers crappy is not helpful – Devon Norris Mar 20 '20 at 16:09
  • @YevgenGorbunkov It shows 23:32 (accept) and 22:12 (edit). I still see the orange arrow that shows I upvoted your post, which I did when I left my comment on it. I can't speak to the discrepancy, but that's what I see. -- Edit: also, that's where I'll end my involvement in this discourse. Cheers. – oriberu Mar 20 '20 at 16:14
  • @YevgenGorbunkov I don't need to prove anything to you, nor do I care if you are offended. Also to the point of performance, your solution also uses regex, so I really don't understand how it is blazingly faster. This solution works in all of my test cases that I am using it for, and performance seems to be great. Have a great life being snooty to people on Stack Overflow, I'm done with this conversation as well. ORIBERU Thanks for the great answer! – Devon Norris Mar 20 '20 at 16:22

Complete solution

To get plain groups of digits, one may

  • match runs of digit groups (\d+), each followed by one or more non-word characters (\W+)
str.match(/((\d+)[\W]+)+/g)
  • clean up the result by removing all non-digit (\D) characters
.map(chunk => chunk.replace(/\D/g,''))

Live snippet as a proof of concept:

const str = 'This is some text with +1 (234) 222-9898 a phone number in it and a random number 12.  Also here is a +44 0800 655 5059 UK number'

// match() returns null when the text contains no digit groups, hence the fallback to []
const digits = (str.match(/((\d+)[\W]+)+/g) || [])
  .map(chunk => chunk.replace(/\D/g,''))

console.log(JSON.stringify(digits))

Yevgen Gorbunkov

I would do that in two separate steps. The first finds the blocks you are searching for, that is, runs of digits mixed with spaces and the special characters [ "+", "-", "(", ")" ]. The second extracts only the digits from each of those blocks.

Step 1 - Finding the Block

([0-9]+[ +\-()0-9]*)

https://regex101.com/r/UnYnSN/1

(                   # capture each block that
  [0-9]+            # starts with one or more digits
  [ +\-()0-9]*      # then may continue with any of these special chars or digits
)                   # end of the block

After that, you have the list of blocks with numbers. A block can keep trailing delimiters such as a space, which Step 2 removes:

[
   "1 (234) 222-9898 ",
   "12",
   "44 0800 655 5059 "
]
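Step 1 can be tried out in JavaScript along these lines. This is a sketch under one assumption: the pattern is written with plain, un-nested character classes, because JavaScript does not support character classes nested inside each other. The names BLOCK and findBlocks are made up for the example.

```javascript
// Assumption: Step 1 expressed with flat character classes.
// A block starts with digits and may continue with digits, spaces, +, -, ( or ).
const BLOCK = /[0-9]+[ +\-()0-9]*/g;

function findBlocks(text) {
  // match() with the g flag returns every block, or null when there is none.
  return text.match(BLOCK) || [];
}

const input = 'This is some text with +1 (234) 222-9898 a phone number in it and a random number 12.  Also here is a +44 0800 655 5059 UK number';
console.log(findBlocks(input));
// [ '1 (234) 222-9898 ', '12', '44 0800 655 5059 ' ]
```

The leading "+" is not part of a block because every block has to start with a digit; that does not matter here since only the digits are kept in the end.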

Step 2 - Extracting only the Numbers of Each Block

For each of these elements, I would use a simple function that keeps only the digit characters of the string

const text = "1 (234) 222-9898"
const letters = text.split("")   // one entry per character
const numbers = "0123456789"     // the characters to keep
const only_numbers_from_text = letters
    .filter( letter => numbers.includes(letter) )
    .join("")
console.log( only_numbers_from_text )   // 12342229898

This is probably not a one-line solution, but I think it is the easiest to understand and maintain.
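Put together, the two steps might be wrapped like this. Treat it as a sketch rather than a finished implementation: the function names are invented for the example, and the block pattern is again assumed to be the flat character-class form, since JavaScript character classes cannot be nested.

```javascript
// Step 1 (hypothetical helper): find blocks that start with digits and may
// continue with digits, spaces, +, -, ( or ).
function findNumberBlocks(text) {
  return text.match(/[0-9]+[ +\-()0-9]*/g) || [];
}

// Step 2 (hypothetical helper): keep only the digit characters of one block.
function keepDigits(block) {
  const digits = '0123456789';
  return block
    .split('')
    .filter(ch => digits.includes(ch))
    .join('');
}

// Both steps chained: one digit-only string per block found in the text.
function extractNumbers(text) {
  return findNumberBlocks(text).map(keepDigits);
}

const message = 'This is some text with +1 (234) 222-9898 a phone number in it and a random number 12.  Also here is a +44 0800 655 5059 UK number';
console.log(extractNumbers(message));
// [ '12342229898', '12', '4408006555059' ]
```

Keeping the two steps as separate functions makes it easy to swap in a stricter block pattern later without touching the cleanup.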

Thiago Mata