25

I have some strings like this

string phoneNumber = "(914) 395-1430";

I would like to strip out the parentheses and the dash; in other words, just keep the numeric values.

So the output could look like this

9143951430

How do I get the desired output?

CSᵠ
meda
  • 2
    Even if you aren't experienced in regex, you should have at least done some research. The number of answers is inversely proportional to the difficulty of the problem. – Jerry Oct 03 '13 at 19:36
  • 1
    There is always someone like you to complain about questions, this is a Q&A site, if I knew I would not ask thanks! – meda Oct 03 '13 at 19:39
  • I'm not complaining, I'm just letting you know that there's already this kind of stuff out there. And one of SO's policies is to redirect all duplicate questions to one main question. Duplicate questions are often deleted like that. – Jerry Oct 03 '13 at 19:45
  • @SystemDown I know how to, but if I'm not sure I will ask. See, I ended up using the answer that does not require regex, which was something I was totally unaware of. – meda Oct 03 '13 at 19:45
  • @Jerry then you should point me to these duplicates, and I couldn't care less about you guys' downvotes :P – meda Oct 03 '13 at 19:52
  • Huh? I didn't give a single downvote to your question o.o But if you want a possible duplicate, you can use [this question](http://stackoverflow.com/q/1120198/1578604) which should need only a few tweaks. – Jerry Oct 03 '13 at 19:55
  • 5
    why the heck is that answer closed ? i arrive here from google looking for "c# extract numerics from string", and this perfectly valid and simple and universal question is closed as "off topic" what topic ? the programming topic ? i think its pretty well damn on topic. what the heck SO – v.oddou Jul 29 '14 at 07:13
  • it would have been a better option to mark it as duplicate to this question: http://stackoverflow.com/questions/844461/return-only-digits-0-9-from-a-string – Teodor Tite Apr 12 '17 at 12:52

6 Answers

49

You can do any of the following:

  • Use regular expressions. You can use a regular expression with either

    • A negated character class that matches the characters you don't want (anything other than a decimal digit):

      private static readonly Regex rxNonDigits = new Regex( @"[^\d]+");
      

      In which case, you can take either of these approaches:

      // simply replace the offending substrings with an empty string
      private string CleanStringOfNonDigits_V1( string s )
      {
        if ( string.IsNullOrEmpty(s) ) return s ;
        string cleaned = rxNonDigits.Replace(s, "") ;
        return cleaned ;
      }
      
      // split the string into an array of good substrings
      // using the bad substrings as the delimiter. Then use
      // String.Join() to splice things back together.
      private string CleanStringOfNonDigits_V2( string s )
      {
        if (string.IsNullOrEmpty(s)) return s;
        string cleaned = String.Join( "", rxNonDigits.Split(s) );
        return cleaned ;
      }
      
    • A positive character class that matches what you do want (decimal digits):

      private static readonly Regex rxDigits = new Regex( @"[\d]+") ;
      

      In which case you can do something like this:

      private string CleanStringOfNonDigits_V3( string s )
      {
        if ( string.IsNullOrEmpty(s) ) return s ;
        StringBuilder sb = new StringBuilder() ;
        for ( Match m = rxDigits.Match(s) ; m.Success ; m = m.NextMatch() )
        {
          sb.Append(m.Value) ;
        }
        string cleaned = sb.ToString() ;
        return cleaned ;
      }
      
  • You're not required to use a regular expression, either.

    • You could use LINQ directly, since a string is an IEnumerable<char>:

      private string CleanStringOfNonDigits_V4( string s )
      {
        if ( string.IsNullOrEmpty(s) ) return s;
        string cleaned = new string( s.Where( char.IsDigit ).ToArray() ) ;
        return cleaned;
      }
      
    • If you're only dealing with western alphabets where the only decimal digits you'll see are ASCII, skipping char.IsDigit will likely buy you a little performance:

      private string CleanStringOfNonDigits_V5( string s )
      {
        if (string.IsNullOrEmpty(s)) return s;
        string cleaned = new string(s.Where( c => c >= '0' && c <= '9' ).ToArray() ) ;
        return cleaned;
      }
      
  • Finally, you can simply iterate over the string, chucking the characters you don't want, like this:

    private string CleanStringOfNonDigits_V6( string s )
    {
      if (string.IsNullOrEmpty(s)) return s;
      StringBuilder sb = new StringBuilder(s.Length) ;
      for (int i = 0; i < s.Length; ++i)
      {
        char c = s[i];
        if ( c < '0' ) continue ;
        if ( c > '9' ) continue ;
        sb.Append(s[i]);
      }
      string cleaned = sb.ToString();
      return cleaned;
    }
    

    Or this:

    private string CleanStringOfNonDigits_V7(string s)
    {
      if (string.IsNullOrEmpty(s)) return s;
      StringBuilder sb = new StringBuilder(s);
      int j = 0 ;
      int i = 0 ;
      while ( i < sb.Length )
      {
        bool isDigit = char.IsDigit( sb[i] ) ;
        if ( isDigit )
        {
          sb[j++] = sb[i++];
        }
        else
        {
          ++i ;
        }
      }
      sb.Length = j;
      string cleaned = sb.ToString();
      return cleaned;
    }
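
One practical difference between the variants above is worth seeing side by side: `char.IsDigit` accepts any Unicode decimal digit, while an explicit `'0'`..`'9'` range check accepts ASCII digits only. The following is a minimal sketch of that difference; the class and method names are made up for illustration and are not part of the answer's code.

```csharp
using System;
using System.Linq;

// Hypothetical helper names, for illustration only.
public static class DigitFilterDemo
{
    // Keeps every character that char.IsDigit accepts, which includes
    // non-ASCII decimal digits such as Arabic-Indic digits.
    public static string KeepUnicodeDigits(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        return new string(s.Where(char.IsDigit).ToArray());
    }

    // Keeps only the ASCII digits '0' through '9'.
    public static string KeepAsciiDigits(string s)
    {
        if (string.IsNullOrEmpty(s)) return s;
        return new string(s.Where(c => c >= '0' && c <= '9').ToArray());
    }
}
```

For a phone number such as `"(914) 395-1430"` both methods return the same string. They only diverge on input like `"12\u0663"` (where U+0663 is ARABIC-INDIC DIGIT THREE): the first keeps all three characters, the second keeps only `"12"`.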
    

From the standpoint of clarity and cleanness of code, version 1 is what you want. It's hard to beat a one-liner.

If performance matters, my suspicion is that version 7, the last one, is the winner. It creates a single temporary, a StringBuilder, and does the transformation in place within the StringBuilder's buffer.

The other options all do more work.
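
Since the performance ranking is a suspicion rather than a measurement, it may be worth timing the candidates on your own input. Below is a rough sketch using `System.Diagnostics.Stopwatch`. The class name, method names, and iteration count are assumptions chosen for illustration, and a loop like this is only a coarse comparison, not a rigorous benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical timing harness; names and iteration count are illustrative.
public static class StripBenchmark
{
    private static readonly Regex NonDigits = new Regex(@"[^\d]+");

    // Same idea as version 1: replace runs of non-digits with nothing.
    public static string ViaRegex(string s)
    {
        return NonDigits.Replace(s, "");
    }

    // Same idea as version 7: compact the digits in place inside a StringBuilder.
    public static string ViaInPlaceBuilder(string s)
    {
        var sb = new StringBuilder(s);
        int j = 0;
        for (int i = 0; i < sb.Length; ++i)
        {
            if (char.IsDigit(sb[i])) sb[j++] = sb[i];
        }
        sb.Length = j; // truncate to the number of digits kept
        return sb.ToString();
    }

    // Runs f(input) the given number of times and returns the elapsed time.
    public static TimeSpan Time(Func<string, string> f, string input, int iterations)
    {
        f(input); // warm-up call so JIT compilation is not included in the timing
        var sw = Stopwatch.StartNew();
        for (int n = 0; n < iterations; ++n) f(input);
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Run()
    {
        const string input = "(914) 395-1430";
        const int iterations = 1000000;
        Console.WriteLine("regex:   " + Time(ViaRegex, input, iterations));
        Console.WriteLine("builder: " + Time(ViaInPlaceBuilder, input, iterations));
    }
}
```

Calling `StripBenchmark.Run()` from your `Main` prints one elapsed time per variant. The numbers depend on the runtime, the build configuration, and the input length, so treat them as a hint about your own workload only.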

Nicholas Carey
  • 9
    Wow, what a detailed answer. You have answered more than my question, which gives me more options and a better understanding. Users like you are making SO a great community. Thanks a lot!!! – meda Oct 03 '13 at 21:11
22

Use a regular expression:

 string result = Regex.Replace(phoneNumber, @"[^\d]", "");
COLD TOLD
  • Nice one-liner. Or `... @"[^\d]+" ...` ("+" added to the expression) to gather more characters at a time per replacement. Haven't tested whether that makes a speed difference. – ToolmakerSteve Jun 01 '18 at 22:44
12

Try something like this:

  return new String(input.Where(Char.IsDigit).ToArray());
BRAHIM Kamel
10
string phoneNumber = "(914) 395-1430";
var numbers = String.Join("", phoneNumber.Where(char.IsDigit));
L.B
4

He means everything @gleng

Regex rgx = new Regex(@"\D");
str = rgx.Replace(str, "");
Tim S.
Darka
  • The `@` is important! – Jerry Oct 03 '13 at 19:37
  • Thanks both. But can you remind me why? I was using it some time ago but don't remember why. Is it to say that it is text? :/ – Darka Oct 03 '13 at 19:41
  • The `@` makes the literal a verbatim string, so you don't need to escape the backslash. Without it, C# treats `\D` as an unrecognized escape sequence and refuses to compile; you would have to write `"\\D"` instead. – Jerry Oct 03 '13 at 19:43
  • yea something remember. Thanks again. – Darka Oct 03 '13 at 19:44
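
To illustrate the point about `@` from the comments above: a verbatim literal `@"\D"` and a regular literal with a doubled backslash, `"\\D"`, are the same two-character string, so either can be handed to `Regex`. A small sketch follows; the class and member names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical names, for illustration only.
public static class VerbatimDemo
{
    // Both literals produce the same pattern: a backslash followed by 'D'.
    public const string Verbatim = @"\D";
    public const string Escaped = "\\D";

    // \D matches any character that is not a decimal digit.
    public static string StripNonDigits(string s)
    {
        return Regex.Replace(s, Verbatim, "");
    }
}
```

The verbatim form is usually easier to read once a pattern contains several backslashes.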
2

Instead of a regular expression, you can use a LINQ method:

phoneNumber = String.Concat(phoneNumber.Where(c => c >= '0' && c <= '9'));

or:

phoneNumber = String.Concat(phoneNumber.Where(Char.IsDigit));
Guffa