8

What's the shortest regex that can match non-zero floating point numbers with any number of decimal places?

It should accept numbers like

-1
-5.9652
-7.00002
-0.8
-0.0500
-0.58000
0.01
0.000005
0.9900
5
7.5
7.005

but reject constructions such as

.
.02
-.
-.996
0
-0
0.
-0.
-0.000
0.00
--
..
+
+0
+1
+.
+1.26
etc.

I do not need support for scientific notation (e, E and such).
The language I'm using is C#, by the way.

luvieere
  • 3
    Ah, I smell a "my regex is shorter than yours" contest. The things geeks fight over... :) – Chen Levy Dec 02 '09 at 08:31
  • Which side does `0.0` fall on: accept or reject? – YOU Dec 02 '09 at 09:15
  • 1
    You forgot to include `0.000` in your test cases, most of the early answers accept it, but it's still zero in my book. :) –  Dec 02 '09 at 09:19
  • How about 001.000? Accept, right? – YOU Dec 02 '09 at 09:40
  • I'm willing to pass `001.000` as accepted, provided the solution is short enough. – luvieere Dec 02 '09 at 09:42
  • Allowing leading zeros will cause confusion, since they are almost universally disallowed. By the same token, trailing zeros are almost always accepted (and can be significant). –  Dec 02 '09 at 10:38

4 Answers

7
^-?(0\.\d*[1-9]|[1-9]\d*(\.\d+)?)$

EDIT Updated to reflect new requirements (last decimals can be zero)

^-?(0\.\d*[1-9]\d*|[1-9]\d*(\.\d+)?)$

(Shorter than using lookahead: ^-?(0\.(?=\d*[1-9])\d+|[1-9]\d*(\.\d+)?)$.)


EDIT2 If e.g. 001.000 can pass

^-?(?=.*[1-9])\d+(\.\d+)?$
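
A quick way to check these patterns is to run them over the accept and reject lists from the question. The sketch below does that in Python rather than C#, on the assumption that this particular syntax (`\d`, `[1-9]`, `(?=...)`) behaves the same in both engines for ASCII input. The names `STRICT`, `LENIENT` and `failures` are made up for illustration.

```python
import re

# Test cases copied from the question.
ACCEPT = ["-1", "-5.9652", "-7.00002", "-0.8", "-0.0500", "-0.58000",
          "0.01", "0.000005", "0.9900", "5", "7.5", "7.005"]
REJECT = [".", ".02", "-.", "-.996", "0", "-0", "0.", "-0.", "-0.000",
          "0.00", "--", "..", "+", "+0", "+1", "+.", "+1.26"]

# The updated pattern (no leading zeros) and the EDIT2 pattern (leading zeros allowed).
STRICT = re.compile(r"^-?(0\.\d*[1-9]\d*|[1-9]\d*(\.\d+)?)$")
LENIENT = re.compile(r"^-?(?=.*[1-9])\d+(\.\d+)?$")

def failures(pattern):
    """Return the test strings that the pattern classifies incorrectly."""
    wrong = [s for s in ACCEPT if not pattern.search(s)]
    wrong += [s for s in REJECT if pattern.search(s)]
    return wrong

print(failures(STRICT))    # []
print(failures(LENIENT))   # []
print(bool(STRICT.search("001.000")), bool(LENIENT.search("001.000")))  # False True
```

Both patterns classify every listed case correctly; they differ only on inputs with leading zeros such as 001.000.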
jensgram
  • Unfortunately I'm not familiar with the regex specifics in C#. – jensgram Dec 02 '09 at 08:44
  • Yet, fortunately, your syntax was correct. As an addendum, I would go for `^-?(0\.\d*[1-9]\d*|[1-9]\d*(\.\d+)?)$` instead, in order to preserve consistency of being able to enter final zeros after numbers in the (-1, 1) range too, not only after numbers that begin with a positive digit. – luvieere Dec 02 '09 at 09:21
  • 2
    Almost: rejects 0.10. Add another `\d*` after the first `[1-9]`. –  Dec 02 '09 at 09:21
  • Yeah, I was not quite sure as to whether e.g. `0.10` should be rejected or not. I see now that I was less than consistent :) – jensgram Dec 02 '09 at 09:27
  • @luvieere I think my l33t regex0rz skillz are pretty much exhausted by now :) 26 chars seems to be my best shot. – jensgram Dec 02 '09 at 10:36
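
The `0.10` point raised in these comments can be reproduced with a small sketch. It is Python rather than C#, on the assumption that the two engines agree on this syntax; `FIRST`, `FIXED` and `LENIENT` are illustrative names for the three patterns in the answer.

```python
import re

FIRST = r"^-?(0\.\d*[1-9]|[1-9]\d*(\.\d+)?)$"       # original version of the answer
FIXED = r"^-?(0\.\d*[1-9]\d*|[1-9]\d*(\.\d+)?)$"    # extra \d* after the first [1-9]
LENIENT = r"^-?(?=.*[1-9])\d+(\.\d+)?$"             # the variant that allows 001.000

def accepts(pattern, s):
    """True if the anchored pattern matches the whole string."""
    return re.search(pattern, s) is not None

# Trailing zeros after a leading "0." are what separate FIRST from FIXED.
for s in ["0.10", "0.9900", "0.01", "0.00"]:
    print(s, accepts(FIRST, s), accepts(FIXED, s))

# Pattern lengths in characters, for the "shortest" comparison.
print(len(FIRST), len(FIXED), len(LENIENT))
```

The first pattern rejects `0.10` and `0.9900` while the fixed one accepts them; both accept `0.01` and reject `0.00`.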
0

This is the one I always use:

(\+|-)?([0-9]+\.?[0-9]*|\.[0-9]+)([eE](\+|-)?[0-9]+)?

Utilized in a PHP example:

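<?php

// Sample input in scientific notation.
$s = '1.234e4';

// Groups: 1 = sign, 2 = mantissa, 3 = exponent part, 4 = exponent sign.
// The pattern is unanchored, so it can match anywhere in the subject.
preg_match('~(\+|-)?([0-9]+\.?[0-9]*|\.[0-9]+)([eE](\+|-)?[0-9]+)?~', $s, $m);
print_r($m);

?>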

Output:

Array
(
    [0] => 1.234e4
    [1] =>
    [2] => 1.234
    [3] => e4
)
leepowers
0
-?(?!0)\d+(\.\d+)?

Note: Remember to add the ^ and $ anchors if your regex matcher doesn't anchor the pattern for you.

May I ask why the "shortest"? A precompiled regex, or the same pattern with non-capturing groups, could be faster. A separate test for zero might be faster too.
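
For illustration, here is roughly what those two ideas might look like. This is a Python sketch, not C#, and makes no claim about speed; that would need measuring. `PRECOMPILED` is the pattern above with its group made non-capturing. `DECIMAL_SHAPE` and `is_nonzero_two_step` are hypothetical names for the "test for zero separately" idea, and the two approaches are not equivalent: the two-step check accepts values such as 0.01.

```python
import re

# The pattern from this answer, compiled once, with a non-capturing group.
PRECOMPILED = re.compile(r"-?(?!0)\d+(?:\.\d+)?")

# Hypothetical alternative: check the decimal shape, then test for zero separately.
DECIMAL_SHAPE = re.compile(r"-?(?:0|[1-9]\d*)(?:\.\d+)?")

def matches_precompiled(s):
    # fullmatch anchors at both ends, so no ^ or $ is needed in the pattern.
    return PRECOMPILED.fullmatch(s) is not None

def is_nonzero_two_step(s):
    # A well-formed decimal is non-zero exactly when it contains a digit 1-9.
    return DECIMAL_SHAPE.fullmatch(s) is not None and any(c in "123456789" for c in s)

print(PRECOMPILED.groups)            # 0 capturing groups
print(matches_precompiled("5.02"))   # True
print(is_nonzero_two_step("0.01"))   # True
print(is_nonzero_two_step("-0.000")) # False
```

The zero test looks at digits instead of converting to a float, so very small values do not underflow to zero.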

Wernight
  • I want the shortest as it will go someplace in a XAML file and I want to keep it as brief as possible. – luvieere Dec 02 '09 at 09:09
  • A difference of 5-15 bytes matters enough to disregard performance and clarity? –  Dec 02 '09 at 09:16
  • Characters in a regexp don't matter much once it's compiled. A RegExp evaluator is a finite state machine. There are many ways to improve an FSM graph, and that's what some compilers do. In short, there is no direct relation between the RegExp string length and its evaluation speed. – Wernight Dec 02 '09 at 09:24
  • I'm not concerned about speed, but about visual compactness. I wouldn't want a big regex in XAML; it's hard to follow. – luvieere Dec 02 '09 at 09:30
  • What is the `!` sign? I've tested this and it doesn't work, it fails with 5.02 and many others... – luvieere Dec 02 '09 at 09:37
  • `(?!...)` is a negative lookahead assertion. At that point, the `...` expression must not match for the assertion to succeed. http://www.regular-expressions.info/lookaround.html –  Dec 02 '09 at 09:52
  • luvieere: Code-golf ("what's the shortest way to do ... regardless of other considerations") is **not** about making code that is easy to follow. Code should be concise but not necessarily the shortest possible. It definitely sounds like you won't be editing this regex often, so why do you care if it's 5 characters or 25? Take a look at a language like APL, which definitely produced *short* (in terms of characters) programs. http://en.wikipedia.org/wiki/APL_(programming_language)#Examples –  Dec 02 '09 at 10:01
  • 1
    This matches 5.02 for me (not tested in C# though), lack of `(?!)` support might be your issue? The `.` should've been escaped. (That kind of error is easy to make when you worry about code size instead of other things... :P) –  Dec 02 '09 at 10:05
  • "That kind of error is easy to make" ... tell me about it! And easy to correct as well, just click edit and go right at it. – luvieere Dec 02 '09 at 10:09
  • Incidentally, the `\.` only appeared as `.` because of formatting (fixed now). This won't accept "-1. 5", or did you mean something else? –  Dec 02 '09 at 11:37
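
To make the lookahead and escaping points from these comments concrete, here is a small sketch. It uses Python rather than C#, on the assumption that `(?!...)` behaves the same in both engines. `ANSWER`, `UNESCAPED` and `ok` are illustrative names; the anchors are added as the answer's note suggests.

```python
import re

ANSWER = re.compile(r"^-?(?!0)\d+(\.\d+)?$")     # dot escaped, as intended
UNESCAPED = re.compile(r"^-?(?!0)\d+(.\d+)?$")   # how the pattern first rendered

def ok(pattern, s):
    """True if the anchored pattern matches the whole string."""
    return pattern.search(s) is not None

print(ok(ANSWER, "5.02"))     # True
print(ok(ANSWER, "0"))        # False: (?!0) fails when the next character is 0
print(ok(ANSWER, "0.01"))     # False: the same lookahead blocks every leading 0
print(ok(UNESCAPED, "5x02"))  # True: a bare . matches any character
print(ok(ANSWER, "5x02"))     # False
```

Note that `(?!0)` rejects any number whose integer part starts with 0, so `0.01` from the question's accept list does not match this pattern.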
0

You might wish to consider these variations.

Community
tchrist