After reading the post "ASP.NET request validation causes: is there a list?" about what makes ASP.NET complain about dangerous input, I decided to write my own regular expression to use in a RegularExpressionValidator.

I created a regular expression for testing points 2 and 3 from Travis's accepted answer...

  • 2 - If the & character is in a &# sequence (e.g., &#160; for a non-breaking space), it's a "dangerous string."
  • 3 - If the < character is part of <x (where "x" is any alphabetic character a-z), <!, </, or <?, it's a "dangerous string."

^(.)(&#)+|(<[a-zA-Z!/\?])+(.)$

This seemed to work well in the tester on regexlib.com: it matched everything you'd expect and nothing you wouldn't.

But when I use the expression in an ASP.NET RegularExpressionValidator, the validator fires on any text at all! It does the same in Firefox and IE, and whether EnableClientScript is true or false. I'm using .NET 4.5.1, but I don't expect that makes any difference. Any ideas why it isn't working and how to fix it?

Simon Gymer
  • I'm not sure on your intentions, whether you're writing this just as an exercise or whether you want to use it in your application, but I'd advise allowing HTML in input and concentrate your efforts on correctly encoding output as necessary. "Dangerous HTML" is only dangerous if output incorrectly. See the [OWASP XSS Cheat Sheet](https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet) for more info. – SilverlightFox Jan 31 '14 at 10:41
  • I wonder if I'm looking at it wrong in that I'm trying to allow "safe" input, but the validator matches "unsafe" input instead, so even if the matching worked it would precisely match the opposite of what I want? – Simon Gymer Jan 31 '14 at 10:52
  • In ASP.NET a HttpRequestValidationException is thrown if the user tries to submit a form with potentially dangerous input on it. You can stop this happening by turning off request validation, but I do not want to bypass the request validation (as that's an important part of the security) so I want a graceful way to handle the most common ways for request validation to fail so that the user can correct the input rather than anything else. I could write a custom validator to do this, but I thought a regex would be better. – Simon Gymer Jan 31 '14 at 11:10
  • I see. I prefer to turn it off, but you have to be confident you are outputting stuff correctly. See the XSS part of [my other answer here](http://stackoverflow.com/a/20903746/413180) for some info on ASP.NET request validation and where it can be vulnerable. – SilverlightFox Jan 31 '14 at 11:17

2 Answers

I've not looked into the differences between the ASP.NET RegularExpressionValidator and regexlib.com's parser, but I'd wager that there's something wrong with your regex.

When I tried your regex on regexpal.com, it didn't match anything I'd expect it to match (maybe I haven't understood the requirements properly though).

Edit

The following regex matches only strings that contain none of these sequences:

  • &#
  • <x, where x is any letter (a-z or A-Z)
  • <!
  • </
  • <?

Here it is:

^((?!(&#)|(<[a-zA-Z!/\?])).)*$

See it in action at RegexPal.com

Please see this question for details of inverse regex.
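
To sanity-check the pattern above outside a browser tester, here is a minimal Python sketch. The `is_safe` helper and the sample strings are my own for illustration, and Python's `re` engine is assumed to behave like the .NET and JavaScript engines for this particular pattern. It uses `fullmatch` because the whole input has to match for the value to count as safe.

```python
import re

# Pattern from the answer above: no character may start a "dangerous" sequence.
SAFE_INPUT = re.compile(r"^((?!(&#)|(<[a-zA-Z!/\?])).)*$")

def is_safe(value: str) -> bool:
    """Hypothetical helper: True when the whole value matches the pattern."""
    return SAFE_INPUT.fullmatch(value) is not None

samples = [
    "naughty&input",       # plain ampersand, no &# sequence
    "a < b",               # < followed by a space
    "naughty&#160;input",  # contains &#
    "<b>bold</b>",         # < followed by a letter
    "<!-- x -->",          # <!
    "<?php",               # <?
]
for s in samples:
    print(f"{s!r}: {'allowed' if is_safe(s) else 'rejected'}")
```

The first two samples should come out as allowed and the other four as rejected.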

My Original Answer (opposite of requirement)

I came up with this to match input that contains a point 2 sequence or a point 3 sequence, with any characters before and after it.

Here it is:

^.*(&#)+.*$|^.*(<[a-zA-Z!/\?])+.*$

View on RegexPal
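
This also hints at why a validator would fire on ordinary text. As far as I know, RegularExpressionValidator treats a value as valid only when the pattern matches the entire value, so a pattern that describes dangerous input marks safe input as invalid. A rough Python sketch of that behaviour follows; `validator_accepts` is a hypothetical stand-in for the validator, not ASP.NET code.

```python
import re

# "Match dangerous input" pattern from the original answer above.
DANGEROUS = re.compile(r"^.*(&#)+.*$|^.*(<[a-zA-Z!/\?])+.*$")

def validator_accepts(pattern: re.Pattern, value: str) -> bool:
    """Hypothetical stand-in: the value is valid only if the whole value matches."""
    return pattern.fullmatch(value) is not None

print(validator_accepts(DANGEROUS, "hello world"))   # False: safe text is flagged
print(validator_accepts(DANGEROUS, "<b>hello</b>"))  # True: dangerous text passes
```

That is the reverse of what a validator needs, which is why the inverted pattern in the edit is the one to use.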

theyetiman
  • Input with &# in it or < followed by an alphabetic character or ! or / or ? should not be allowed. So "naughtyinput" would be not allowed, but "naughty&input" would be allowed and "naughty – Simon Gymer Jan 31 '14 at 10:54
  • Please see my edit - I think that new regex is what you're after – theyetiman Jan 31 '14 at 11:19

I think this is what you are looking for:

&#[!?a-zA-Z/]+

Although, I can't understand your question very well, so I may need some correction.

Vasili Syrakis