-2

I have a form which generates MySQL query at run-time after submitting this form. This query includes series of CASE WHEN END statements like this:

CASE 
    WHEN <any_user_defined_condition> THEN <any_user_defined_value>
    ELSE <any_user_defined_value>
END,
CASE 
    WHEN <any_user_defined_condition> THEN <any_user_defined_value>
    WHEN <any_user_defined_condition> THEN <any_user_defined_value>
    ELSE <any_user_defined_value>
END,
CASE 
    WHEN <any_user_defined_condition> THEN <any_user_defined_value>
    ELSE <any_user_defined_value>
END

Please suggest me a PHP regular expression which can validate above format so that I can validate MySQL query format before saving it into the database.

Thanks a ton in advance!

Edited:

I have few drop-downs through which user can create certain pre-decided conditions which is known as <any_user_defined_condition> in my question. And of course, I am validating it strictly before allowing the user to submit it. So overall, <any_user_defined_condition> will be replaced by some AGGREGATE functions or some MYSQL built-in functions along-with user-specified strings (which will be again 100% SQL-Injection safe). Ultimately, I don't care about the formatting (whether this entire example written in one line or in multiple lines) but I am actually concerned about the style of string.

d.coder
  • 1,720
  • 15
  • 23
  • Do you want to validate the style of writing the `CASE WHEN THEN ELSE END` or write an expression inside it? Since I understood the title wrong – Toleo Apr 09 '18 at 13:30
  • Validate style of writing? Ultimately, I want to save a valid executable query in the database. – d.coder Apr 09 '18 at 13:33
  • 1
    Maybe `(?s)CASE.*?END`? Showing some attempts would help. – chris85 Apr 09 '18 at 13:39
  • 1
    Saving executable queries in the DB sounds like a bad/dangerous design. – chris85 Apr 09 '18 at 13:41
  • @chris85: This query basically is to generate dynamic reports at run-time and I am saving only `SELECT` and `WHERE` part of the entire query in DB which will be combined with the actual table at a later stage depending upon several other parameters. And because this entire process is done in admin panel so I don't think it is dangerous. Anyways, thanks for bringing it up but this is not the topic to discuss. – d.coder Apr 09 '18 at 13:48

2 Answers2

1

As I undersood secondly, That you want to verify the Query writing on CASE WHEN condition, Try the following RegEx:

with Group Capture in case you wanted to check each part and do a process on it

/((CASE\s+((WHEN\s+.+\s+THEN\s+.+)(\s+WHEN\s+.+THEN\s+.+)*)\s+ELSE\s+.+\s+END)(,\s*CASE\s+((WHEN\s+.+THEN\s+.+)(\s+WHEN\s+.+\s+THEN\s+.+)*)\s+ELSE\s+.+\s+END)*)/i

Like this: https://regex101.com/r/PamqfQ/4


Or for full check with non-Group Capture in case you just want to validate the entire string

/(?:(?:CASE\s+(?:(?:WHEN\s+.+\s+THEN\s+.+)(?:\s+WHEN\s+.+THEN\s+.+)*)\s+ELSE\s+.+\s+END)(?:,\s*CASE\s+(?:(?:WHEN\s+.+THEN\s+.+)(?:\s+WHEN\s+.+\s+THEN\s+.+)*)\s+ELSE\s+.+\s+END)*)/i 

Like this: https://regex101.com/r/PamqfQ/5


You can replace the .+ with the allowed characters or words you want.

Toleo
  • 714
  • 4
  • 19
  • Very close but it is not validating if there is two `WHEN` statements within one `CASE` like: `CASE WHEN THEN WHEN THEN ELSE END` – d.coder Apr 09 '18 at 14:25
  • I was testing this regex with following string but it failed to validate it: `$regex = preg_match("/((CASE\s+((WHEN\s+.+THEN\s+.+)(\s+WHEN\s+.+THEN\s+.+)*)\s+.+\s+ELSE\s+.+\s+END)(,\s*CASE\s+((WHEN\s+.+THEN\s+.+)(\s+WHEN\s+.+THEN\s+.+)*)\s+.+\s+ELSE\s+.+\s+END)*)/i", "case when sum(finance_detail_savings.runrate) then 10 else 0 end", $matches);` It results in empty arrays. – d.coder Apr 10 '18 at 09:12
  • @d.coder Yes, My mistake, forgot to check it after changing it in version 3, Check it now, version 4 captures the groups, and version 5 matches it all at once without capturing group, pick what you please. – Toleo Apr 10 '18 at 13:59
0

It's a design limitation of regular expressions that they cannot be used to parse arbitrary recursive expressions. For example:

(1+2)/(3+(4))

You cannot make a regular expression that validates an arithmetic expression that has nested parentheses.

Since the SQL CASE expression includes "any user-defined condition" then it includes recursive expressions. Therefore you can't validate it with a regular expression.

Are you allowing user input to be used verbatim as CASE expressions? This is called SQL injection, and it's considered to be a security flaw.

It would be better for you to allow users to choose conditions, but not use their input as the code itself. Generate the code from their choices.

Your users are just going to have to be satisfied with choosing a few options for report types. Don't give them a form to submit their own code.

If you were an electrician, and someone told you to line the electrical panel with flammable material, you would tell them, "that's a fire hazard, and I'm not allowed to do it." They might say, "it's okay, it's an admin panel." And you would say, "nevertheless, it's a fire hazard, and a bad idea."


Re your comments:

So there are two ways to understand what you are doing. One is that your dropdown allows the user to choose the expression, and the value of the dropdown is a partial SQL string, which is then sent as part of the form. That's why you would need to validate it in the server-side code.

An alternative technique is to make the dropdown values be simply "1", "2", "3", "4", etc. and those values are sent as the form submission. On the sever-side, these values are used to choose partial SQL strings that are predefined in your server-side code.

If you use the latter method, then you don't need to validate anything, because there's no way any client can spoof the form submission. They can only choose an expression, not specify a literal expression. The server-side code doesn't need to validate what it already has.

Bill Karwin
  • 462,430
  • 80
  • 609
  • 762
  • I got your point and completely agree with you. Actually, I don't want to get into this part just to save mine as well as others time. But because now you have posted it as an answer then let me clear this. What you mentioned as a safety precaution is exactly what is happening in my admin panel. I have few drop-downs through which user can create certain pre-decided conditions which is known as `` in my question. And of course, I am validating it strictly before allowing the user to submit it. – d.coder Apr 09 '18 at 14:01
  • So overall, `` will be replaced by some `AGGREGATE` functions or MYSQL `built-in` functions along-with user-specified strings (which will be again 100% SQL-Injection safe) – d.coder Apr 09 '18 at 14:04
  • The alternative technique you've suggested is fine but I am operating on multiple Databases and trust me it will add huge overhead on server-side script just to maintain these partial SQL strings. That's why it was mutually decided to split it in two-part, some part through FORM and rest critical part at server-side. – d.coder Apr 09 '18 at 14:35
  • 1
    Okay I will trust you've already considered it. Still, you can't use regular expressions to parse any SQL expression. :) – Bill Karwin Apr 09 '18 at 15:17