-2

I spent two hours trying to figure it out first. I tested my expression on several regexp online services and on several versions of mySQL. I divided the originally more complex expression into groups and chose the one that I thought was working incorrectly. Can you please tell me why in mySQL 8.0.15 regexp works like this:

>select '(77' REGEXP '^((\.([0-9]+))?)$';

>1

I thought \. could only mean a dot (.), but in fact, any symbol. Where am I wrong?

o_nerti
  • 11
  • 7
  • 3
    `'\.'` is `.` that matches the first `7`, you need `'\\.'`. *Because MySQL uses the C escape syntax in strings (for example, `\n` to represent the newline character), you must double any ``\`` that you use in your **expr** and **pat** arguments.* – Wiktor Stribiżew Jan 10 '20 at 10:41
  • Thank you so much for answering, you're right, I should have escaped the backslash. Tell me, please, if it's not difficult, why did my question cause a negative reaction? I would like to understand and avoid something like that in the future. – o_nerti Jan 10 '20 at 10:48
  • Avoid asking questions like "what does this code/regex do". Errors like yours are just typos since these are not real issues. Before asking what a regex means make sure you go through the string basics. Test regexps at http://regex101.com to see what they mean. MySQL 8+ regex is ICU based, so that site will be very helpful. – Wiktor Stribiżew Jan 10 '20 at 10:52
  • Of course, I spent two hours trying to figure it out first. I tested my expression on several regexp online services and on several versions of mySQL. I divided the originally more complex expression into groups and chose the one that I thought was working incorrectly. I also know very well about the problem of character escaping. But, unfortunately, I could not guess that these things are related. That's why I approached the expert community with a question. I still do not understand, honestly, how exactly I should ask a question here. – o_nerti Jan 10 '20 at 11:00
  • Just like this. You never know what the reaction will be. Again, avoid asking "why does it work like this" with just 3-4 lines in the question. Make sure you add the efforts like in the comment above in the question body. But this question is a dupe anyway. – Wiktor Stribiżew Jan 10 '20 at 11:01

1 Answers1

-1

(The suggested dup talks about word boundaries; this Question is more about backslashes and dot.)

I believe the main confusion has to do with clients turning backslashes and with the regexp processor tossing it when it is irrelevant.

Also, the test case gets confusing because dot matches anything, including dot.

I tested these

SELECT
   '468' REGEXP '4.8',
   '468' REGEXP '4\.8',
   '468' REGEXP '4\\.8',   -- fail
   '4\.8' REGEXP '4\.8',
   '4\.8' REGEXP '4\\.8',
   '4\\.8' REGEXP '4\.8',   -- fail
   '4\\.8' REGEXP '4\\.8',   -- fail
   '4\\.8' REGEXP '4\\\\.8',
   length('4\.8'),   -- 3
   length('4\\.8'),   -- 4
   length('4\\\\.8');   -- 5, hence 2 backslashes and a dot
[ 1 1   0   1   1   0   0   1   3   4   5 ]

SELECT
   '.' REGEXP '.',
   'x' REGEXP '.',
   '.' REGEXP '\.',
   'x' REGEXP '\.',
   '.' REGEXP '\\.',
   'x' REGEXP '\\.',   -- fail
   '\\x' REGEXP '\\.',   -- fail
   'just dot',
   length('.'),   -- 1
   length('\.'),   -- 1, hence dot
   length('\\.');   -- 2, hence backslash + dot
 [ 1    1   1   1   1   0   0   just dot    1   1   2 ]

Same results on all of these: MariaDB 10.0.28, MariaDB 10.1.43, MariaDB 10.2.30, MariaDB 10.3.21, MySQL 5.5.42, Percona 5.6.22, MySQL 8.0.17, Percona 5.5.45

The client was Perl with DBIx::DWIW, which possibly does the minimal amount of backslash stripping.

They succeed, except as noted. The lengths give a clue of how many backslashes have been stripped by the time MySQL sees it as a string.

Rick James
  • 106,233
  • 9
  • 103
  • 171
  • The answer I linked to says exactly what is required: *The ``\`` char should be escaped in order to define a literal backslash, since ``\`` itself allows escaping characters for the MySQL engine. See this MySQL 8 documentation: Because MySQL uses the C escape syntax in strings (for example, `\n` to represent the newline character), you must double any ``\`` that you use in your expr and pat arguments.* Please re-close. – Wiktor Stribiżew Jan 12 '20 at 23:51