-1

My PHP code receives a $request from an AJAX call. I am able to extract the $name from this parameter. As this name is in German, the allowed characters also include ä, ö and ü.

I want to validate $name = "Bär" via preg_match. I am sure that the ä arrives correctly as a UTF-8 encoded string in my PHP code. But if I do this

preg_match('/^[a-zA-ZäöüÄÖÜ]*$/', $name);

I get false, although it should be true. I only get true if I do

preg_match(utf8_encode('/^[a-zA-ZäöüÄÖÜ]*$/'), $name);

Can someone explain this to me, and also how I can set PHP to globally encode every string as UTF-8?

Steevie
  • Of course you get false, when you apply a pattern that allows for only _one single character_ from beginning to end, on a value consisting of _three_ characters … You missed a proper quantifier after the character class here. – CBroe Apr 08 '20 at 11:59
  • You are right, I missed the * in my post, but that was not the problem. The problem was my IDE. The answer from Joni below was the right one and it helped. – Steevie Apr 08 '20 at 16:32
  • I just corrected my post and added the * to the code. That was a copying mistake but not the reason for the problem. The reason for the problem was my IDE. See the answer from Joni below. – Steevie Apr 08 '20 at 16:37

2 Answers

1

PHP strings do not have any specific character encoding. String literals contain the bytes that the interpreter finds between the quotes in the source file.

You have to make sure that the text editor or IDE that you are using is saving files in UTF-8. You'll typically find the character encoding in the settings menu.
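To make the byte-level point concrete, here is a minimal sketch. It assumes the mbstring extension is available and writes the bytes with \x escapes, so the result does not depend on how the example file itself is saved. The variable names are only illustrative. It also suggests why wrapping the pattern in utf8_encode appeared to help: that function converts ISO-8859-1 bytes to UTF-8, which is what a pattern literal from a Latin-1 file needs (note that utf8_encode is deprecated as of PHP 8.2; mb_convert_encoding is used here instead).

```php
<?php
// "Bär" as raw bytes in two encodings.
$latin1 = "B\xE4r";       // as an ISO-8859-1 editor would store it
$utf8   = "B\xC3\xA4r";   // as a UTF-8 editor would store it

var_dump(bin2hex($latin1));                     // string(6) "42e472"
var_dump(bin2hex($utf8));                       // string(8) "42c3a472"
var_dump(mb_check_encoding($latin1, 'UTF-8'));  // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));    // bool(true)

// The same character class, once with Latin-1 bytes and once with UTF-8 bytes.
$patternLatin1 = "/^[a-zA-Z\xE4\xF6\xFC]*\$/";
$patternUtf8   = mb_convert_encoding($patternLatin1, 'UTF-8', 'ISO-8859-1');

// Without the u modifier PCRE compares byte by byte.
$matchLatin1 = preg_match($patternLatin1, $utf8);  // 0: byte C3 is not in the class
$matchUtf8   = preg_match($patternUtf8, $utf8);    // 1: bytes C3 and A4 are both in the class

var_dump($matchLatin1, $matchUtf8);
```

The second match only succeeds because every byte of the subject happens to be in the class. It is not a true per-character check, so saving the file as UTF-8 remains the actual fix.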

Joni
  • That's it. Thanks so much. I would never have assumed that the IDE is the problem. I am using Eclipse and changed the file encoding to UTF-8 now. I used the description at https://stackoverflow.com/questions/9180981/how-to-support-utf-8-encoding-in-eclipse and it helped. – Steevie Apr 08 '20 at 16:30
-1

Your regular expression is wrong: without a quantifier it only tests a single character. The + quantifier stands for one or more characters. If your PHP code is saved as UTF-8 (without BOM), the u modifier is also required so that the pattern and the subject are treated as Unicode.

$name = "Bär";
$result = preg_match('/^[a-zA-ZäöüÄÖÜ]+$/u', $name);

var_dump($result);  //int(1)

To cover all German special characters, the ß is still missing from the list.
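A sketch of a variant that includes ß and does not depend on the encoding of the source file, since the characters are written as \x{...} code points inside the pattern. The function name isGermanName is hypothetical and only serves the illustration. The last two lines show what happens when the subject is not valid UTF-8: with the u modifier preg_match returns false and preg_last_error() reports PREG_BAD_UTF8_ERROR.

```php
<?php
// Hypothetical helper: accepts only ASCII letters plus ä ö ü Ä Ö Ü ß.
function isGermanName(string $name): bool
{
    // Code points: ä=E4 ö=F6 ü=FC Ä=C4 Ö=D6 Ü=DC ß=DF
    $pattern = '/^[a-zA-Z\x{E4}\x{F6}\x{FC}\x{C4}\x{D6}\x{DC}\x{DF}]+$/u';
    // preg_match returns 1 on match, 0 on no match, false on error.
    return preg_match($pattern, $name) === 1;
}

var_dump(isGermanName("B\xC3\xA4r"));     // bool(true): "Bär" in UTF-8
var_dump(isGermanName("Stra\xC3\x9Fe"));  // bool(true): "Straße" in UTF-8
var_dump(isGermanName("B\xE4r"));         // bool(false): Latin-1 bytes are not valid UTF-8
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR);  // bool(true)
```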

jspit
  • Many thanks for replying, but the problem was my IDE (see the answer from Joni). You are right that I missed the * in my post, but that was a mistake when copying. Also, the /u did not work for me. It resulted in an error, which I now understand, as the file was not UTF-8 encoded. – Steevie Apr 08 '20 at 16:36