I have a string like: Chào Bạn.

In PHP, I want to convert it to lowercase and remove all special characters and whitespace.

Input: Chào Bạn

Output: chaoban

My PHP code:

$string = 'Chào Bạn';
$newString = preg_replace('/\s+/', '', $string);
echo strtolower($newString);

The result is chàobạn.

The whitespace is removed, but I can't remove the special characters.

Pavel_K
vanloc
  • Possible duplicate of [Regular Expression Sanitize (PHP)](https://stackoverflow.com/questions/3022185/regular-expression-sanitize-php) – vv01f Aug 19 '17 at 07:57
  • Transliteration is always tricky, but check out [`iconv()`](http://php.net/manual/en/function.iconv.php). – BenM Aug 19 '17 at 07:59
  • Why do you use preg_replace to remove a space? Such a waste of computing power. – Andreas Aug 19 '17 at 08:00
  • Do you think `preg_replace` takes much more time than `str_replace`? @Andreas – vanloc Aug 19 '17 at 08:02
  • There is no such thing as a "special character". What should be "special" about some of them? There are only characters, lots of them. UTF-8, for example, defines roughly 112000 characters. – arkascha Aug 19 '17 at 08:03
  • @arkascha http://php.net/manual/en/function.htmlspecialchars.php – rndus2r Aug 19 '17 at 08:07
  • Possible duplicate of [How to convert special characters to normal characters?](https://stackoverflow.com/questions/9720665/how-to-convert-special-characters-to-normal-characters) – rndus2r Aug 19 '17 at 08:09
  • Thanks @rndus2r. I should use `iconv()` in my case. – vanloc Aug 19 '17 at 08:14
  • @vanloc Yes, `preg_replace` is normally slower and uses more memory than `str_replace`. – Andreas Aug 19 '17 at 08:16
  • https://3v4l.org/7D9DY/perf#output vs https://3v4l.org/q0RMG/perf#output – Andreas Aug 19 '17 at 08:18
  • @Andreas That's helpful information. I'm now using `str_replace` instead of `preg_replace`. – vanloc Aug 19 '17 at 08:19
  • Hi @Andreas, with the text `Bất động sản`, `iconv` doesn't seem to work. My language is Vietnamese. Here is the link to test my code: http://sandbox.onlinephpfunctions.com/code/41e654b8ad09594fb7540272cba0fc456615401d – vanloc Aug 19 '17 at 08:35
  • @rndus2r That refers to a very specific task. It 1. is not a general statement about characters but about handling a specific, clearly defined set of characters used as control characters in a language and 2. is poorly worded: the issue is that the term "special characters" is widespread, which does not do much good since it encourages people to think of "good", normal characters and "special", other, second-grade characters one should get rid of. A stupid position. – arkascha Aug 19 '17 at 09:17
  • @arkascha Not true, special characters in the PHP field are very well defined. htmlspecialchars/entities (HTML), mysql(i)_real_escape_string (MYSQL) and mb_* (general string functions) are really good examples of that. In this case, OP even gave you an example of what he meant by special characters (^[:alphanumeric:]) and he wanted to downgrade those special characters to alphanumeric characters. tl;dr: special characters depend on context, yes, but context was given and clear. – rndus2r Aug 19 '17 at 09:53
  • @rndus2r That is _exactly_ what I wrote about those php functions and it says _nothing_ about the general applicability of the term "special characters". So why do you say "Not true"? And why do you then claim that the example the OP gave should define the context the term was meant in this question? It clearly is not! There clearly is _no_ specific content defined here, the OP said _nothing_ about any context or situation or purpose of his task. He gave an example, or better attempt, yes, but you certainly can not derive any definition from some attempt. You just claim things. – arkascha Aug 19 '17 at 10:47
  • You are generalizing whereas we are talking about a specific case. And then again you're talking about a special case when you actually want to generalize. OP has the power to define the context the term was meant in, it is his question after all. He is _not_ redefining the term, he is _using_ it in a pre-defined manner. These are big differences. Special characters are bound to their context and in those fields they are well defined. That's a fact (see all the functions above). OP btw approved the duplicate question, so yes, he defined it clearly. You might not have seen that. – rndus2r Aug 19 '17 at 10:53
  • Sorry about my question. I know my mistake: I did not define the term `special character`. In an earlier comment I mentioned that I test with characters of the Vietnamese language (it has many special characters). The two examples, `Chào bạn` and `Bất động sản`, are demo data for testing. I don't think I need this anymore because I found another approach to solve my problem. Thanks, @rndus2r and @arkascha. – vanloc Aug 19 '17 at 21:47
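
For the Vietnamese test strings mentioned in the comments, one approach that does not depend on how `iconv` transliterates is an explicit mapping table. The thread does not say which approach the asker ended up using, so treat this as a minimal sketch under stated assumptions: the helper name `foldVietnamese` is hypothetical, the input is assumed to be UTF-8 in precomposed (NFC) form, and the `mbstring` extension is assumed to be available.

```php
<?php
// Hypothetical helper (not from the thread): lowercase the text, fold
// Vietnamese letters to ASCII with an explicit table, then drop the rest.
function foldVietnamese(string $text): string
{
    // Assumption: $text is UTF-8 with precomposed (NFC) letters.
    $text = mb_strtolower($text, 'UTF-8');

    // ASCII base letter => accented lowercase variants used in Vietnamese.
    $map = [
        'a' => 'àáạảãâầấậẩẫăằắặẳẵ',
        'e' => 'èéẹẻẽêềếệểễ',
        'i' => 'ìíịỉĩ',
        'o' => 'òóọỏõôồốộổỗơờớợởỡ',
        'u' => 'ùúụủũưừứựửữ',
        'y' => 'ỳýỵỷỹ',
        'd' => 'đ',
    ];
    foreach ($map as $ascii => $variants) {
        // The /u flag makes PCRE match whole UTF-8 characters, not bytes.
        $text = preg_replace('/[' . $variants . ']/u', $ascii, $text) ?? $text;
    }

    // Remove whitespace and any remaining non-alphanumeric characters.
    return preg_replace('/[^a-z0-9]/u', '', $text) ?? '';
}

echo foldVietnamese('Chào Bạn'), PHP_EOL;     // chaoban
echo foldVietnamese('Bất động sản'), PHP_EOL; // batdongsan
```

The table only covers Vietnamese letters; text in decomposed (NFD) form or in other languages would need normalization or a larger table first.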

2 Answers

You can use the function below for your solution.

function clean($string) {
   $string = strtolower($string); // Lowercase the string (byte-based, not multibyte-aware).
   $string = str_replace(' ', '', $string); // Remove all spaces.

   // Remove every character that is not an ASCII letter, digit, or hyphen.
   return preg_replace('/[^A-Za-z0-9\-]/', '', $string);
}
Jalpa
  • You don't need the `str_replace`, `preg_replace` handles it as well – rndus2r Aug 19 '17 at 08:05
  • Your answer produces: `chobn` – vanloc Aug 19 '17 at 08:05
  • Hi @Jalpa, add the line `$string = iconv('utf-8', 'ascii//TRANSLIT', $string);` before the `return` in your `clean()` function and remove the `str_replace` line, as @rndus2r said. Then it works. I will accept your answer once you edit it. – vanloc Aug 19 '17 at 08:13
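
The edit suggested in the last comment could look roughly like the sketch below. It is an illustration, not the answerer's code: the fallback on failure and the removal of the hyphen from the character class are assumptions made here. What `//TRANSLIT` produces depends on the iconv implementation and the current locale, so characters it cannot map may come out as `?` and then be stripped.

```php
<?php
// Sketch of clean() with the iconv() step from the comment added.
function clean(string $string): string
{
    // Transliterate to ASCII first. iconv() returns false on failure, and the
    // //TRANSLIT result varies by platform and locale (assumption: a
    // glibc-style iconv is available).
    $ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
    if ($ascii === false) {
        $ascii = $string; // Fall back to the raw input; non-ASCII is stripped below.
    }

    // Remove whitespace, punctuation, and anything left unmapped (often "?").
    $stripped = preg_replace('/[^A-Za-z0-9]/', '', $ascii) ?? '';

    return strtolower($stripped);
}

echo clean('Chào Bạn'), PHP_EOL;
```

As reported in the comments on the question, `iconv` did not handle `Bất động sản` in the asker's environment, so the output should be checked on the target platform.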

Try this:

$string = 'Chào Bạn';
$newString = preg_replace("/[^A-Za-z0-9]/", "", $string);
echo strtolower($newString);
BenM
Vaghani Janak
  • This is not an answer because it will simply remove all non-alphanumeric characters. OP wants those special characters converted to normal characters. – rndus2r Aug 19 '17 at 09:54