13

It's all in the question title: I have a PHP script saved as a UTF-8 file, and in it I want to do this:

  <?php
  echo "âêïû\n";
  ?>

If I run it in a Windows prompt I get this :

C:\php>php -c C:\WINDOWS\php.ini -f mysqldump.php
âêïû
C:\php>

I have not been able to find the right conversion. I also tried this code, which tests every pair of encodings that mbstring knows about:

$tab = mb_list_encodings();
foreach ($tab as $enc1) {
  foreach ($tab as $enc2) {
    $t=mb_convert_encoding("âêïû\n", $enc1, $enc2);
    if (strlen($t)<14) {
      echo $enc1." ".$enc2." = ".$t."\n";
    }
  }
}

None of the combinations produced the right output!

Any help would be greatly appreciated.

Dragonthoughts
Olivier Pons

4 Answers

19

The problem is that the Windows command line does not use UTF-8 by default. According to this link, if you follow these steps:

  1. Open a command prompt window
  2. Change the properties of the window to use something other than the default raster font. The Lucida Console TrueType font seems to work well.
  3. Run "chcp 65001" from the command prompt

you should be able to output UTF-8.
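
To see why a console on its default code page garbles the output, it helps to look at the bytes. The following is a minimal sketch, assuming a single-byte code page such as Windows-1252; the function name `asSeenByConsole` is made up for illustration. It reinterprets the UTF-8 bytes as that code page, which is roughly what the console does when it prints them.

```php
<?php
// Hypothetical helper: shows how a console on a single-byte code page would
// display a UTF-8 string, by treating every byte as one character.
function asSeenByConsole(string $utf8, string $codePage = 'CP1252'): string
{
    // Pretend the raw bytes are $codePage text and re-encode them as UTF-8.
    $seen = iconv($codePage, 'UTF-8', $utf8);
    if ($seen === false) {
        throw new RuntimeException("Bytes are not valid in $codePage");
    }
    return $seen;
}

// "âêïû", written with escapes so the encoding of this file does not matter.
$text = "\u{E2}\u{EA}\u{EF}\u{FB}";

echo strlen($text), " bytes\n";    // 8: each accented letter takes two bytes in UTF-8
echo bin2hex($text), "\n";         // c3a2c3aac3afc3bb
echo asSeenByConsole($text), "\n"; // eight characters instead of four
```

The exact garbage depends on the active code page, but the doubling of characters is the same in every single-byte code page.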

Doug T.
  • +1 Good to know that you can change the encoding in the shell – Peter Bailey Oct 30 '09 at 15:18
  • Okay I've tried "chcp 65001". Now each time I run "php -c C:\WINDOWS\php.ini -f mysqldump.php | more" I get a "Out of memory error". Then I try without the " | more" (this was too risky for Windows I guess grrr) and the script stops itself at the beginning... – Olivier Pons Oct 30 '09 at 15:19
  • can you try just "echo 'hello world'; "? – Doug T. Oct 30 '09 at 15:22
  • 2
    Please note that the default character set of the Windows command line is **NOT** ISO-8859-1 but rather Windows-1252 (at least for Latin1 / Western Europe). – Stefan Gehrig Oct 30 '09 at 16:22
  • `chcp 65001` is a hack and doesn't support full UTF-8 or multibyte input – Alastair McCormack Jan 29 '16 at 20:22
8

You put me on the right track, but there was kind of a problem (I love Windows \o/):

C:\php>chcp 65001
Page de codes active : 65001
C:\php>php -c C:\WINDOWS\php.ini -f mysqldump.php | more
Mémoire insuffisante.

Mémoire insuffisante = not enough memory.

If I try

C:\php>chcp 1252
C:\php>php -c C:\WINDOWS\php.ini -f mysqldump.php
C:\php>ééîîïïÂÂÂÂâûü

it works. Only God knows why. But it works. Thanks for putting me on the right track !!

By the way, the PHP code to convert properly from UTF-8 to the command prompt is:

  echo mb_convert_encoding($utf8_string, "pass", "auto");
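
If the console is known to be on code page 1252, as in the `chcp 1252` session above, naming the target encoding explicitly may be easier to reason about than "pass" and "auto". This is only a sketch under that assumption, using `iconv`; the helper name `utf8ToCp1252` is hypothetical.

```php
<?php
// Hypothetical helper: converts UTF-8 text to Windows-1252 bytes.
// Assumes the console code page is 1252 (chcp 1252).
function utf8ToCp1252(string $utf8): string
{
    $converted = iconv('UTF-8', 'CP1252', $utf8);
    if ($converted === false) {
        // iconv fails when a character has no Windows-1252 equivalent.
        throw new RuntimeException('Text cannot be represented in Windows-1252');
    }
    return $converted;
}

// "âêïû", written with escapes so the encoding of this file does not matter.
$utf8_string = "\u{E2}\u{EA}\u{EF}\u{FB}";
$out = utf8ToCp1252($utf8_string);

// Print the bytes as hex so the result is readable on any terminal.
echo bin2hex($out), "\n"; // e2eaeffb: one byte per accented letter
```

On a console set to code page 1252, `echo $out;` would then print the four accented letters.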
Olivier Pons
  • Yep I'm sure, I'm using it on www.acarat.com which is full utf-8 site – Olivier Pons Nov 01 '09 at 13:25
  • 1
    God doesn't know why! – markus Nov 19 '13 at 00:21
  • 1
    The BTW append to the end save me lots of headaches as `mb_convert_encoding($utf8_string, "pass", "auto")` is also the way to READ/WRITE UTF-8 named files locally in windows. – lalengua Mar 20 '16 at 18:45
  • @lalengua what does 2nd param `"pass"` mean? I've found only `"auto"` for 3rd one: "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" (c) php.net – vladkras Sep 14 '16 at 04:54
  • @vladkras Hard to find, but, as you know `mb_convert_encoding` function does a double process: first it decodes the string and then encodes it again with a new encoding. The `pass` constant is defined [here](https://github.com/php/php-src/blob/1c295d4a9ac78fcc2f77d6695987598bb7abcb83/ext/mbstring/libmbfl/mbfl/mbfilter_pass.c) as `mbfl_no_encoding_pass` in the source code of PHP and it means that the function will return a (Unicode?) string, not encoded at all. Maybe a later process in console encodes it again? Source graph [here](http://fossies.org/dox/php-5.6.26/mbfilter__pass_8c.html) – lalengua Sep 23 '16 at 03:22
1

Try this alternative. It works with the Russian encoding; I hope it will work with French too:

class ConsoleHelper
{
    /**
     * @var boolean
     */
    private static $isEncodingSet = false;

    /**
     * @param string $message
     * @return string
     */
    public static function encodeMessage($message)
    {
        $isWindows = (DIRECTORY_SEPARATOR == '\\');
        if ($isWindows) {
            if ( ! self::$isEncodingSet) {
                shell_exec('chcp 866');
                self::$isEncodingSet = true;
            }
            $message = iconv('utf-8', 'cp866', $message);
        }
        return $message;
    }
}
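
The same idea can be generalized so that the code page is a parameter instead of being hard-coded to 866. This is a rough sketch, not the answerer's code: the function name and the `$isWindows` parameter are assumptions, added so the conversion can be exercised on any platform, and the `chcp` call is deliberately left out.

```php
<?php
// Hypothetical variant of encodeMessage(): the target code page is passed in.
function encodeForConsole(string $message, string $codePage, ?bool $isWindows = null): string
{
    // Same platform check as the class above, unless the caller overrides it.
    if ($isWindows === null) {
        $isWindows = (DIRECTORY_SEPARATOR === '\\');
    }
    if (!$isWindows) {
        return $message; // assume a UTF-8 terminal elsewhere
    }
    $converted = iconv('UTF-8', $codePage, $message);
    // Fall back to the original text if the conversion fails.
    return $converted === false ? $message : $converted;
}

// "Привет", written with escapes so the encoding of this file does not matter.
$russian = "\u{41F}\u{440}\u{438}\u{432}\u{435}\u{442}";

echo strlen($russian), "\n";                                  // 12 bytes in UTF-8
echo strlen(encodeForConsole($russian, 'CP866', true)), "\n"; // 6 bytes in CP866
```

For French text the call would pass a different code page name, for example `'CP850'` or `'CP1252'`, depending on what the console is actually set to.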
user3551026
  • You're using `866` and I guess for french it's `1252`. Anyway I've given up on Windows 6 years ago, I'm more than happy I've made the switch forever to Linux Mint (except for some games, of course). To be polite, Windows is not my cup of tea anymore. Of course I'm *much more* rude when I talk about it. Here politeness is important **`<8^D`** – Olivier Pons Nov 21 '18 at 08:53
  • For french it's cp850. – Maxence Jun 24 '20 at 16:44
0

It looks like the default encoding is code page 437.
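
Rather than guessing, the current code page can be queried from PHP. This is a small sketch, assuming PHP 7.1 or later on Windows, where the CLI provides `sapi_windows_cp_get()`. The wrapper name `consoleCodePage` is made up, and it returns null wherever that function is unavailable.

```php
<?php
// Hypothetical wrapper: returns the code page PHP reports as current, or null
// when sapi_windows_cp_get() is not available (non-Windows or older PHP).
function consoleCodePage(): ?int
{
    if (!function_exists('sapi_windows_cp_get')) {
        return null;
    }
    return sapi_windows_cp_get();
}

$cp = consoleCodePage();
echo $cp === null ? "code page unknown\n" : "active code page: $cp\n";
```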

Michas