On Windows, the output of PHP command-line scripts is interpreted according to the currently configured code page and console font. Here in Western Europe this often defaults to code page 850 and a bitmap (raster) font. That means that a script written as UTF-8 (the de facto standard since PHP/5.4):

<?php
echo 'Café: 1,25 €' . PHP_EOL;

... will typically look this way:

C:\tmp>php test.php
Caf├®: 1,25 Ôé¼

The usual workaround is to use a *.bat wrapper:

@echo off
chcp 65001 > NUL
php test.php

It doesn't fix the font issue but it's normally good enough.

My double question:

  1. Is it possible to set the code page from within PHP so we can omit the wrapper? (Using program execution functions to run chcp does not work because it happens in a different process.)

  2. Is this a limitation of the console libraries used by PHP? (Node.js scripts always display correct output from UTF-8 sources no matter the local code page, font aside.)

Álvaro González
    To properly support multi-byte Unicode in the Windows console, programs must interact with the low-level console API. AFAIK, Java and Node.js do this already and Python has a drop-in module to enable it. If you have no luck with PHP directly, then I suppose you could write a wrapper in a supporting language that takes UTF-8 from PHP and renders it correctly. – Alastair McCormack Jan 29 '16 at 20:18

1 Answer

Answer to your question 1:

You can try iconv. Note that this changes the encoding of the output, not the console code page. However, the result is that the console and the script output use the same encoding, which is what matters (see this post):

iconv("UTF-8", "CP1252", $data); // copied from example on php.net

Wrapping it in a function gives you a quite convenient tool to output strings to the console:

function message($string)
{
  // iconv() returns the converted string, so it has to be echoed
  echo iconv("UTF-8", "CP1252", $string);
}

So instead of:

echo $string;

Use:

message($string);

You can go even further by getting the current console code page from your code:

function getCodePage()
{
  $consoleEncoding = explode(":", exec("chcp"));
  return trim($consoleEncoding[1]);
}

That lets you change the message function so the script always uses the active code page:

function message($string)
{
  // iconv() returns the converted string, so it has to be echoed
  echo iconv("UTF-8", "CP" . getCodePage(), $string);
}
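
If you want something a bit more defensive, here is a rough sketch of the same idea. The function names parseCodePage and toConsole are made up for illustration. It assumes that chcp prints the code page number as the last number on its line (localized Windows versions use a different label), and it adds //TRANSLIT so iconv tries to approximate characters the target code page lacks (code page 850 has no euro sign, for instance). How //TRANSLIT behaves depends on the iconv implementation PHP was built with, so treat this as a starting point rather than a finished solution:

```php
<?php
// Hypothetical helpers for illustration; not part of PHP or of the answer above.

/**
 * Extract the numeric code page from chcp output such as
 * "Active code page: 850". Returns null when no number is found.
 */
function parseCodePage(string $chcpOutput): ?int
{
    // Take the last run of digits so the localized label does not matter.
    if (preg_match('/(\d+)\D*$/', $chcpOutput, $m)) {
        return (int) $m[1];
    }
    return null;
}

/**
 * Convert a UTF-8 string for a console that uses the given code page.
 * Returns the string unchanged if the code page is unknown, is already
 * UTF-8 (65001), or the conversion fails.
 */
function toConsole(string $utf8, ?int $codePage): string
{
    if ($codePage === null || $codePage === 65001) {
        return $utf8;
    }
    $converted = @iconv('UTF-8', 'CP' . $codePage . '//TRANSLIT', $utf8);
    return $converted === false ? $utf8 : $converted;
}

// Only call chcp where it exists.
if (PHP_OS_FAMILY === 'Windows') {
    $codePage = parseCodePage((string) exec('chcp'));
    echo toConsole('Café: 1,25 €' . PHP_EOL, $codePage);
}
```

Keeping the parsing separate from the exec() call means you can run chcp once, cache the result, and reuse it for every message instead of spawning a process per line of output.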
rsl