5

I developed a PHP script that should connect to a Pervasive database system:

$connection_string = "Driver={Pervasive ODBC Client Interface};ServerName=127.0.0.1;dbq=@test"; 
$conn = odbc_connect($connection_string,"administrator","password");

When I execute a query, the returned data is not UTF-8. mb_detect_encoding tells me the encoding is ASCII. I tried to convert the data with iconv, but that doesn't work. So I tried the following to change the encoding after the script has connected:

odbc_exec($conn, "SET NAMES 'UTF8'");
odbc_exec($conn, "SET client_encoding='UTF-8'");

But nothing helps! Can anyone help me? Thanks.

------------------------------ edit -------------------------------

Here is the complete script, because nothing has worked so far:

class api {

    function doRequest($Url){
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $Url);
        curl_setopt($ch, CURLOPT_REFERER, "http://www.example.org/yay.htm");
        curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_ENCODING, 'UTF-8');
        $output = curl_exec($ch);
        curl_close($ch);
    }

}

$connection_string = "Driver={Pervasive ODBC Client Interface};ServerName=127.0.0.1;dbq=@test;Client_CSet=UTF-8;Server_CSet=UTF-8"; 
$conn = odbc_connect($connection_string,"administrator","xxx");

if ($conn) {

    $sql = "SELECT field FROM table where primaryid = 102"; 
    $cols = odbc_exec($conn, $sql);

    while( $row = odbc_fetch_array($cols) ) { 

        $api = new api(); 
        // --- 1 ---
        $api->doRequest("http://example.de/api.html?value=" . @urlencode($row["field"])); 
        // --- 2 ---
        $api->doRequest("http://example.de/api.html?value=" . $row["field"]); 
        // --- 3 ---
        $api->doRequest("http://example.de/api.html?value=" . utf8_decode($row["field"])); 

    }

}

The server log says the following:

--- 1 --- [24/May/2016:14:05:07 +0200] "GET /api.html?value=Talstra%E1e+7++++++++++++++++++++++++++++++++++++++++++++++++ HTTP/1.1" 200 93 "http://www.example.org/yay.htm" "MozillaXYZ/1.0"
--- 2 --- [24/May/2016:11:31:10 +0200] "GET /api.html?value=Talstra\xe1e 7                                                 HTTP/1.1" 200 83 "http://www.example.org/yay.htm" "MozillaXYZ/1.0"
--- 3 --- [24/May/2016:14:05:07 +0200] "GET /api.html?value=Talstra?e 7                                                 HTTP/1.1" 200 93 "http://www.example.org/yay.htm" "MozillaXYZ/1.0"

%E1 stands for á, but it should be ß (German character)

\xe1 stands for á, but it should be ß (German character)

Tobias Bambullis

5 Answers

4

Your database is in extended ASCII, not "just ASCII"

The clue lies here:

%E1 stand for á, but it should be ß (german character)

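%E1, or 225 in decimal, stands for á in ISO-8859-1 (Latin-1), which matches the Unicode code point U+00E1. In the "extended ASCII" DOS code pages CP437 and CP850, the same byte is ß. Hold Alt and type 225 on the numeric keypad, and you get a ß.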

This also explains the following statement from your question:

If I execute a query, the returning data is not UTF8.

That is because the data isn't UTF-8.

What you have in your database is extended ASCII characters. Regular ASCII (bytes 0 to 127) is a subset of UTF-8; the extended range (bytes 128 to 255) is not, so those bytes have to be converted from the code page they were written in.
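To make the byte-versus-character point concrete, here is a minimal sketch that decodes the single byte from the server log under three encodings. It assumes the iconv extension is available and knows the names CP850 and CP437, which is the case on common glibc builds.

```php
<?php
// The single byte 0xE1 (decimal 225), as seen in the server log.
$byte = "\xE1";

// Decode the same byte under three different single-byte encodings.
$asLatin1 = iconv("ISO-8859-1", "UTF-8", $byte);
$asCp850  = iconv("CP850", "UTF-8", $byte);
$asCp437  = iconv("CP437", "UTF-8", $byte);

// Print the character and its UTF-8 bytes in hex.
echo $asLatin1, " ", bin2hex($asLatin1), "\n"; // á c3a1
echo $asCp850,  " ", bin2hex($asCp850),  "\n"; // ß c39f
echo $asCp437,  " ", bin2hex($asCp437),  "\n"; // ß c39f
```

The byte never changes; only the code page used to interpret it does, which is why the log shows á where the database holds ß.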

If you tried this, it won't work, because bytes above 127 are not valid ASCII:

iconv("ASCII", "UTF-8", $string);
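As a rough illustration of that failure, the sketch below feeds the logged bytes to iconv with ASCII as the source encoding. The behaviour shown (a notice plus a false return value) is what a typical glibc-based PHP build does; other iconv implementations may differ.

```php
<?php
// The street name exactly as the bytes appear in the server log.
$string = "Talstra\xE1e 7";

// Strict ASCII ends at byte 127, so 0xE1 is an illegal input character.
// iconv() raises a notice and returns false; the @ only hides the notice.
$result = @iconv("ASCII", "UTF-8", $string);

var_dump($result === false);
```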

You can try this first, because it's the least invasive. It looks like MySQL supports cp850, but these statements are MySQL/PostgreSQL syntax, so Pervasive may simply reject them. Put this at the top of your script:

odbc_exec($conn, "SET NAMES 'CP850'");
odbc_exec($conn, "SET client_encoding='CP850'");

This might work if your original assertion (that the data is ASCII) is correct, since CP437 is the original IBM PC extended ASCII:

iconv("CP437", "UTF-8", $string);

or this, following my initial hunch that your database holds Latin-1 characters stored in CP850:

iconv("CP850", "UTF-8", $string);

IBM CP850 has all the printable characters that ISO-8859-1 (Latin-1) has, just at different positions: ß is at 225 in CP850 but at 223 in ISO-8859-1.

You can see the position of ß in the table on this page: https://en.wikipedia.org/wiki/Western_Latin_character_sets_%28computing%29

As a drop-in replacement for the existing code in your question, see if this works:

    // --- 1 ---
    $api->doRequest("http://example.de/api.html?value=" . iconv("CP850", "UTF-8", $row["field"]));
    // --- 2 ---
    $api->doRequest("http://example.de/api.html?value=" . iconv("CP850", "UTF-8", $row["field"]));
    // --- 3 ---
    $api->doRequest("http://example.de/api.html?value=" . iconv("CP850", "UTF-8", $row["field"]));

This will work if your entire database is in the same encoding.
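Because the converted value ends up in a URL, it probably also needs percent-encoding. The following is a minimal sketch, not part of the original script: fieldToQueryValue is a hypothetical helper name, and dropping the trailing spaces assumes they are only fixed-width column padding, as the server log suggests.

```php
<?php
// Hypothetical helper, not part of the original script.
function fieldToQueryValue(string $raw, string $sourceEncoding = "CP850"): string
{
    // Assumption: trailing spaces are fixed-width column padding.
    $trimmed = rtrim($raw, " ");

    // Convert from the database code page to UTF-8.
    $utf8 = iconv($sourceEncoding, "UTF-8", $trimmed);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $sourceEncoding");
    }

    // Percent-encode the UTF-8 bytes so the URL carries only ASCII.
    return rawurlencode($utf8);
}

echo fieldToQueryValue("Talstra\xE1e 7     "), "\n"; // Talstra%C3%9Fe%207
```

In the loop from the question, this would be used as "http://example.de/api.html?value=" . fieldToQueryValue($row["field"]).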

If your database isn't consistently adhering to one encoding, it might be possible that no one answer is completely right. If that is the case, you can also try the answer here, but with a different encoding:

Latin-1 / UTF-8 encoding php

// If it's not already UTF-8, convert to it
if (mb_detect_encoding($row["field"], 'utf-8', true) === false) {
    $row["field"] = mb_convert_encoding($row["field"], 'utf-8', 'iso-8859-1');
}
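Adapted to the encoding this question points to, the same idea might look like the sketch below. toUtf8 is a hypothetical helper name, and CP850 as the fallback is an assumption based on ß appearing at byte 225. It uses mb_check_encoding for the validity test and iconv for the conversion.

```php
<?php
// Hypothetical helper: leave valid UTF-8 alone, convert everything else.
// Assumption: any value that is not valid UTF-8 is CP850.
function toUtf8(string $value, string $fallbackEncoding = "CP850"): string
{
    if (mb_check_encoding($value, "UTF-8")) {
        return $value; // already valid UTF-8 (plain ASCII also passes)
    }
    $converted = iconv($fallbackEncoding, "UTF-8", $value);
    // If the conversion fails, hand back the original bytes unchanged.
    return $converted === false ? $value : $converted;
}

echo bin2hex(toUtf8("\xE1")), "\n";     // c39f (ß converted from CP850)
echo bin2hex(toUtf8("\xC3\x9F")), "\n"; // c39f (ß was already UTF-8)
```

The check is a heuristic: a CP850 value whose bytes happen to form valid UTF-8 would be passed through unconverted, so this only helps when the data really is a mix of both.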

My real answer is: if you can, insert the data as UTF-8 in the first place, so you don't have problems like this. Of course, that is not always possible.

Reference:

Force encode from US-ASCII to UTF-8 (iconv)

Paul Stanley
  • thank you so much! Finally I changed the following and it works: $connection_string = "Driver={Pervasive ODBC Client Interface};ServerName=127.0.0.1;dbq=@test;Client_CSet=UTF-8;Server_CSet=CP850"; and iconv("CP850", "UTF-8",$row["field"]) – Tobias Bambullis May 31 '16 at 08:31
2

Try adding Client_CSet=UTF-8 to your connection string.

DonBoitnott
  • It doesn't work for me, so I added the whole script to my question. Do you have an idea? – Tobias Bambullis May 24 '16 at 10:00
  • There is a matching `Server_CSet` option that might be required. Allows you to specify both the server and client encoding to facilitate collation. – DonBoitnott May 24 '16 at 11:16
  • I have updated my question with your changes and the results again. This is very annoying; I have tried almost everything possible. It would be great if you had another idea :) – Tobias Bambullis May 24 '16 at 12:19
  • I am starting to suspect that the database encoding is not what you think it is. Open PCC, right-click the database, click Properties. In Code Page, click the `Change connection encoding` link. On the pop-up, what is shown for the `PCC connection encoding` drop-down? – DonBoitnott May 24 '16 at 14:06
2

If you know the encoding used on the server, try adding this to your connection string:

Client_CSet=UTF-8;Server_CSet=SERVER_ENCODING // for example WINDOWS-1251
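For reference, a sketch of the full connection string with both options filled in. The CP850 value is the one the asker reported as working in a follow-up comment (together with iconv("CP850", "UTF-8", ...) on each field); ServerName, dbq and the credentials are the placeholders from the question, and whether your driver honours these options is something to verify.

```php
<?php
// Option values as reported by the asker; adjust Server_CSet to your data.
$options = [
    "Driver"      => "{Pervasive ODBC Client Interface}",
    "ServerName"  => "127.0.0.1",
    "dbq"         => "@test",
    "Client_CSet" => "UTF-8",
    "Server_CSet" => "CP850",
];

// Join the pairs into the "key=value;key=value" form ODBC expects.
$pairs = [];
foreach ($options as $key => $value) {
    $pairs[] = $key . "=" . $value;
}
$connection_string = implode(";", $pairs);

echo $connection_string, "\n";

// With the ODBC extension loaded, connect as in the question:
// $conn = odbc_connect($connection_string, "administrator", "password");
```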
1

Make sure that your database charset is UTF-8, then try this:

$connection_string = "Driver={Pervasive ODBC Client Interface};ServerName=127.0.0.1;dbq=@test;charset=UTF-8";

This may help with the encoding.

mohamed abdallah
0

First, try:

$connection_string = "Driver={Pervasive ODBC Client Interface};ServerName=127.0.0.1;dbq=@test;  CharacterSet => UTF-8"; 
$conn = odbc_connect($connection_string,"administrator","password");

Let me know if it works; I'll try to help out. I had a similar problem a while ago :)

KikiTheOne