I'm trying to display text from a MySQL table (utf8_general_ci collation) in FPDF by converting it with PHP's utf8_decode. Everything worked as expected, but when our customer entered his texts, every ü showed up as u?, and so on.

The problem is that I can't tell the difference between his ü and mine. Both show up fine in phpMyAdmin or our CMS. Once I replace his ü with one typed by me it works.

What's the hidden difference here?

Steffen
  • possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – deceze Oct 24 '13 at 11:15
  • If you're using `utf8_decode` anywhere in your front-to-back UTF-8 workflow, you're *not* using UTF-8 all the way through. See aforelinked duplicate. – deceze Oct 24 '13 at 11:16
  • Please read my question. I don't want to use Unicode all the way through, I decode for a reason. My question is not general Unicode setup, I'm asking how there can be an apparent encoding difference within a single MySQL field. (I don't think that's possible, but that's how it looks.) Again, everything works perfectly fine, except for FPDF, where there's a curious mismatch between `ü` entered by me and `ü` entered by our customer. But the mismatch doesn't show up anywhere else, so I don't know how to debug it. I would expect one to be bungled in phpMyAdmin, but both look just fine. – Steffen Oct 24 '13 at 11:54
  • If a character shows up incorrectly, that's because it is being interpreted in the wrong encoding. FPDF expects text to be encoded in some specific encoding, but you're giving it text encoded in something else. Boom, encoding mismatch, screwed up characters. How exactly it's screwing up I don't know without more details. You have *at least* one encoding conversion by using `utf8_decode`, that's already a bad sign. I don't know how many more conversions you have at what point. Figure it out by reading the other question, or http://kunststube.net/frontback/ or http://kunststube.net/encoding. – deceze Oct 24 '13 at 12:00
  • You may try a different font for some special UTF-8 characters, e.g. "arialuni.ttf" – kwelsan Oct 24 '13 at 12:46
  • I use `utf8_decode` exactly because FPDF needs it, hardly a bad sign. Your unwillingness to understand my specific question is really confusing. I got my answer elsewhere and will be posting a reply myself later. – Steffen Oct 24 '13 at 12:55
  • I must admit I jumped to conclusions, because most questions involving UTF-8 and especially `utf8_decode` are simply problems of mistreating encodings. It would have helped a lot if you had specified explicitly which encoding you expect at which point. Since FPDF can also work just fine with UTF-8 directly if so instructed, `utf8_decode` seemed an easy culprit. – deceze Oct 24 '13 at 14:23

2 Answers

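So our client entered his texts by copying them from Word. I exported the table and opened it with a text editor. My ü was the single precomposed character (U+00FC), while his ü was only the canonical equivalent: as far as I can tell, a plain u followed by a combining diaeresis (U+0308). utf8_decode converts to ISO-8859-1, which has no combining diaeresis, so it replaced that code point with a question mark and returned u?. This also explains why the error didn't show up anywhere else: both forms render identically in phpMyAdmin and our CMS.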
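A minimal sketch of one way to work around this, assuming the text can be normalized to NFC before it is handed to FPDF. The helper names `toNfc` and `toLatin1` are hypothetical, not part of FPDF. The sketch uses `Normalizer` from the intl extension when it is available and otherwise falls back to a small hand-written map that only covers the German umlauts. It also assumes the mbstring extension and uses `mb_convert_encoding` in place of `utf8_decode`, which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helpers for illustration; not part of FPDF.

// Fallback map (German umlauts only) for systems without ext-intl.
const DECOMPOSED_TO_NFC = [
    "a\u{0308}" => "\u{00E4}", "o\u{0308}" => "\u{00F6}", "u\u{0308}" => "\u{00FC}",
    "A\u{0308}" => "\u{00C4}", "O\u{0308}" => "\u{00D6}", "U\u{0308}" => "\u{00DC}",
];

// Compose "letter + combining mark" sequences into single code points (NFC).
function toNfc(string $utf8): string
{
    if (class_exists('Normalizer')) {
        $nfc = Normalizer::normalize($utf8, Normalizer::FORM_C);
        return $nfc === false ? $utf8 : $nfc; // false means invalid input
    }
    return strtr($utf8, DECOMPOSED_TO_NFC);
}

// Normalize first, then convert to ISO-8859-1 for FPDF's standard fonts.
function toLatin1(string $utf8): string
{
    return mb_convert_encoding(toNfc($utf8), 'ISO-8859-1', 'UTF-8');
}

// Hex dumps make the otherwise invisible difference visible.
$decomposed = "Gru\u{0308}n";                // u + U+0308, as pasted from Word
echo bin2hex($decomposed), "\n";             // 477275cc886e
echo bin2hex(toNfc($decomposed)), "\n";      // 4772c3bc6e
echo bin2hex(toLatin1($decomposed)), "\n";   // 4772fc6e
```

Comparing `bin2hex` output for the two versions of the same word is also a quick way to confirm the diagnosis without exporting the table.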

Steffen

Try this code:

iconv('UTF-8', 'windows-1252', $value);

Jecker