Difference between UTF encodings?

Question

I have a simple question: what is the difference between UTF-8, UTF-16 and UTF-32? I know that the encoded strings have different sizes, but what are UTF-16 and UTF-32 for? Shouldn't UTF-8 be able to handle all languages correctly? And how does UTF-7 fit into this?

EDIT

OK, I more or less understand the technical side of this, but I still don't see why I should use, for example, UTF-16 instead of UTF-8 in my app. So my question is: what is the practical use of encodings other than UTF-8?

I just would like to know some practical example of UTF-32 let's say. Does it have any real application? — Petr Mensik, Jun 15 '12 at 20:53

score 3 · Accepted Answer · answered Jun 10 '12 at 17:38

This article by the famous Joel Spolsky explains it perfectly: http://www.joelonsoftware.com/articles/Unicode.html

Quote:

There are hundreds of traditional encodings which can only store some code points correctly and change all the other code points into question marks. Some popular encodings of English text are Windows-1252 (the Windows 9x standard for Western European languages) and ISO-8859-1, aka Latin-1 (also useful for any Western European language). But try to store Russian or Hebrew letters in these encodings and you get a bunch of question marks. UTF 7, 8, 16, and 32 all have the nice property of being able to store any code point correctly.
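To make the point in the quote concrete, here is a minimal Python sketch (not part of the article). The sample string and variable names are my own, and `errors="replace"` is what produces the question marks; other tools may fail or substitute differently.

```python
text = "Привет"  # sample Russian text (an assumption for illustration)

# Latin-1 has no Cyrillic letters; with errors="replace" every
# character that cannot be encoded is turned into "?".
legacy = text.encode("latin-1", errors="replace")

# Each UTF encoding can represent every code point, so encoding and
# decoding again gives back the original string.
round_trips = {
    name: text.encode(name).decode(name) == text
    for name in ("utf-7", "utf-8", "utf-16", "utf-32")
}

print(legacy)       # six question marks, one per Cyrillic letter
print(round_trips)  # True for all four UTF encodings
```

Without `errors="replace"`, Python raises `UnicodeEncodeError` for the Latin-1 case instead of silently losing data.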

score 0 · Answer 2 · answered Jun 10 '12 at 17:36

Perhaps the Unicode FAQ would help?

There is a comparison chart that summarises some of the differences.
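As a rough illustration of the kind of difference such a chart covers, the sketch below counts the bytes each encoding needs for one character. The sample characters and the helper `encoded_sizes` are my own choices, and the `-le` codec variants are used only so that no byte order mark is added to the count.

```python
# Sample characters from different ranges (chosen for illustration).
samples = {
    "ASCII letter": "A",      # U+0041
    "Cyrillic letter": "Я",   # U+042F
    "CJK ideograph": "中",     # U+4E2D
    "Emoji": "😀",             # U+1F600, outside the Basic Multilingual Plane
}

def encoded_sizes(ch):
    """Return the number of bytes `ch` occupies in each encoding."""
    # The little-endian variants do not prepend a byte order mark.
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

for label, ch in samples.items():
    print(label, encoded_sizes(ch))
```

UTF-8 uses 1 to 4 bytes per code point, UTF-16 uses 2 or 4, and UTF-32 always uses 4. That is why UTF-16 can be more compact than UTF-8 for CJK text, and why UTF-32 allows finding the n-th code point at byte offset 4*n without scanning.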

MutterMumble
