Questions tagged [character-set]

A character set maps a set of characters to specific numeric values, e.g. ASCII, UTF-8 and ISO-8859-1.

Modern programming languages, editors and tools encode and decode data between their internal representation and a specific character set. Examples include ASCII, UTF-8 and ISO-8859-1.

Choose an appropriate character set when transmitting and persisting data, particularly text that can contain accented or other non-ASCII characters (as in European languages such as French or German) or that is written in a different script altogether (such as Japanese). See internationalisation (also referred to as i18n).
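
As a rough illustration of why the choice matters, the Python sketch below checks whether a piece of text survives being encoded to a given character set and decoded back. The helper name `round_trips` and the sample strings are made up for this example; only the built-in `str.encode` and `bytes.decode` methods are used.

```python
def round_trips(text: str, charset: str) -> bool:
    """Return True if `text` can be encoded to `charset` and decoded back unchanged."""
    try:
        return text.encode(charset).decode(charset) == text
    except UnicodeEncodeError:
        # The character set has no representation for at least one character.
        return False


# Hypothetical sample strings, chosen only for illustration.
french = "café crème"
japanese = "日本語"

print(round_trips(french, "ascii"))         # False: é and è are not in ASCII
print(round_trips(french, "iso-8859-1"))    # True
print(round_trips(japanese, "iso-8859-1"))  # False: no Japanese characters in Latin-1
print(round_trips(japanese, "utf-8"))       # True

# The same text can occupy a different number of bytes in each character set.
print(len(french.encode("iso-8859-1")), len(french.encode("utf-8")))  # 10 12
```

In this sketch UTF-8 is the only one of the three that represents both samples, at the cost of two bytes for each accented Latin character.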

92 questions

1171

votes

8 answers

What's the difference between utf8_general_ci and utf8_unicode_ci?

Between utf8_general_ci and utf8_unicode_ci, are there any differences in terms of performance?

asked Apr 20 '09 at 03:43

KahWee Teng

12,350
3
19
21

557

votes

21 answers

Best way to convert text files between character sets?

What is the fastest, easiest tool or method to convert text files between character sets? Specifically, I need to convert from UTF-8 to ISO-8859-15 and vice versa. Everything goes: one-liners in your favorite scripting language, command-line tools…

text unicode utf-8 character-set

asked Sep 15 '08 at 17:21

Antti Kissaniemi

17,999
13
51
47

324

votes

4 answers

What does character set and collation mean exactly?

I can read the MySQL documentation and it's pretty clear. But, how does one decide which character set to use? On what data does collation have an effect? I'm asking for an explanation of the two and how to choose them.

mysql database database-design character-set

asked Dec 04 '08 at 16:47

Sander Versluys

67,197
23
79
89

votes

2 answers

About the "Character set" option in Visual Studio

I have an inquiry about the "Character set" option in Visual Studio. The Character Set options are: Not Set; Use Unicode Character Set; Use Multi-Byte Character Set. I want to know what the difference between three options in Character…

visual-studio character-set

asked Feb 19 '12 at 12:58

Lion King

28,712
21
69
128

votes

2 answers

SQL Server: set character set (not collation)

How does one set the default character set for fields when creating tables in SQL Server? In MySQL one does this: CREATE TABLE tableName ( name VARCHAR(128) CHARACTER SET utf8 ) DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; Note…

sql-server character-encoding collation character-set

asked Oct 15 '11 at 22:35

dotancohen

26,432
30
122
179

votes

2 answers

Does using ASCII/Latin Charset speed up the database?

It would seem that using the ASCII charset for most fields and then specify utf8 only for the fields that need it would reduce the amount of I/O the database must perform by 100%. Anyone know if this is true? Update: The above was not really my…

mysql mariadb utf8mb4 character-set

asked Jul 23 '18 at 23:24

mbalsam

votes

2 answers

Can Excel Sort Differently Than Its Default U.S. Character Set?

My question is basically the opposite of THIS ONE (which had a database-based solution I can't use here). I use SAP, which sorts characters this way: 0-9, A-Z, _ but I'm downloading data into Excel and manipulating ranges dependent on correct SAP…

excel vba sap customization character-set

asked May 08 '20 at 00:42

wiigame

votes

3 answers

vb.net character set

According to MSDN vb.net uses this extended character set. In my experience it actually uses this: What am I missing? Why does it say it uses the one and uses the other? Am I doing something wrong? Is there some sort of conversion tool to the…

vb.net character-encoding character-set

asked Nov 16 '10 at 14:23

Connor Albright

votes

2 answers

Determining ISO-8859-1 vs US-ASCII charset

I am trying to determine whether to use PrintWriter pw = new PrintWriter(outputFilename, "ISO-8859-1"); or PrintWriter pw = new PrintWriter(outputFilename, "US-ASCII"); I was reading All about character sets to determine the character set of an…

java character-encoding ascii iso-8859-1 character-set

asked Jun 10 '15 at 08:02

vikingsteve

34,284
19
101
142

votes

1 answer

Can individual tags override the Character Set in the Specific Character Set (0008,0005)

If I create a DICOM object with a basic single byte Specific Character Set like (0008,0005) = ISO_IR 100, can one of the tags use a different 2-byte Character set? For example can Patient Name (0010,0010) be encoded in Simplified Chinese (ISO 2022…

dicom character-set

asked May 13 '20 at 17:10

tracy hunter

votes

2 answers

Why is there a need to add a '0' to indexes in order to access array values?

I am confused with this line: sum += a[s[i] - '0']; To give some context, this is the rest of the code: #include using namespace std; int main() { int a[5]; for (int i = 1; i <= 4; i++) cin >> a[i]; string s; …

c++ char indices character-set

asked Apr 02 '20 at 14:47

Zachary

votes

1 answer

Checking CharacterSet for single UnicodeScalar yields strange behaviour

While working with CharacterSet I've come across an interesting problem. From what I have gathered so far CharacterSet is based around UnicodeScalar; you can initialise it with scalars and check if a scalar is contained within the set. Querying the…

swift foundation character-set

asked Sep 22 '17 at 22:31

Michael Waterfall

19,942
26
108
161

votes

2 answers

Parsing of CSV file using Node/Express spits out weird \x001 codes

I'm using Node and Express to fetch a .CSV file from a URL that I want to parse. The process of downloading it works just fine. But when I use csv-parser to parse the file the output in the console looks like this: Just tonnes of lines of weird…

node.js csv text-parsing byte-order-mark character-set

asked Apr 21 '21 at 12:09

Johan Carlberger

votes

1 answer

Is there a way to list all categories in perluniprops?

perluniprops lists the Unicode properties of the version of Unicode it supports. For Perl 5.32.1, that's Unicode 13.0.0. You can obtain a list of the characters that match a category using Unicode::Tussle's unichars. unichars '\p{Close_Punctuation}'…

string perl unicode character-set perluniprops

asked Apr 17 '21 at 13:06

alvas

94,813
90
365
641

votes

2 answers

Getting Arabic characters as ??? in PHP from JDE

I am trying to fetch our Arabic values from JDE Database using the following connection string: $dsn = "Driver={SQL Server};Server=10.10.10.27;Database=JDE;charset=utf8"; $username = "username"; $password = "password"; $string =…

php utf-8 arabic jde character-set

asked Feb 28 '21 at 11:17

Sadeem M.K.
