Is it really necessary to use unsigned char
to hold binary data, as some libraries which work on character encodings or binary buffers do? To make sense of my question, have a look at the code below:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char c[5], d[5];
    c[0] = 0xF0;  /* first byte of the UTF-8 encoding of U+24B62 */
    c[1] = 0xA4;
    c[2] = 0xAD;
    c[3] = 0xA2;
    c[4] = '\0';
    printf("%s\n", c);
    memcpy(d, c, 5);
    printf("%s\n", d);
    return 0;
}
Both printf calls produce the correct output, where
f0 a4 ad a2
is the UTF-8 encoding (in hex) of the Unicode code point U+24B62 (𤭢).
Even memcpy
correctly copied the bits held by each char.
What reasoning could possibly advocate the use of unsigned char
instead of plain char?
In other related questions, unsigned char
is highlighted because it is the only (byte-sized/smallest) data type guaranteed by the C specification to have no padding bits. But as the example above shows, the output doesn't seem to be affected by any padding.
I used VC++ Express 2010 and MinGW to compile the above. Although VC gave the warning
warning C4309: '=' : truncation of constant value
the output doesn't seem to reflect it.
P.S. This could be marked a possible duplicate of Should a buffer of bytes be signed or unsigned char buffer? but my intent is different. I am asking why something which seems to be working as fine with char
should be typed unsigned char
?
Update: To quote from N3337,
Section 3.9 Types
2 For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.
In view of the above fact, and given that my original example ran on an Intel machine (where char
defaults to signed char), I am still not convinced that unsigned char
should be preferred over char.
Anything else?