I hope this is not a silly question at this time of night, but I can't seem to wrap my mind around it.
UTF-8 is a variable-length encoding with a minimum of 8 bits per character. Characters with higher code points can take up to 32 bits.
So UTF-8 can encode Unicode characters using anywhere from 1 to 4 bytes.
Does this mean that in a single UTF-8 encoded string, one character may be 1 byte and another character may be 3 bytes?
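For example (using Python 3 here just to illustrate what I mean, though I assume the same applies in any language), the per-character byte counts in one string really do seem to differ:

```python
# A string whose characters need different numbers of UTF-8 bytes:
# 'a' is U+0061, '€' is U+20AC.
s = "a€"

for ch in s:
    encoded = ch.encode("utf-8")
    print(ch, len(encoded), encoded)

# prints:
# a 1 b'a'
# € 3 b'\xe2\x82\xac'
```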
If so, how, in this example, does a computer decoding the UTF-8 bytes avoid treating those two separate characters as one 4-byte character?