
In C#:

We can use the following classes for encoding:

  • System.Text.Encoding.UTF8
  • System.Text.Encoding.Unicode (UTF-16)
  • System.Text.Encoding.ASCII

Why is there no System.Text.Encoding.Base64?

We can only use the Convert.FromBase64String and Convert.ToBase64String methods. What is special about Base64?

Can I say Base64 is the same kind of encoding method as UTF-8? Or is UTF-8 a kind of Base64?

Vadim Ovchinnikov
Zhongmin

4 Answers


UTF-8 and UTF-16 are methods to encode Unicode strings to byte sequences.

See: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)

Base64 is a method to encode a byte sequence to a string.

So, these are widely different concepts and should not be confused.

Things to keep in mind:

  • Not every byte sequence represents a Unicode string encoded in UTF-8 or UTF-16.

  • Not every Unicode string represents a byte sequence encoded in Base64.
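The two directions, and the two caveats above, can be sketched in a few lines. This is only an illustration in Python 3 rather than C#; the helper names `is_valid_utf8` and `is_valid_base64` are made up for the example and are not part of any library.

```python
import base64
import binascii

text = "héllo"

# Character encoding: Unicode text -> bytes.
utf8_bytes = text.encode("utf-8")

# Base64: bytes -> text (restricted to a small ASCII alphabet).
b64_text = base64.b64encode(utf8_bytes).decode("ascii")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_valid_base64(s: str) -> bool:
    """Hypothetical helper: True if the string is well-formed Base64."""
    try:
        base64.b64decode(s, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False
```

With these definitions, `is_valid_utf8(b"\xff\xfe")` is false (not every byte sequence is UTF-8) and `is_valid_base64("not base64!")` is false (not every string is Base64).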

dtb

Base64 is a way to encode binary data, while UTF-8 and UTF-16 are ways to encode Unicode text. Note that in a language like Python 2.x, where binary data and strings are mixed, you can encode strings into UTF-16 or Base64 the same way:

u'abc'.encode('utf16')
u'abc'.encode('base64')

But in languages where there's a more well-defined separation between the two types of data, the two ways of representing data are generally exposed through quite different APIs, to keep the concerns separate.
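Python 3 is itself an example of such a separation, so the contrast can be shown without leaving the language. The sketch below assumes Python 3.4 or later; `try_text_encode` is a hypothetical helper written for this illustration.

```python
import base64


def try_text_encode(text: str, codec: str):
    """Hypothetical helper: return the encoded bytes, or None if the
    codec is not usable as a text (str -> bytes) encoding."""
    try:
        return text.encode(codec)
    except LookupError:
        return None


# UTF-16 is a text encoding, so str.encode accepts it.
utf16_result = try_text_encode("abc", "utf-16")

# Base64 is not a text encoding in Python 3, so str.encode rejects it.
base64_result = try_text_encode("abc", "base64")

# Base64 lives in its own module and operates on bytes.
b64_bytes = base64.b64encode("abc".encode("utf-8"))
```

Here `base64_result` is `None`, while `b64_bytes` is `b"YWJj"`: the Base64 step only becomes available once the text has been turned into bytes.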

Mike Axiak

UTF-8 is, like the other UTF encodings, a character encoding for the characters of the Unicode character set (UCS).

Base64 is an encoding to represent any byte sequence by a sequence of printable characters (i.e. A-Z, a-z, 0-9, +, and /).

There is no System.Text.Encoding.Base64 because Base64 is not a text encoding but rather a base conversion, like hexadecimal, which uses 0-9 and A-F (or a-f) to represent numbers.
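To make the hexadecimal comparison concrete, here is a small Python 3 sketch that renders the same three bytes in both notations. The sample bytes are arbitrary.

```python
import base64

data = bytes([0x00, 0xFF, 0x10])

# Base 16: each byte becomes two characters from 0-9 and a-f.
hex_text = data.hex()

# Base 64: every three bytes become four characters from A-Z, a-z, 0-9, + and /.
b64_text = base64.b64encode(data).decode("ascii")

# Both notations are reversible back to the same bytes.
from_hex = bytes.fromhex(hex_text)
from_b64 = base64.b64decode(b64_text)
```

Here `hex_text` is `"00ff10"` and `b64_text` is `"AP8Q"`. Neither step involves characters of human-readable text as input; both start from bytes.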

Community
Gumbo

Simply put, a character encoding such as UTF-8 or UTF-16 maps numbers (i.e. bytes) to characters and vice versa; for example, in ASCII, 65 maps to "A". A base encoding, by contrast, is used mainly to translate bytes to bytes so that the bytes produced from each input byte are printable and belong to a subset of the ASCII character encoding. For that reason you can also see Base64 as a bytes-to-text encoding mechanism.

The main reason to use Base64 is to transmit data over a channel that doesn't allow binary data transfer. That said, it should now be clear that you can have a Base64-encoded stream that represents a UTF-8-encoded stream.
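A minimal Python 3 sketch of that layering, assuming a text-only channel: the text is first UTF-8 encoded, and the resulting bytes are then Base64 encoded. The function names `to_wire` and `from_wire` are hypothetical, chosen only for this example.

```python
import base64


def to_wire(text: str) -> str:
    """Hypothetical sender: UTF-8 encode the text, then Base64 the bytes."""
    utf8_bytes = text.encode("utf-8")
    return base64.b64encode(utf8_bytes).decode("ascii")


def from_wire(payload: str) -> str:
    """Hypothetical receiver: undo Base64 first, then undo UTF-8."""
    utf8_bytes = base64.b64decode(payload)
    return utf8_bytes.decode("utf-8")


# In ASCII (and UTF-8), the character "A" is the number 65.
code_for_a = ord("A")

payload = to_wire("A")
round_trip = from_wire(to_wire("naïve café"))
```

The order matters: the receiver has to reverse the steps in the opposite order, Base64 first and the character encoding second.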

S.Bozzoni
    "bytes to bytes": not really (but it might look like that in a language like C). The intent is to acquire text that can be handled as such downstream, perhaps in a text-based wrapper (e.g. XML, HTML, SMTP). That text would then have to be character encoded using a mutually understood character encoding (or it might be already by a particular library's Base64 implementation). One might say the character encoding should be one of the numerous character encodings for which the bytes would be the same (call it ASCII if you must); or, it could be, say UTF-16, where they would be very different. – Tom Blodget Jul 23 '19 at 22:55
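The point of the comment can be illustrated with a short Python 3 sketch: the Base64 result is text, and that text yields different bytes depending on which character encoding is applied to it afterwards. The sample input bytes are arbitrary.

```python
import base64

data = b"\x00\xff"

# Base64 output, treated as text.
b64_text = base64.b64encode(data).decode("ascii")

# The same text, character encoded three different ways.
as_ascii = b64_text.encode("ascii")
as_utf8 = b64_text.encode("utf-8")
as_utf16 = b64_text.encode("utf-16-le")

# Whatever encoding both sides agree on, the original data is recoverable.
recovered = base64.b64decode(as_utf16.decode("utf-16-le"))
```

For this input, `as_ascii` and `as_utf8` are byte-for-byte identical, while `as_utf16` is twice as long, yet `recovered` still equals `data`.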