
I would like to check whether my system can handle the files my customers give me. Because customers may send files in an unknown encoding, I would like to create some test files to find out whether a file can be read without corruption.

My system runs on Node.js 12.x.

i read

I don't fully understand the differences between setting the encoding on writeFileSync:

fs.writeFileSync("ascii.xml", Buffer.from("hello"), {'encoding':'ascii'})

setting the encoding on toString:

fs.writeFileSync("ascii.xml", Buffer.from("hello").toString(encoding="ascii"))

setting the encoding on toString and also on the file:

fs.writeFileSync("ascii.xml", Buffer.from("hello").toString(encoding="ascii"), {'encoding':'ascii'})

setting the encoding on the Buffer as well, and so on:

fs.writeFileSync("ascii.xml", Buffer.from("hello",'ascii').toString(encoding="ascii"), {'encoding':'ascii'})

For example, I don't understand why this code

fs.writeFileSync("base64.xml", Buffer.from("string",'utf8').toString(encoding="base64"),{'encoding':'base64'})

produces a file containing the word "string"

but this

fs.writeFileSync("base64.xml", Buffer.from("string",'utf8').toString(encoding="base64"))

produces a file containing the Base64 representation of the word "string". Shouldn't the first one be a Base64-encoded file of a Base64-encoded string, generated from a buffer built from the same characters I put in the string?

Thanks for any advice.

Andrea Bisello

1 Answer


Let's start with the basics:

  • When you call Buffer.from("hello"), you map (encode) the string hello into bytes.
    (Note that the string is encoded using UTF8 by default, so the above expression is equivalent to Buffer.from("hello", "utf8").)
    The resultant buffer is:
    <Buffer 68 65 6c 6c 6f>.
  • When you call Buffer.from("hello").toString(), you map (encode) hello into bytes, and then map (decode) those bytes back into a string.
    Again, note that UTF8 is the default, so the above expression is equivalent to Buffer.from("hello").toString("utf8") as well as to Buffer.from("hello", "utf8").toString("utf8").
    The resultant string is unsurprisingly:
    "hello".
    However, change it to Buffer.from("hello").toString("base64"), and you'll get:
    aGVsbG8=.
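The two mappings above can be tried directly in Node. This is a minimal sketch rather than anything from the question; the variable names and the "\u00e9" (é) example are illustrative additions, included only to show a character whose bytes differ between encodings.

```javascript
// String -> bytes (encode). "utf8" is the default encoding for Buffer.from.
const bytes = Buffer.from("hello");            // same as Buffer.from("hello", "utf8")

// Bytes -> string (decode). "utf8" is also the default for toString.
const asUtf8 = bytes.toString();               // "hello"
const asBase64 = bytes.toString("base64");     // "aGVsbG8="
const asHex = bytes.toString("hex");           // "68656c6c6f"

// For plain ASCII text the choice of utf8/ascii/latin1 makes no visible difference.
// It only starts to matter with characters outside ASCII, such as é (U+00E9):
const eAcuteUtf8 = Buffer.from("\u00e9", "utf8").toString("hex");     // "c3a9" (two bytes)
const eAcuteLatin1 = Buffer.from("\u00e9", "latin1").toString("hex"); // "e9"   (one byte)

console.log(bytes, asUtf8, asBase64, asHex, eAcuteUtf8, eAcuteLatin1);
```

This is why test files containing only "hello" cannot tell most encodings apart: the test text needs at least one non-ASCII character.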

Now, let's move forward to fs.writeFileSync:

  • fs.writeFileSync(file, data[, options]) writes a file, that is: bytes. In other words, when data is a string, writeFileSync maps (encodes) that string into bytes, and then writes those bytes into the file.
    Again, the default encoding is UTF8, but when you pass some other encoding, data will be mapped (encoded) into bytes using this encoding.
    By the way, when you "open the file", be it using some text editor or even the debugger itself, you probably do it (without knowing) using UTF8 encoding.
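To see what writeFileSync actually puts on disk, it can help to read the file back as raw bytes instead of opening it in an editor. The sketch below assumes a throwaway temporary directory and made-up file names (utf8.txt, latin1.txt, buffer.txt); none of them come from the question. It also shows that the encoding option only applies when data is a string: when data is already a Buffer, the bytes are written unchanged.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch directory, so the demo does not touch real files.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "enc-demo-"));
const text = "caf\u00e9"; // "café", contains one non-ASCII character

// data is a string: it is encoded with the given encoding (utf8 by default).
fs.writeFileSync(path.join(dir, "utf8.txt"), text);
fs.writeFileSync(path.join(dir, "latin1.txt"), text, "latin1");

// Reading without an encoding returns the raw bytes as a Buffer.
const utf8Hex = fs.readFileSync(path.join(dir, "utf8.txt")).toString("hex");
const latin1Hex = fs.readFileSync(path.join(dir, "latin1.txt")).toString("hex");

// data is a Buffer: the encoding option is ignored, the bytes are written as they are.
fs.writeFileSync(path.join(dir, "buffer.txt"), Buffer.from(text, "utf8"), { encoding: "latin1" });
const bufferHex = fs.readFileSync(path.join(dir, "buffer.txt")).toString("hex");

console.log(utf8Hex);   // 636166c3a9
console.log(latin1Hex); // 636166e9
console.log(bufferHex); // 636166c3a9 (still UTF8 bytes, "latin1" had no effect)
```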

Hopefully, you can now understand that:

  • fs.writeFileSync("myFile.xml", Buffer.from("hello","utf8").toString("base64")) maps (encodes) hello into bytes (using UTF8 encoding), and then maps (decodes) those bytes into a string (using Base64 encoding), which is aGVsbG8=.
    This is what you see when you open the file created in your last snippet.

  • fs.writeFileSync("myFile.xml", Buffer.from("hello","utf8").toString("base64"), "base64") maps (encodes) hello into bytes (using UTF8 encoding), then maps (decodes) those bytes into a string (using Base64 encoding), then maps (encodes) this string into bytes (using Base64 encoding). This is what you do in your penultimate snippet.

    To illustrate, note that this series of "mappings" is equivalent to:

    var bytes = Buffer.from(Buffer.from("hello").toString("base64"), "base64")
    

    Which gives:
    <Buffer 68 65 6c 6c 6f>.

    Then, when you open the file, you (i.e., probably your text editor) read those bytes as UTF8 by default, which is equivalent to:

    bytes.toString()  // 'bytes' is defined above
    

    Which gives:
    "hello".
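The same reasoning can be checked against the two snippets from the question, which use the word "string" rather than "hello". This is only a sketch: the temporary directory and the file names (plain.xml, decoded.xml) are hypothetical, and reading the files back as UTF8 stands in for what a text editor would probably do.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch directory and file names for the demo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "b64-demo-"));
const plainFile = path.join(dir, "plain.xml");
const decodedFile = path.join(dir, "decoded.xml");

// UTF8 bytes of "string", decoded into a Base64 string.
const base64Text = Buffer.from("string", "utf8").toString("base64");

// No encoding option: the Base64 text itself is written (as UTF8 bytes).
fs.writeFileSync(plainFile, base64Text);

// encoding "base64": the Base64 text is mapped back into the original bytes before writing.
fs.writeFileSync(decodedFile, base64Text, { encoding: "base64" });

// Read both files the way an editor probably would: as UTF8 text.
const plainContent = fs.readFileSync(plainFile, "utf8");
const decodedContent = fs.readFileSync(decodedFile, "utf8");

console.log(base64Text);     // c3RyaW5n
console.log(plainContent);   // c3RyaW5n
console.log(decodedContent); // string
```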

OfirD