
A comment by James Kanze on "How to copy a .txt file to a char array in c++" suggests that, to be sure a std::string receives the exact binary contents of a file when it is constructed from stream buffer iterators, one would have to both:

  • open the file in binary mode,
  • imbue the stream with the "C" locale.

In code, I'm guessing that means:

std::ifstream in(filename, std::ios_base::binary);
in.imbue(std::locale("C"));

Is that really necessary? More specifically, why would the locale have any impact when the file is opened in binary mode?

Note that what I am trying to do is more or less what the above mentioned question was about:

std::string contents((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());  // extra parentheses avoid the most vexing parse
nilo
  • You should do some research on what [`ios_base::binary`](http://en.cppreference.com/w/cpp/io/c#Binary_and_text_modes) does exactly. It's more or less only relevant on Windows. – πάντα ῥεῖ May 13 '17 at 07:57
  • @πάντα, quoting your link: "a binary stream is an ordered sequence of characters that can transparently record internal data. Data read in from a binary stream always equals to the data that were earlier written out to that stream". Do you mean that imbuing the locale has no impact in this case (which was the point of my - apparently stupid - question...)? – nilo May 13 '17 at 08:18
  • Seems it could be relevant in certain cases: http://stackoverflow.com/a/208431/1413395 – πάντα ῥεῖ May 13 '17 at 08:25
  • @πάντα: thanks for the additional link, but it relates to writing wide characters to a file using a binary stream. I do not see how it applies to my case. I just want the bytes in my file to end up in my string without transformation. Again, my basic assumption is that regardless of locale, opening the file in binary mode would achieve just that. I would be happy to hear actual arguments against that assumption, if there are any. James Kanze made it sound like there were some. – nilo May 13 '17 at 08:44
  • I'm not entirely convinced that Kanze is entirely correct in that comment. The locale's `char -> char` conversion is supposed to be the identity conversion. – molbdnilo May 13 '17 at 09:01
  • @molbdnilo: you on the other hand, make it sound like neither the binary mode, nor the locale imbuing has an impact on the resulting string. Am I correct in that this is what you mean? – nilo May 13 '17 at 09:24
  • @nilo On today's popular platforms, binary vs text only matters on Windows. I don't *think* the locale matters so long as you read `char`s (and don't use formatted extraction, of course), but I'm uncertain enough to not want to commit to an answer. – molbdnilo May 13 '17 at 09:35
  • Some additional thoughts: since the text file is part of my code base, and supposed to contain standard ASCII only (GLSL code), the cleanest solution for my case is probably to open the file in text mode with the "C" locale. I would however still be curious to get an answer to my question. – nilo May 13 '17 at 09:42
  • And also: why would this question be downvoted? I _did_ do some research before posting it, and the discussion here shows that the answer is _not_ trivial. – nilo May 13 '17 at 09:47
  • Anyway, thanks a lot molbdnilo for your insights. – nilo May 13 '17 at 10:51
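
The identity-conversion point raised by molbdnilo can be checked directly. The following is a minimal sketch, not part of the original thread: the helper name `char_conversion_is_identity` is hypothetical, and it only asks the locale's `std::codecvt<char, char, std::mbstate_t>` facet (the one a `char` file stream buffer uses) whether it ever converts.

```cpp
#include <cwchar>   // std::mbstate_t
#include <locale>

// Hypothetical helper (not from the thread): reports whether the
// char -> char code conversion facet of `loc` never converts, i.e.
// whether a char file buffer using it passes bytes through unchanged.
bool char_conversion_is_identity(const std::locale& loc) {
    using facet_t = std::codecvt<char, char, std::mbstate_t>;
    return std::use_facet<facet_t>(loc).always_noconv();
}
```

Calling it with `std::locale::classic()` or `std::locale("C")` should report that no conversion takes place. This says nothing about text versus binary mode, which is handled below the locale layer.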

1 Answer


Based on binary and text modes:

A binary stream is an ordered sequence of characters that can transparently record internal data. Data read in from a binary stream always equals to the data that were earlier written out to that stream. Implementations are only allowed to append a number of null characters to the end of the stream.

I think

std::ifstream in(filename, std::ios_base::binary);

together with:

in.imbue(std::locale("C"));

does not make sense.

Either the stream is in binary mode and the locale does not apply, or the programmer chooses to set the locale, which implicitly means that the stream is meant to be opened in text mode (std::ios_base::binary should then not be passed to the stream constructor). In that case, the data read may or may not equal the data in the file, depending on the OS and the contents of the file.
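
One way to test this on a given platform is a round trip. The sketch below is an illustration under stated assumptions, not a proof: the helper names (`write_bytes`, `read_bytes`, `user_locale_or_classic`, `round_trip_matches`) and the sample bytes are made up for the example. It writes bytes that text mode might alter (CR LF, Ctrl-Z, NUL), reads them back in binary mode once with the "C" locale and once with the user's preferred locale, and compares both results with the original.

```cpp
#include <cstdio>      // std::remove
#include <fstream>
#include <iterator>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes `bytes` to `path` in binary mode.
bool write_bytes(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios_base::binary | std::ios_base::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}

// Hypothetical helper: reads the whole file in binary mode after
// imbuing the stream with `loc`.
std::string read_bytes(const std::string& path, const std::locale& loc) {
    std::ifstream in(path, std::ios_base::binary);
    in.imbue(loc);
    // Extra parentheses avoid the most vexing parse.
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}

// The user's preferred locale, or "C" if it cannot be constructed.
std::locale user_locale_or_classic() {
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Returns true if both reads give back exactly the bytes written.
bool round_trip_matches(const std::string& path) {
    const char raw[] = {'a', '\r', '\n', 'b', '\n', '\x1A', '\0', 'c'};
    const std::string original(raw, sizeof raw);
    if (!write_bytes(path, original)) {
        return false;
    }
    const std::string with_c = read_bytes(path, std::locale::classic());
    const std::string with_user = read_bytes(path, user_locale_or_classic());
    std::remove(path.c_str());  // clean up the temporary file
    return with_c == original && with_user == original;
}
```

Note that a stream which is never imbued uses a copy of the global locale, which is "C" unless the program has called `std::locale::global`, so the explicit `imbue` only matters when the global locale has been changed.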

nilo