I want to read a binary file into a vector of bytes, `vector<int8_t>`. The function shouldn't be aware of the file's encoding, language, etc.; I just want the raw bytes.

I read through the answers to this question and this excellent article, and since my files are not very large, the recommended solution seems to be:

std::ifstream in("file.txt", std::ios::binary);
auto ss = std::ostringstream{};
ss << in.rdbuf();
auto s = ss.str();

and then creating a vector from s.
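For concreteness, here is a minimal sketch of the whole approach as I understand it, including the final copy into the vector. The helper name `read_bytes` and the error handling are my own illustration and do not come from the linked article.

```cpp
#include <cstdint>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (the name is illustrative): slurp a file through an
// ostringstream, then copy the chars into a vector<int8_t>.
std::vector<std::int8_t> read_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    std::ostringstream ss;
    ss << in.rdbuf();                 // copy the whole stream buffer, unformatted
    const std::string s = ss.str();

    // Each char is converted to int8_t element by element.
    return std::vector<std::int8_t>(s.begin(), s.end());
}
```

Calling `read_bytes("file.txt")` would then give me the vector I am after, assuming the stream layer leaves the bytes untouched.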

However, I don't like the idea of reading my raw bytes through an ostringstream, which is locale-aware, especially since its default locale depends on the machine's settings.

Are my concerns legit? Will this solution produce different results on different machines with different global locales? Should I call ss.imbue(std::locale::classic()); first? How do I guarantee that the ostringstream won't mess with the content? Or does opening the file in binary mode already guarantee that?
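To make the `imbue` idea concrete, this is roughly what I have in mind: pin both streams to the classic "C" locale before any I/O happens. It is only a defensive sketch. The function name `read_bytes_classic` is made up, and whether the `imbue` calls are actually required is exactly what I am asking.

```cpp
#include <cstdint>
#include <fstream>
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical variant (the name is illustrative): same approach, but with
// both streams explicitly imbued with the classic "C" locale.
std::vector<std::int8_t> read_bytes_classic(const std::string& path) {
    std::ifstream in;
    in.imbue(std::locale::classic());    // set the locale before the file is opened
    in.open(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }

    std::ostringstream ss;
    ss.imbue(std::locale::classic());    // same for the intermediate string stream
    ss << in.rdbuf();                    // copy the whole stream buffer, unformatted
    const std::string s = ss.str();

    return std::vector<std::int8_t>(s.begin(), s.end());
}
```

A round-trip check I could run on each machine is to write all 256 byte values to a file in binary mode, read them back with this function, and compare.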

Valentin
  • Why use a `stringstream` for a binary file? Try using `std::vector` with an `istream` iterator. – Thomas Matthews Feb 01 '18 at 17:42
  • @ThomasMatthews this is listed as "Bad idea #1" in the article I linked due to its terrible performance – Valentin Feb 01 '18 at 17:44
  • If you are looking for performance, get the size of your file, then dynamically create an array and use `std::istream::read` to read the entire file into the array. Otherwise search the internet for "memory mapped files" and include your platform name. – Thomas Matthews Feb 01 '18 at 18:13
  • @ThomasMatthews this is listed as "Bad idea #2" in the article I linked due to the UB it may cause:) Yes, I am looking for performance. It's on Windows. – Valentin Feb 01 '18 at 18:18
  • @Valentin http://en.cppreference.com/w/cpp/io/basic_streambuf says that it does use the locale: "Typical implementation of the std::basic_streambuf base class holds only the six CharT* pointers and a copy of std::locale as data members. In addition, implementations may keep cached copies of locale facets, which are invalidated whenever imbue() is called. The concrete buffers such as std::basic_filebuf or std::basic_stringbuf are derived from std::basic_streambuf." – Petr Feb 03 '18 at 16:37

0 Answers