
My C++ knowledge is very limited, so my apologies beforehand: this may be a very simple question, but I haven't been able to find a solution.

I have a binary file that I'm trying to read. The code attempting to read the binary file is shown below:

string readFile2(const string &fileName)
{
    cout << "B1 \n";
    ifstream ifs(fileName.c_str(), ios::in | ios::binary | ios::ate);

    ifstream::pos_type fileSize = ifs.tellg();
    ifs.seekg(0, ios::beg);

    vector<char> bytes(fileSize);
    ifs.read(bytes.data(), fileSize);
    cout << bytes.data();
    cout << "\n";
    cout << fileSize;
    cout << "\n";
    return bytes.data();
    // return string(bytes.data(), fileSize);
}

The output of `cout << fileSize;` shows 744402 bytes, but when I print `bytes.data()` I only get the first 8 bytes, `LIZM 2.9`. I used a hexdump tool to look into the binary file and noticed that the 9th byte is a null character. A hex dump of the first 16 bytes with the corresponding ASCII characters is shown below:

 4C 49 5A 4D 20 32 2E 39  00 00 21 C4 00 00 00 00  LIZM 2.9 __!____

As you can see, `_` corresponds to the null character `00`. My question is: how do I read every byte instead of stopping at the null character?

Mark
  • You have to seek to the end of file before using `tellg()` to get file size. – Arty Mar 28 '21 at 18:37
  • Are you *sure* you aren't reading the whole file? I think your problem might not be where you think it is. – Beta Mar 28 '21 at 18:39
  • Likely duplicate: https://stackoverflow.com/questions/42874699/stdstring-stops-at-0 – Drew Dormann Mar 28 '21 at 18:40
  • @Eljay ["*ate: seek to the end of stream **immediately after open***"](https://en.cppreference.com/w/cpp/io/ios_base/openmode) - nothing conflicting/ambiguous about that. – Remy Lebeau Mar 28 '21 at 18:46
  • @Eljay also see [filebuf::open()](https://en.cppreference.com/w/cpp/io/basic_filebuf/open): "*If the open operation succeeds and `openmode & std::ios_base::ate != 0` (the `ate` bit is set), repositions the file position to the end of file, as if by calling `std::fseek(file, 0, SEEK_END)`, where file is the pointer returned by calling `fopen`. If the repositioning fails, calls `close()` and returns a null pointer to indicate failure.*" – Remy Lebeau Mar 28 '21 at 18:50

2 Answers


The problem is not with how you are reading the file, but with how you are outputting its data. You are treating the data as if it were a null-terminated `char*` string, so output stops at the first null byte encountered. The same applies to `return bytes.data();`, which builds the returned `string` from a `char*` and truncates it at the same place. Non-textual binary files tend to have a lot of `0x00` bytes in them.
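To see the difference in isolation, here is a minimal sketch that builds a `std::string` from the same buffer both ways. The helper names `fromCString`, `fromBuffer`, and `sampleHeader` are made up for illustration; the sample data is the first 16 bytes from the hex dump in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: treats the buffer as a C string, so it stops at the
// first 0x00. It needs at least one 0x00 in the buffer, otherwise it would
// read past the end.
std::string fromCString(const std::vector<char>& bytes)
{
    return std::string(bytes.data());
}

// Hypothetical helper: copies exactly bytes.size() bytes, embedded 0x00 included.
std::string fromBuffer(const std::vector<char>& bytes)
{
    return std::string(bytes.data(), bytes.size());
}

// First 16 bytes of the file, taken from the hex dump in the question.
std::vector<char> sampleHeader()
{
    return {'L', 'I', 'Z', 'M', ' ', '2', '.', '9',
            '\0', '\0', '!', '\xC4', '\0', '\0', '\0', '\0'};
}
```

With this data, `fromCString(sampleHeader()).size()` is 8, while `fromBuffer(sampleHeader()).size()` is 16.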

Replace this:

cout << bytes.data();

With this:

cout.write(bytes.data(), fileSize);

And replace this:

return bytes.data();

With this:

return string(bytes.data(), fileSize);

Alternatively, see "How do I read an entire file into a std::string in C++?".
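One common approach along those lines constructs the string directly from stream iterators. The sketch below assumes the file fits comfortably in memory. The function name `readWholeFile` and the choice to throw on failure are illustrative assumptions, not taken from the linked question.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Illustrative helper: reads every byte of the file, including 0x00 bytes.
std::string readWholeFile(const std::string& fileName)
{
    std::ifstream ifs(fileName, std::ios::in | std::ios::binary);
    if (!ifs) {
        throw std::runtime_error("cannot open " + fileName);
    }
    // istreambuf_iterator reads raw characters and does not skip whitespace.
    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
}
```

The returned string's `size()` is the number of bytes read, so no separate `fileSize` bookkeeping is needed.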

Remy Lebeau
  • Thanks, that solves the question, but what if I am trying to save the binary data into a database that takes the type "BLOB"? It has to be in binary format so when I return a string instead of a binary data type postgres complains – Mark Mar 28 '21 at 18:50
  • @Mark simply make sure you take the binary's size into account when populating the blob with the binary data. Don't just rely on a pointer to the binary data alone. It doesn't really matter if you store the binary data in a string or a vector or whatever. That is just a matter of memory management. – Remy Lebeau Mar 28 '21 at 18:52
  • could you provide a little more detail? I'm still quite new to this. Thank you. – Mark Mar 28 '21 at 18:55
  • @Mark not without seeing the postgres code you are having trouble with. – Remy Lebeau Mar 28 '21 at 18:56
  • I will create a new question and link it here. Thank you again. – Mark Mar 28 '21 at 19:01
  • https://stackoverflow.com/questions/66845340/how-do-save-the-entire-content-of-a-binary-file-into-postgres-database – Mark Mar 28 '21 at 19:24

I suggest seeking explicitly to the end of the file instead of using `ios::ate`, as I did in the code below.

Also notice that your data contains zero bytes, which means that your string will be truncated at the first zero byte, both when writing it to `cout` and when returning it from the function. You have to construct the returned string explicitly from the data and the file size, as you did in the commented-out last line of your code.


#include <iostream>
#include <fstream>
#include <string>
#include <vector>

using namespace std;

string readFile2(const string &fileName)
{
    ifstream ifs(fileName.c_str(), ios::in | ios::binary);

    ifs.seekg(0, ios::end);
    ifstream::pos_type fileSize = ifs.tellg();
    ifs.seekg(0, ios::beg);

    vector<char> bytes(fileSize);
    ifs.read(bytes.data(), fileSize);
    cout.write(bytes.data(), fileSize);
    cout << "\n";
    cout << fileSize;
    cout << "\n";
    return string(bytes.data(), fileSize);
}

int main() {
    readFile2("readme.txt");
}

Input:

Hello, World!
Hello, again!

Output:

Hello, World!
Hello, again!

28
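
Neither version of `readFile2` checks whether the file opened or whether the read succeeded. A more defensive variant could look like the sketch below. The name `readFileChecked` and the choice to throw `std::runtime_error` are assumptions for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Illustrative variant of readFile2 that reports failures instead of ignoring them.
std::string readFileChecked(const std::string& fileName)
{
    std::ifstream ifs(fileName, std::ios::in | std::ios::binary);
    if (!ifs) {
        throw std::runtime_error("cannot open " + fileName);
    }

    ifs.seekg(0, std::ios::end);
    const std::streamoff fileSize = ifs.tellg();  // -1 if the position is unavailable
    if (fileSize < 0) {
        throw std::runtime_error("cannot determine size of " + fileName);
    }
    ifs.seekg(0, std::ios::beg);

    // Size the string up front, then read straight into its buffer.
    std::string bytes(static_cast<std::size_t>(fileSize), '\0');
    if (fileSize > 0 && !ifs.read(&bytes[0], fileSize)) {
        throw std::runtime_error("short read from " + fileName);
    }
    return bytes;
}
```

Because the string is created with an explicit length, embedded zero bytes survive, and a missing file produces an exception rather than an empty or garbage result.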
Arty