I have a binary file that packs many files together (something like a .tar), containing both binary and text files.

When I process in-memory strings, line endings are usually '\n', but when I read the text part from this packed file, I get "\r\n". Processing this text therefore gives me errors.

Here is the code for reading the text from a binary file:

FILE* _fileDescriptor;                        // it's always open to improve performance
fopen_s(&_fileDescriptor, _filePath.string().c_str(), "rb"); 

char* data = new char[size + 1];              // size is a known and correct value
fseek(_fileDescriptor, begin, SEEK_SET);      // begin is another known value, where the file starts inside the packed one
fread(data, sizeof(char), size, _fileDescriptor);
data[size] = '\0';                            // null-terminate so the buffer can be used as a C string

This puts the right text into data, but the following code fails when it reads an empty line:

istringstream ss(data);      // create a stringstream to process it in another function
delete[] data;               // free the data buffer

// start processing the file
string line;
getline(ss, line);           // read an empty line

if(line.size() > 0) {
    /*
     enters here, because the "empty" line was "\r\n", and now the value of line is '\r', therefore line.size() == 1
    */
    ...

So, any advice on how to avoid the '\r'?

I edited the file in Notepad++. Changing its configuration to use '\n' instead of '\r\n' as the line ending works, but I don't want to depend on this, because other people can miss it, and the problem would be very hard to spot if that happens.

danikaze

1 Answer

Probably easiest to trim the '\r' characters out of your string and then discard blank lines. See this answer for approaches to trimming a std::string (I'm assuming that's what 'line' is):

What's the best way to trim std::string?
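
As a rough illustration of the trimming approach, here is a minimal sketch that assumes C++11 (for `back()` and `pop_back()`) and only removes a single trailing '\r'. The names `getlineNoCR` and `nonEmptyLines` are made up for this example and are not part of the question's code:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line and drops a single trailing '\r',
// so lines ending in "\r\n" and lines ending in "\n" come out the same.
std::istream& getlineNoCR(std::istream& in, std::string& line) {
    if (std::getline(in, line) && !line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return in;
}

// Hypothetical usage: collect the non-empty lines of a text buffer.
std::vector<std::string> nonEmptyLines(const std::string& text) {
    std::istringstream ss(text);
    std::vector<std::string> result;
    std::string line;
    while (getlineNoCR(ss, line)) {
        if (!line.empty()) {
            result.push_back(line);  // "\r\n"-only lines are now empty and skipped
        }
    }
    return result;
}
```

With this in place, the `line.size() > 0` check from the question should behave the same whether the packed text uses '\n' or "\r\n".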

HerrJoebob
  • So, basically replacing '\r\n' with '\n' in real time. That's the obvious answer, and adding boost::trim_right(line); after the getline works, but I expected there to be something more efficient, or better... – danikaze Dec 19 '12 at 00:28
  • Not that I know of, at least if you need to handle truly binary data. Of course if your data is all printable, you could open it in text mode ("r" instead of "rb") and this would be handled for you. – HerrJoebob Dec 19 '12 at 00:31
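
If trimming on every getline feels wasteful, one possible alternative is to normalize the text entry once, right after reading it and before building the istringstream. This is only a sketch under the assumption that the entry is known to be text; `normalizeNewlines` is a hypothetical name, not an existing library function:

```cpp
#include <string>

// Hypothetical helper: rewrites every "\r\n" pair as "\n" in a single
// in-place pass. A lone '\r' that is not followed by '\n' is kept as-is.
void normalizeNewlines(std::string& text) {
    std::string::size_type out = 0;
    for (std::string::size_type in = 0; in < text.size(); ++in) {
        if (text[in] == '\r' && in + 1 < text.size() && text[in + 1] == '\n') {
            continue;  // skip the '\r'; the '\n' is copied on the next iteration
        }
        text[out++] = text[in];
    }
    text.resize(out);  // drop the leftover tail after compaction
}
```

The file can stay open in "rb" mode this way, so the byte offsets passed to fseek remain valid, and binary entries are untouched as long as the function is applied to text entries only.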