
Question: When I run my code on several files to extract their contents, for around 20% of the files the first iteration sets both a and b to 255 (every bit is 1). The program then exits immediately, because an ID that large should never occur. What could be causing this?

Code:

    ifstream file(file_name);
    ofstream output(outfile_name);
    vector<string> dictionary; //these are all properly initialized earlier
    unsigned char a, b;

    while(!file.eof())
    {
        a = file.get();
        if(!file.eof()) b = file.get();
        else break;
        int id = b + 256*a;
        if(id < 0 || id > dictionary.size()) //this shouldn't happen
        {
            cout << a << ' ' << b << ' ' << id <<  endl;
            break;
        }
        output << dictionary[id] << ' ';
    }

Background: I had several large files, so I iterated through all of them and generated a dictionary. Using the ordered dictionary, I assigned each word a unique ID made of two unsigned chars, a and b. I then replaced every word with its ID, so file is now a long sequence of unsigned chars, two per word. Unfortunately, converting the IDs back to words does not seem to work properly. Why are a and b being set to 255 when traversing certain files?

TimD1