Question:
When I run my code on several files to extract the contents, on around 20% of the files the first iteration sets both a and b to 255 (each bit is a 1), and the program exits immediately because this should not have happened. What error could be causing this?
Code:
ifstream file(file_name);
ofstream output(outfile_name);
vector<string> dictionary; //these are all properly initialized earlier
unsigned char a, b;
while(!file.eof())
{
    a = file.get();
    if(!file.eof()) b = file.get();
    else break;
    int id = b + 256*a;
    if(id < 0 || id > dictionary.size()) //this shouldn't happen
    {
        cout << a << ' ' << b << ' ' << id << endl;
        break;
    }
    output << dictionary[id] << ' ';
}
Background: I had several large files, and decided to iterate through all of them and generate a dictionary. Using the ordered dictionary, I assigned each word a unique ID (of two unsigned chars, a and b). I then replaced all words with their unique IDs, which is what file now is: a long sequence of unsigned chars, two for each word. Unfortunately, converting from the IDs back to the words does not seem to be working properly. Why are a and b being set to 255 when traversing certain files?
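
For reference, here is a minimal sketch of the two-byte ID scheme described above. The helper names encode_id and decode_id are hypothetical and are not taken from the actual program. The sketch assumes the high byte is written first, which matches the b + 256*a computation in the loop.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: split a word ID into two bytes, high byte first.
// Assumes the dictionary has at most 65536 entries, since that is all
// two bytes can address.
std::array<unsigned char, 2> encode_id(std::uint32_t id) {
    assert(id <= 0xFFFF);
    return {static_cast<unsigned char>(id >> 8),
            static_cast<unsigned char>(id & 0xFF)};
}

// Hypothetical helper: rebuild the ID the same way the reading loop does,
// with a as the high byte and b as the low byte.
std::uint32_t decode_id(unsigned char a, unsigned char b) {
    return b + 256u * a;
}
```

Under this scheme, a pair of bytes that are both 255 decodes to 65535, so the range check in the loop trips whenever the dictionary has fewer entries than that.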