I'm reading huge XML files in C++ with rapidxml and trying to optimize the reading, because that part consumes most of the time (I've measured it with std::chrono).
For example, with an XML file of around 40 MB, the actual parsing by rapidxml takes only about 2300 milliseconds (which is absolutely fine), but copying the file from my std::ifstream into a buffer takes around 30000 milliseconds. I wonder whether the bottleneck is the speed of my HDD or whether there is anything I could do to speed up the buffer copy.
std::ifstream file(filename);
if(!file){ // comparing a stream to nullptr does not compile since C++11; test the stream state instead
    throw std::runtime_error("File "+filename+" not found!");
}
rapidxml::xml_document<> doc;
// Reads the file one character at a time through the stream iterators
std::vector<char> buffer((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
buffer.push_back('\0'); // rapidxml expects a null-terminated buffer
doc.parse<0>(&buffer[0]);
rapidxml::xml_node<>* root = doc.first_node();
The problem is the line: std::vector<char> buffer((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
which takes about 30 seconds for a 40 MB file.
Any ideas how I could optimize the reading here?