I'm working with C++ in Visual Studio 2010. I have an STL set, which I'm saving to file when my program shuts down. The next time the program starts up, I load the (sorted) data back into a set. I'm trying to optimize the loading process, and I'm having trouble. I suspect the problem is with frequent re-balancing, and I'm looking for a way to avoid that.
First, I did it with no optimization, using the plain single-argument overload, insert(const value_type& x).
Time: ~5.5 minutes
Then I tried using the version of insert() where you pass in a hint for the location of the insert():
iterator insert ( iterator position, const value_type& x );
Roughly, I did this:
set<int> My_Set;
set<int>::iterator It;
It = My_Set.insert (0).first; //insert(value) returns a pair<iterator, bool>
for (int I=1; I<1000; I++) {
    It = My_Set.insert (It, I); //Reuse the previous insertion's iterator as the hint
}
Time: ~5.4 minutes
Barely any improvement! I don't think the problem is overhead in reading from the file: commenting out the insert() calls reduces the time to about 2 seconds. I don't think it's overhead in copying my object either; it's a Plain Old Data object holding an int and a char.
The only thing I can think of is that the set is constantly re-balancing.
1.) Do you agree with my guess?
2.) Is there a way to "pause" the rebalancing while I load the set, and then rebalance once at the end? (Or... Would that even help?)
3.) Is there a smarter way to load the sorted data, i.e. not simply inserting from lowest to highest? Perhaps alternating my insertions so that it doesn't have to rebalance as often? (Example: Insert 1, 1000, 2, 999, 3, 998,...)