How can I make my C++ program for Trie structure construction faster?

Question

I'm using C++.

My program reads a text file of about 200 thousand lines and builds a Trie structure from it.

Can I save the Trie, or otherwise make building it faster than it is now? Here is the function that reads the data from the file and builds the structure.

void buildDictionary(pTrie* root, string name) {    
    wifstream r_dic;
    r_dic.imbue(locale("kor"));
    r_dic.open(name,ios::binary);
    if (r_dic.fail()) {
        cout << name << " open failed" << endl;
        exit(-1);
    }
    wchar_t wch[256];
    wstring p1, p2;
    while (r_dic >> wch >> p1 >> p2) {
        pTrie* pt = (*root).insert(splitJamo(wch).c_str(), p1+L' '+p2);
        pt->addArche(wch);
    }
    r_dic.close();
}

Below are results of a profiling run.

Unrelated, but helpful reading: https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong — user4581301, Feb 17 '18 at 07:33
You've shown us a bunch of file opening and reading code which is probably not very relevant, but have not shown us the actual code which matters: the `insert()` and `addArche()` functions. Want to make it faster? Profile the code, and show us the code which is slow. Also, tell us how long it takes to run now, and how long you want it to take. — John Zwinck, Feb 17 '18 at 07:33
This is wrong `r_dic.eof()` https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong — Ed Heal, Feb 17 '18 at 07:41
Are those results with a Debug build or optimized Release build? Speed tests on Debug builds are not pointless, but not as useful as speed tests run on optimized code. A decent optimizer can do amazing things for performance with minimal effort on your part. — user4581301, Feb 17 '18 at 07:43
Also unrelated to your question: please use the more idiomatic and readable `pt->addArche(wch)` instead of `(*pt).addArche(wch)` — Michael Burr, Feb 17 '18 at 07:47
It's important how the trie is implemented. Or the strings, or ... On modern desktop computers, cache misses are everything. It might also be possible to speed things up with multiple threads. Profile it. — Jive Dadson, Feb 17 '18 at 08:05
Where does the input file come from? Is it a file you write with another program and therefore could change the format? Do you read the same file many times (i.e. by running this program repeatedly)? — John Zwinck, Feb 17 '18 at 08:06

score 2 · Answer 1 · answered Feb 17 '18 at 08:17

Your profile output suggests that the first area to optimize is the file reading. Specifically:

wchar_t wch[256];
wstring p1, p2;
while (r_dic >> wch >> p1 >> p2) {
    pTrie* pt = (*root).insert(splitJamo(wch).c_str(), p1+L' '+p2);
    pt->addArche(wch);
}

This reads three strings on every iteration. wch is read into a character array and then passed to splitJamo(), which returns a wstring and therefore allocates memory. That might be a bit slow, but I can't tell because you haven't shown the code for splitJamo().

You read p1 and p2 and immediately concatenate them with a space. This is inefficient: they were separated by whitespace in the input file, and you read them separately, allocating memory for each, only to put them back together again.

Assuming the three strings appear on each line of the input file, I'd read it like this:

wchar_t wch[256];
wstring p1p2;
// std::ws skips the whitespace after wch, so p1p2 does not start with a space
while (r_dic >> wch >> std::ws && std::getline(r_dic, p1p2)) {
    pTrie* pt = root->insert(splitJamo(wch), p1p2);
    pt->addArche(wch);
}

This reads p1 and p2 together as the rest of the line, which saves one string read and the concatenation, so it should be an improvement. A further improvement might be to use getline() to read the entire line at once, but we can't tell without seeing the code for splitJamo() and insert().
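As a rough sketch of the whole-line idea, the following reads each line with getline() and splits it at the first run of blanks. The names Entry, splitLine and readEntries are hypothetical and not part of the question's code, and the sketch only collects the fields; the calls to splitJamo() and insert() would go where the entry is stored. It also assumes that the second and third fields can be kept exactly as they appear in the file, rather than being rejoined with a single space.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: the first field and the remainder of the line.
struct Entry {
    std::wstring key;
    std::wstring rest;
};

// Splits one line at the first run of spaces/tabs.
// Returns false when the line has no second field.
bool splitLine(const std::wstring& line, Entry& out) {
    const wchar_t* blanks = L" \t";
    const std::size_t keyBegin = line.find_first_not_of(blanks);
    if (keyBegin == std::wstring::npos) return false;   // empty or blank line
    const std::size_t keyEnd = line.find_first_of(blanks, keyBegin);
    if (keyEnd == std::wstring::npos) return false;     // key only
    const std::size_t restBegin = line.find_first_not_of(blanks, keyEnd);
    if (restBegin == std::wstring::npos) return false;  // only trailing blanks
    out.key.assign(line, keyBegin, keyEnd - keyBegin);
    out.rest.assign(line, restBegin, std::wstring::npos);
    return true;
}

// Reads every well-formed line from `in`.
std::vector<Entry> readEntries(std::wistream& in) {
    std::vector<Entry> entries;
    std::wstring line;  // declared once so its buffer can be reused per line
    Entry e;
    while (std::getline(in, line)) {
        // A file with CRLF line endings opened in binary mode keeps the '\r'.
        if (!line.empty() && line.back() == L'\r') line.pop_back();
        if (splitLine(line, e)) entries.push_back(e);
    }
    return entries;
}
```

Any std::wistream works as input, so a std::wistringstream can stand in for the dictionary file when trying this out. Whether it is actually faster than the formatted reads is something only a measurement on the real file can show.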

Also note I removed c_str() from the first argument to insert() because I assume it probably takes a wstring, so we avoid constructing a new one this way. But if it requires wchar_t*, you can put back the c_str().
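To illustrate the point about c_str(), here is a minimal sketch that assumes insert() takes its key as const std::wstring& (the question does not show its signature). The function boundWithoutCopy is a hypothetical stand-in for such a parameter: it reports whether the argument was bound directly or whether a temporary wstring had to be built for the call.

```cpp
#include <string>

// Hypothetical stand-in for a function taking const std::wstring&.
// Returns true when `key` is the very same object as `original`,
// meaning no temporary std::wstring was constructed for the call.
bool boundWithoutCopy(const std::wstring& key, const std::wstring& original) {
    return &key == &original;
}
```

Calling boundWithoutCopy(k, k) with a std::wstring k returns true, while boundWithoutCopy(k.c_str(), k) returns false: the const wchar_t* is converted to a new temporary wstring, which means an extra copy of the characters.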

@ikam: Did you end up resolving your problem? If this answer was helpful you may "accept" it by clicking the checkmark on the left. — John Zwinck, Feb 25 '18 at 01:51

score 1 · Answer 2 · answered Feb 17 '18 at 07:41

A general rule about software performance says: whatever you guessed to be the cause of your program's performance problems, you are wrong. Use a tool instead of guessing.

In performance optimization, the first tool to use is a profiler. Choose one, run your program under its control, then analyze the profiler's report on hotspots (ask on SO if the report is hard to grasp; that is expected). Form a hypothesis based on the profiler's data, change your program according to that hypothesis, then rerun and remeasure. Rinse and repeat until you are satisfied with the improvements.
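A profiler is the right tool for finding hotspots, but for the "rerun and remeasure" step a coarse timer is often enough to check whether a change helped. The following is a minimal sketch; elapsedMs is a hypothetical helper name, not something from the question's code.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used because it is monotonic (never adjusted backwards).
template <typename Work>
double elapsedMs(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, wrapping the call to buildDictionary() in a lambda and passing it to elapsedMs gives one number to compare before and after each change. Take the measurement on an optimized build and repeat it a few times, since a single run can be noisy.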

There are a number of profilers out there: integrated into IDEs (MS Visual Studio, maybe something in Xcode), integrated into OSes (Linux perf), or standalone (Intel VTune).

As far as I can tell, you suspect I/O to be the cause of the slowness, but you are very likely wrong. It may be inefficient memory allocation, locale transformations, overuse of string operations, and so on. Hard evidence from a profiler is the only safe way to make progress with optimization.

This is more of a comment- but I give you +1 nonetheless - as it would be a long comment — Ed Heal, Feb 17 '18 at 07:42
