Why is istream/ostream slow

Question

At 50:40 of http://channel9.msdn.com/Events/GoingNative/2013/Writing-Quick-Code-in-Cpp-Quickly Andrei Alexandrescu makes a joke about how inefficient (slow) istream is.

I had an issue in the past where ostream was slow and fwrite was significantly faster (it saved many seconds on a single run of the main loop), but I never understood why, nor looked into it.

What makes istream and ostream slow in C++? Or at least slow compared to other things (like fread/fget, fwrite) that would satisfy the same needs equally well?

IIRC the C++ streams have to sync with the C i/o "constructs" if you will (for compatibility reasons). I believe you can make them faster by turning that syncing off (granted you'll have to refrain from doing things like printf afterwards) — Borgleader, Sep 08 '13 at 21:23
@Borgleader: What C "constructs" would ostream sync to (it was a file output stream not std::out) and why is it slower than C fwrite? — , Sep 08 '13 at 21:25
Take a look at this answer: http://stackoverflow.com/a/9371717/583833 — Borgleader, Sep 08 '13 at 21:30
Related: http://stackoverflow.com/questions/4340396/does-the-c-standard-mandate-poor-performance-for-iostreams-or-am-i-just-deali — Ben Voigt, Sep 08 '13 at 21:50
possible duplicate of [Why is reading lines from stdin much slower in C++ than Python?](http://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python) — 7hi4g0, Mar 07 '15 at 19:28

score 50 · Accepted Answer · edited Jul 25 '15 at 16:06

Actually, IOStreams don't have to be slow! It is a matter of implementing them in a reasonable way to make them fast, though. Most standard C++ libraries don't seem to pay much attention to implementing IOStreams efficiently. A long time ago, when my CXXRT was still maintained, it was about as fast as stdio - when used correctly!

Note that there are a few performance traps laid out for users of IOStreams, however. The following guidelines apply to all IOStream implementations but especially to those which are tailored to be fast:

When using std::cin, std::cout, etc. you need to call std::ios_base::sync_with_stdio(false)! Without this call, any use of the standard stream objects is required to synchronize with C's standard streams. Of course, when using std::ios_base::sync_with_stdio(false) it is assumed that you don't mix std::cin with stdin, std::cout with stdout, etc.
Do not use std::endl as it mandates many unnecessary flushes of any buffer. Likewise, don't set std::ios_base::unitbuf or use std::flush unnecessarily.
When creating your own stream buffers (OK, few users do), make sure they do use an internal buffer! Processing individual characters jumps through multiple conditions and a virtual function which makes it hideously slow.
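A minimal sketch of the first two guidelines, for illustration only. The helper names `setup_fast_stdio` and `write_numbers` are made up for this example and are not part of any library:

```cpp
#include <iostream>
#include <ostream>
#include <sstream>

// Hypothetical helper: call once, before any I/O, and then do not mix
// std::cout with printf (or std::cin with scanf) afterwards.
void setup_fast_stdio() {
    std::ios_base::sync_with_stdio(false);
}

// Hypothetical helper: writes 0..n-1, one number per line.
// '\n' only appends a newline character; std::endl would additionally
// flush the buffer after every line.
void write_numbers(std::ostream &out, int n) {
    for (int i = 0; i < n; ++i)
        out << i << '\n';
    out.flush();  // a single explicit flush at the end, if one is needed at all
}
```

You would call `setup_fast_stdio()` at the top of `main` and then pass `std::cout` (or any other `std::ostream`) to `write_numbers`.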

+1 For pointing out that it's mostly a problem with the implementation, not the library itself. Efficient iostreams implementation is also one of the main concerns in the [C++ Performance Report](http://www.open-std.org/jtc1/sc22/wg21/docs/18015.html) published by the ISO committee in 2006. — ComicSansMS, Sep 09 '13 at 07:18
@ComicSansMS: As it happens, much of the material on performance of IOStreams is based on my contributions :-) (the contributions are not attributed to their respective authors; the contributors are listed on page 6, however). — Dietmar Kühl, Sep 09 '13 at 08:52

vitaut · Answer 2 · 2020-12-16T15:20:28.390

There are several reasons why [i]ostreams are slow by design:

Shared formatting state: every formatted output operation has to check all formatting state that might have been previously mutated by I/O manipulators. For this reason iostreams are inherently slower than printf-like APIs (especially with format string compilation like in Rust or {fmt} that avoid parsing overhead) where all formatting information is local.
Uncontrolled use of locales: all formatting goes through an inefficient locale layer even if you don't want this, for example when writing a JSON file. See N4412: Shortcomings of iostreams.
Inefficient codegen: formatting a message with iostreams normally consists of multiple function calls because arguments and I/O manipulators are interleaved with parts of the message. For example, there are three function calls (godbolt) in
```
std::cout << "The answer is " << answer << ".\n";
```
compared to just one (godbolt) in the equivalent printf call:
```
printf("The answer is %d.\n", answer);
```
Extra buffering and synchronization. This can be disabled with sync_with_stdio(false) at the cost of poor interoperability with other I/O facilities.
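To make the "shared formatting state" point concrete, here is a small sketch (the function names are invented for this example). A manipulator such as std::hex mutates the stream itself, so every later insertion has to consult that state, whereas std::setw is reset after a single insertion:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// std::hex is "sticky": it stays in the stream's state and therefore
// also applies to the second 255.
std::string sticky_hex() {
    std::ostringstream os;
    os << std::hex << 255 << ' ' << 255;
    return os.str();  // "ff ff"
}

// std::setw applies to the next insertion only; the field width is
// reset to 0 afterwards, so the second 7 is written unpadded.
std::string one_shot_width() {
    std::ostringstream os;
    os << std::setw(4) << 7 << 7;
    return os.str();  // "   77"
}
```

With a printf-style call, the equivalent information ("%x", "%4d") lives in the format string of that one call and nothing carries over to the next call.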

Jerry Coffin · Answer 3 · 2017-07-12T00:44:21.903

Perhaps this can give some idea of what you're dealing with:

```
#include <stdio.h>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <fstream>
#include <time.h>
#include <string>
#include <algorithm>

unsigned count1(FILE *infile, char c) {
    int ch;
    unsigned count = 0;

    while (EOF != (ch=getc(infile)))
        if (ch == c)
            ++count;
    return count;
}

unsigned int count2(FILE *infile, char c) {
    static char buffer[8192];
    int size;
    unsigned int count = 0;

    while (0 < (size = fread(buffer, 1, sizeof(buffer), infile)))
        for (int i=0; i<size; i++)
            if (buffer[i] == c)
                ++count;
    return count;
}

unsigned count3(std::istream &infile, char c) {
    return std::count(std::istreambuf_iterator<char>(infile),
                    std::istreambuf_iterator<char>(), c);
}

unsigned count4(std::istream &infile, char c) {
    return std::count(std::istream_iterator<char>(infile),
                    std::istream_iterator<char>(), c);
}

unsigned int count5(std::istream &infile, char c) {
    static char buffer[8192];
    unsigned int count = 0;

    while (infile.read(buffer, sizeof(buffer)))
        count += std::count(buffer, buffer+infile.gcount(), c);
    count += std::count(buffer, buffer+infile.gcount(), c);
    return count;
}

unsigned count6(std::istream &infile, char c) {
    unsigned int count = 0;
    char ch;

    while (infile >> ch)
        if (ch == c)
            ++count;
    return count;
}

template <class F, class T>
void timer(F f, T &t, std::string const &title) {
    unsigned count;
    clock_t start = clock();
    count = f(t, 'N');
    clock_t stop = clock();
    std::cout << std::left << std::setw(30) << title << "\tCount: " << count;
    std::cout << "\tTime: " << double(stop-start)/CLOCKS_PER_SEC << "\n";
}

int main() {
    char const *name = "equivs2.txt";

    FILE *infile=fopen(name, "r");

    timer(count1, infile, "ignore");

    rewind(infile);
    timer(count1, infile, "using getc");

    rewind(infile);
    timer(count2, infile, "using fread");

    fclose(infile);

    std::ifstream in2(name);
    timer(count3, in2, "ignore");

    in2.clear();
    in2.seekg(0);
    timer(count3, in2, "using streambuf iterators");

    in2.clear();
    in2.seekg(0);
    timer(count4, in2, "using stream iterators");

    in2.clear();
    in2.seekg(0);
    timer(count5, in2, "using istream::read");

    in2.clear();
    in2.seekg(0);
    timer(count6, in2, "using operator>>");

    return 0;
}
```

Running this, I get results like this (with MS VC++):

```
ignore                          Count: 1300     Time: 0.309
using getc                      Count: 1300     Time: 0.308
using fread                     Count: 1300     Time: 0.028
ignore                          Count: 1300     Time: 0.091
using streambuf iterators       Count: 1300     Time: 0.091
using stream iterators          Count: 1300     Time: 0.613
using istream::read             Count: 1300     Time: 0.028
using operator>>                Count: 1300     Time: 0.619
```

and this (with MinGW):

```
ignore                          Count: 1300     Time: 0.052
using getc                      Count: 1300     Time: 0.044
using fread                     Count: 1300     Time: 0.036
ignore                          Count: 1300     Time: 0.068
using streambuf iterators       Count: 1300     Time: 0.068
using stream iterators          Count: 1300     Time: 0.131
using istream::read             Count: 1300     Time: 0.037
using operator>>                Count: 1300     Time: 0.121
```

As we can see in the results, it's not really a matter of iostreams being categorically slow. Rather, a great deal depends on exactly how you use iostreams (and to a lesser extent FILE * as well). There's also a pretty substantial variation just between these two implementations.

Nonetheless, the fastest versions with each (fread and istream::read) are essentially tied. With VC++, getc is quite a bit slower than either istream::read or istreambuf_iterator.

Bottom line: getting good performance from iostreams requires a little more care than with FILE * -- but it's certainly possible. They also give you more options: convenience when you don't care all that much about speed, and performance directly competitive with the best you can get from C-style I/O, with a little extra work.
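As an aside on the "little more care" part: count5 above has to repeat the std::count call after the loop, because istream::read reports failure on the final, partial chunk even though it did deliver characters. One possible way to fold that into the loop condition is sketched below. The name `count_chunked` is made up for this example, and it is not one of the variants that was timed:

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>

// Sketch: chunked counting with the tail handled inside the loop.
unsigned count_chunked(std::istream &infile, char c) {
    char buffer[8192];
    unsigned count = 0;

    // read() sets failbit when it hits EOF before filling the buffer, but
    // gcount() still says how many characters arrived. A further read() on
    // the failed stream extracts nothing and leaves gcount() at 0, which
    // ends the loop.
    while (infile.read(buffer, sizeof(buffer)) || infile.gcount() > 0)
        count += static_cast<unsigned>(
            std::count(buffer, buffer + infile.gcount(), c));
    return count;
}
```

It takes any `std::istream`, so it can be dropped into the timer harness the same way as count5.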

Since my [edit](http://stackoverflow.com/review/suggested-edits/2929159) got rejected: your `istream::read`-version has a bug. The last chunk of characters isn’t checked, [see here](http://coliru.stacked-crooked.com/a/8376d576889f7628). — Darklighter, Sep 15 '13 at 16:01
Handy. Also, if you copy count6 to a new count7 with "while (infile.get(ch))", you'll see that it is twice as fast as operator>> but still twice as slow as getc. — Nick Westgate, Jul 12 '17 at 00:36
@NickWestgate: Yeah--no matter how many I add, there are at least three more that could be added. If (for example) another method were faster than anything else, I'd probably add it--but another that's more or less in the middle of the pack just doesn't seem like it's worth bothering... — Jerry Coffin, Jul 12 '17 at 00:48
Well it would be useful for those (like me) who are comparing the current state of some code to the other options. I'm pretty disappointed that istream::get spends a lot of time entering and exiting critical sections in some single-threaded code I maintain. ; - ) Anyway, thanks for the handy test suite. — Nick Westgate, Jul 12 '17 at 00:59
File I/O is inherently noisy on Windows and probably Linux as well due to caching. — gast128, Jul 14 '20 at 10:12
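For reference, the count7 variant suggested in the comments would presumably look something like this (a sketch reconstructed from the comment's description, not code posted in the thread):

```cpp
#include <istream>
#include <sstream>

// Same shape as count6, but using unformatted input. Unlike operator>>,
// get() does not skip whitespace, so it does less work per character.
unsigned count7(std::istream &infile, char c) {
    unsigned count = 0;
    char ch;

    while (infile.get(ch))
        if (ch == c)
            ++count;
    return count;
}
```

When counting a non-whitespace character such as 'N', count6 and count7 return the same result; they differ only in whether whitespace is skipped or read and compared.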

score 1 · Answer 4 · answered Mar 14 '18 at 19:58

While this question is quite old, I'm amazed nobody has mentioned iostream object construction.

That is, whenever you create an STL iostream (or another stream variant), if you step into the code, the constructor calls an internal Init function. In there, operator new is called to create a new locale object. Likewise, that object is destroyed when the stream is destroyed.

This is hideous, IMHO. And certainly contributes to slow object construction/destruction, because memory is being allocated/deallocated using a system lock, at some point.

Further, some of the STL streams allow you to specify an allocator, so why is the locale created NOT using the specified allocator?

Using streams in a multithreaded environment, you could also imagine the bottleneck imposed by calling operator new every time a new stream object is constructed.

Hideous mess if you ask me, as I am finding out myself right now!
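If stream construction really does show up in a profile (worth measuring first), one common workaround is to construct the stream once and reuse it. A minimal sketch, where `format_all` is a hypothetical helper invented for this example:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Formats each value as "value=<n>" using a single reused stream.
std::vector<std::string> format_all(const std::vector<int> &values) {
    std::vector<std::string> out;
    out.reserve(values.size());

    std::ostringstream os;  // constructed once, outside the loop
    for (int v : values) {
        os.str("");         // discard the previous contents
        os.clear();         // reset any error flags
        os << "value=" << v;
        out.push_back(os.str());
    }
    return out;
}
```

Whether this helps depends on the implementation; it only avoids the per-object setup and teardown described above, not the cost of formatting itself.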

[Karl Knechtel](https://stackoverflow.com/users/523612/karl-knechtel) says [here](https://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python?noredirect=1&lq=1#comment11835795_9371717): _"(...) This task is almost certainly I/O bound and there is way too much FUD going around about the cost of creating std::string objects in C++ or using `<iostream>` in and of itself."_ — Marc.2377, Mar 16 '18 at 04:36

score 0 · Answer 5 · edited Apr 12 '19 at 07:08

On a similar topic, STL says: "You can call setvbuf() to enable buffering on stdout."

https://web.archive.org/web/20170329163751/https://connect.microsoft.com/VisualStudio/feedback/details/642876/std-wcout-is-ten-times-slower-than-wprintf-performance-bug-in-c-library
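A sketch of what that setvbuf() suggestion might look like in practice. The wrapper name `enable_stdout_buffering` and the buffer size in the usage note are assumptions made for this example, not values taken from the linked report:

```cpp
#include <cstdio>

// Hypothetical wrapper: switches stdout to full buffering. Passing nullptr
// lets the C library allocate a buffer of the requested size itself.
// Returns true on success. setvbuf should be called before any other
// operation is performed on the stream.
bool enable_stdout_buffering(std::size_t size) {
    return std::setvbuf(stdout, nullptr, _IOFBF, size) == 0;
}
```

For example, calling `enable_stdout_buffering(1 << 16)` as the first statement in `main` requests a 64 KiB buffer.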

Answered Sep 09 '13 at 01:03 by AndrewDover; edited Apr 12 '19 at 07:08 by Yusuf Tarık Günaydın