63

I'd like to read the whole content of a text file into a std::string object with C++.

With Python, I can write:

text = open("text.txt", "rt").read()

It is very simple and elegant. I hate ugly stuff, so I'd like to know - what is the most elegant way to read a text file with C++? Thanks.

silkfire
Fang-Pen Lin

5 Answers

133

There are many ways; pick whichever you find most elegant.

Reading into char*:

ifstream file ("file.txt", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
    // ios::ate opens the file positioned at its end, so tellg() gives the size
    streamsize size = file.tellg();
    char *contents = new char [size];   // raw bytes, not null-terminated
    file.seekg (0, ios::beg);
    file.read (contents, size);
    file.close();
    //... do something with it
    delete [] contents;
}

Into std::string:

std::ifstream in("file.txt");
std::string contents((std::istreambuf_iterator<char>(in)), 
    std::istreambuf_iterator<char>());

Into vector<char>:

std::ifstream in("file.txt");
std::vector<char> contents((std::istreambuf_iterator<char>(in)),
    std::istreambuf_iterator<char>());

Into string, using stringstream:

std::ifstream in("file.txt");
std::stringstream buffer;
buffer << in.rdbuf();
std::string contents(buffer.str());

file.txt is just an example. Everything works for binary files as well; just make sure you pass ios::binary to the ifstream constructor.
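As a rough sketch, the two std::string variants above can be wrapped in small helpers that accept any input stream. The function names are made up for illustration and are not part of the original answer.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: copy everything left in `in` via stream-buffer iterators.
std::string read_with_iterators(std::istream& in) {
    // The extra parentheses around the first argument stop the compiler from
    // parsing this line as a function declaration.
    std::string contents((std::istreambuf_iterator<char>(in)),
                         std::istreambuf_iterator<char>());
    return contents;
}

// Hypothetical helper: copy everything left in `in` through a string stream.
std::string read_with_stringstream(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();  // streams the whole underlying buffer in one call
    return buffer.str();
}
```

Either helper can be handed an open std::ifstream, for example `std::ifstream in("file.txt"); std::string text = read_with_iterators(in);`.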

Nicol Bolas
Milan Babuškov
  • 1
    I like your answer even better than mine, which is not something I say often. Good job! +1 – Chris Jester-Young Oct 12 '08 at 11:11
  • 8
    you actually need an extra set of parentheses around the first argument to contents' constructor with istreambuf_iterator<> to prevent it from being treated as a function declaration. – Greg Rogers Oct 12 '08 at 12:41
  • delete [] missing from char* version? – Shadow2531 Oct 12 '08 at 14:28
  • memblock in the first version should probably be contents. – Roskoto Oct 12 '08 at 16:31
  • 2
    @Shadow2531: I figured it should not be deleted until you're done doing something with it. – Milan Babuškov Oct 12 '08 at 17:00
  • FWIW, I think ios_base format flags like 'binary' and 'ate' etc. should be referenced by ios_base::binary and ios_base::ate etc. I think using ios::binary and ios::ate etc. is deprecated. – Shadow2531 Oct 12 '08 at 20:51
  • @Shadow2531: I tried with fairly recent GCC (4.2.3) but it does not give any deprecation warning. Care to give some URL that talks about it? – Milan Babuškov Oct 13 '08 at 06:00
  • 'Deprecated' might not be the correct term. But, I've been told that ios_base::binary is the proper way and that ios::binary is a left-over pre-Standardization. – Shadow2531 Oct 13 '08 at 11:13
  • To find out for sure, I think you'd have to look in a copy of ISO/IEC 14882. However, fwiw, binary and such is defined under the ios_base class in include\c++\4.2.1-dw2\bits\ios_base.h – Shadow2531 Oct 13 '08 at 11:14
  • Great examples! I'm confused about the second & third examples. If I remove what looks like an extra set of parentheses around the first parameter to the string constructor, it fails to compile. Why are they necessary? – Ferruccio Oct 14 '08 at 01:37
  • 1
    @Ferruccio: please see above comment by Greg Rogers. – Milan Babuškov Oct 14 '08 at 21:07
  • Use std::vector<char> contents(size) rather than char* contents; – Martin York Oct 19 '08 at 18:55
  • To improve your answer, I guess it would be nice to point out the methods loading up the whole file in memory, and those reading it iteratively. I guess the first case is applied to all of the methods you've pointed, right? – Rubens Mar 01 '13 at 02:36
  • I got this warning from icpc: remark #981: operands are evaluated in unspecified order vector vm((std::istreambuf_iterator(vmifs)), std::istreambuf_iterator()); – fchen Mar 16 '13 at 04:59
  • The first solution has two problems. First, you don't give the type of size (should be int, I assume?), and second the char* isn't terminated properly. Here's a tested version: ifstream file("..\\TESAITest\\data\\BlueprintManagerTestData.json", ios::in | ios::binary | ios::ate); if (file.is_open()) { file.seekg(0, ios::end); int size = file.tellg(); char *contents = new char[size+1]; file.seekg(0, ios::beg); file.read(contents, size); file.close(); contents[size] = '\0'; // do something delete[] contents; } – Kevin Dill May 17 '17 at 21:01
  • Make sure to [`#include `](https://stackoverflow.com/a/32654464/7032856) – Nae Nov 26 '19 at 22:00
12

There's another thread on this subject.

My solutions from this thread (both one-liners):

The nice (see Milan's second solution):

string str((istreambuf_iterator<char>(ifs)), istreambuf_iterator<char>());

and the fast:

string str(static_cast<stringstream const&>(stringstream() << ifs.rdbuf()).str());
Community
Konrad Rudolph
  • actually, the first is faster because it operates on the istream buffer directly, and the latter relies on the first but adds some failure status bits. – t.g. Mar 28 '11 at 12:29
  • @t.g. The first uses a very inefficient copy to construct the string without prior allocation, which leads to a lot of re-allocations. The second pre-allocates a buffer of the required size. – Konrad Rudolph Mar 28 '11 at 12:39
  • 1
    I just tested it with VC++10. It actually depends on the file size: the first is faster for smaller files and the second is faster for larger files, which seems to prove what you said. – t.g. Jun 21 '11 at 03:48
  • `string str((istreambuf_iterator<char>(ifs)));` works fine for me, what is the problem with that? – Thomas E Sep 03 '16 at 16:39
  • @ThomasE It uses a non existent [constructor overload](http://en.cppreference.com/w/cpp/string/basic_string/basic_string). No idea why it works on your compiler, or what exactly it does. – Konrad Rudolph Sep 03 '16 at 21:42
  • https://stackoverflow.com/questions/195323/what-is-the-most-elegant-way-to-read-a-text-file-with-c#comment104359393_195350 – Nae Nov 26 '19 at 22:03
4

You seem to speak of elegance as a definite property of "little code". This is of course subjective to some extent. Some would say that omitting all error handling isn't very elegant. Some would say that clear and compact code you understand right away is elegant.

Write your own one-liner function/method which reads the file contents, but make it rigorous and safe underneath the surface and you will have covered both aspects of elegance.
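One possible shape for such a function is sketched below. The name read_file, the chunk size, and the choice to throw std::runtime_error are all assumptions made for illustration; other error policies would work just as well.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the whole file as a string, or throws
// std::runtime_error if the file cannot be opened or a read error occurs.
std::string read_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios_base::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::string contents;
    char chunk[4096];
    // read() fails on the last, partial chunk, so gcount() is checked as well.
    while (in.read(chunk, sizeof chunk) || in.gcount() > 0) {
        contents.append(chunk, static_cast<std::size_t>(in.gcount()));
    }
    if (in.bad()) {  // badbit signals a real I/O error, not just end of file
        throw std::runtime_error("error while reading " + path);
    }
    return contents;
}
```

The call site then stays a one-liner, `std::string text = read_file("text.txt");`, while the checks live inside the function.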

All the best

/Robert

sharkin
  • 3
    Corollary: Elegance is as elegance does; notions of elegant code differ between languages and paradigms. What a C++ programmer might consider elegant could be horrific for a Ruby or Python programmer, and vice-versa. – Rob Oct 12 '08 at 15:50
2

But beware: a C++ string (or more concretely, an STL string) is no more capable of holding a string of arbitrary length than a C string is.

Take a look at the member max_size(), which gives you the maximum number of characters a string can contain. This number is implementation-defined and may differ between platforms. Visual Studio gives a value of about 4 GB for strings, others might give you only 64 KB, and on 64-bit platforms it might give you something really huge. It depends, and of course you will normally run into a bad_alloc exception due to memory exhaustion long before reaching the 4 GB limit...

BTW: max_size() is a member of other STL containers as well. It gives you the maximum number of elements of a certain type (the one you instantiated the container with) that the container will (theoretically) be able to hold.

So, if you're reading from a file of unknown origin you should:
- Check its size and make sure it's smaller than max_size()
- Catch and process bad_alloc-exceptions
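The two checks above might look roughly like this. The function name try_read_file and the choice to report every failure as false are assumptions for the sake of a short example.

```cpp
#include <fstream>
#include <new>
#include <string>

// Hypothetical helper: returns false instead of throwing when the file cannot
// be opened, is too large for std::string, or memory runs out.
bool try_read_file(const std::string& path, std::string& out) {
    std::ifstream in(path.c_str(), std::ios_base::binary | std::ios_base::ate);
    if (!in) return false;

    const std::streamoff size = in.tellg();  // opened at the end: position == size
    if (size < 0) return false;              // tellg() reports failure as -1

    // Check 1: the file must fit into a string at all.
    if (static_cast<unsigned long long>(size) > out.max_size()) return false;

    // Check 2: the allocation itself may still fail.
    try {
        out.resize(static_cast<std::string::size_type>(size));
    } catch (const std::bad_alloc&) {
        return false;
    }

    in.seekg(0, std::ios_base::beg);
    if (size > 0) in.read(&out[0], size);
    return !in.fail();
}
```

A caller can then decide what to do with a file that is too large, for example fall back to processing it in pieces.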

And another point: why are you keen on reading the file into a string? I would expect you to process it further, by parsing it incrementally or something, right? So instead of reading it into a string you might as well read it into a stringstream (which basically is just some syntactic sugar for a string) and do the processing there. But then you could do the processing directly from the file as well, because if properly programmed the stringstream can seamlessly be replaced by a filestream, i.e. by the file itself. Or by any other input stream: they all share the same members and operators and can thus be interchanged seamlessly!

And as for the processing itself: there's a lot you can have the compiler automate for you. E.g. let's say you want to tokenize the string. Given a proper template, the following actions:
- reading from a file (or a string or any other input stream)
- tokenizing the content
- pushing all found tokens into an STL container
- sorting the tokens alphabetically
- eliminating any duplicate values
can all(!!) be achieved in one single(!) line of C++ code (leaving aside the template itself and the error handling). It's just a single call to the function std::copy(). Just google for "token iterator" and you'll get an idea of what I mean. So this appears to me to be even more "elegant" than just reading from a file...

Don Pedro
  • Of note, `max_size()` is defined relative to the size of `size_t`, which is relative to the bit size of your platform. It's defined this way to allow for a string to be as large as your platform can address. – Devin Lane Apr 24 '12 at 23:15
0

I like Milan's char* way, but with std::string.


#include <iostream>
#include <string>
#include <fstream>
#include <cstdlib>
using namespace std;

string& getfile(const string& filename, string& buffer) {
    // ios_base::ate opens the file positioned at its end, so tellg() is the size
    ifstream in(filename.c_str(), ios_base::binary | ios_base::ate);
    // throw ios_base::failure on any error, including a failed open
    in.exceptions(ios_base::badbit | ios_base::failbit | ios_base::eofbit);
    buffer.resize(in.tellg());
    in.seekg(0, ios_base::beg);
    in.read(&buffer[0], buffer.size());
    return buffer;
}

int main(int argc, char* argv[]) {
    if (argc != 2) {
        cerr << "Usage: this_executable file_to_read\n";
        return EXIT_FAILURE;
    }
    string buffer;
    cout << getfile(argv[1], buffer).size() << "\n";
}

(with or without the ios_base::binary, depending on whether you want newlines translated or not. You could also change getfile to just return a string so that you don't have to pass a buffer string in. Then, test to see if the compiler optimizes the copy out when returning.)

However, this might look a little better (and be a lot slower):


#include <iostream>
#include <string>
#include <fstream>
#include <cstdlib>
using namespace std;

string getfile(const string& filename) {
    ifstream in(filename.c_str(), ios_base::binary);
    in.exceptions(ios_base::badbit | ios_base::failbit | ios_base::eofbit);
    return string(istreambuf_iterator<char>(in), istreambuf_iterator<char>());
}

int main(int argc, char* argv[]) {
    if (argc != 2) {
        cerr << "Usage: this_executable file_to_read\n";
        return EXIT_FAILURE;
    }
    cout << getfile(argv[1]).size() << "\n";
}
Shadow2531