92

Possible Duplicate:
What is the best way to slurp a file into a std::string in c++?

In scripting languages like Perl, it is possible to read a file into a variable in one shot.

    open(FILEHANDLE,$file);
    local $/;               # undefine the record separator so the whole file is read
    $content=<FILEHANDLE>;

What would be the most efficient way to do this in C++?

sonofdelphi

7 Answers

207

Like this:

#include <fstream>
#include <iterator>
#include <string>

int main(int argc, char** argv)
{

  std::ifstream ifs("myfile.txt");
  std::string content( (std::istreambuf_iterator<char>(ifs) ),
                       (std::istreambuf_iterator<char>()    ) );

  return 0;
}

The statement

  std::string content( (std::istreambuf_iterator<char>(ifs) ),
                       (std::istreambuf_iterator<char>()    ) );

can be split into

std::string content;
content.assign( (std::istreambuf_iterator<char>(ifs) ),
                (std::istreambuf_iterator<char>()    ) );

which is useful if you want to just overwrite the value of an existing std::string variable.
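Several comments below ask how to tell whether the read worked and how to avoid newline translation. A minimal sketch, assuming a hypothetical helper named `read_file` (not part of the original answer), wraps the same iterator pair in a function, opens the file in binary mode, and reports a failed open through its return value. Note that `std::istreambuf_iterator` works on the stream buffer directly, so it does not set the stream's error flags; this sketch therefore only detects a failure to open.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from the original answer): reads the whole file
// at `path` into `out`. Returns false if the file could not be opened.
bool read_file(const std::string& path, std::string& out)
{
    // Binary mode is an assumption here: it avoids \r\n -> \n translation.
    std::ifstream ifs(path.c_str(), std::ios::binary);
    if (!ifs.is_open())
        return false;

    // No extra parentheses are needed: this is a function call,
    // not a declaration, so the most vexing parse cannot occur.
    out.assign(std::istreambuf_iterator<char>(ifs),
               std::istreambuf_iterator<char>());
    return true;
}
```

A caller would then write something like `std::string content; if (!read_file("myfile.txt", content)) { /* handle the error */ }`.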

Martijn Pieters
Maik Beckmann
  • 6
    +1 Very C++ idiomatic. In Linux with gcc 4.4 the resulting system calls are efficient, the file is read 8k at a time. – piotr May 26 '10 at 12:00
  • 3
    If the file size is known, the `std::string::reserve` method can be called before reading the file to allocate space. This should speed up the execution. Much time is lost by reallocating memory for the string. – Thomas Matthews May 26 '10 at 17:11
  • 6
    +1, but why do the iterators *have* to be enclosed in parentheses? They seem innocuous but it won't compile without them. – Qix - MONICA WAS MISTREATED Jan 23 '14 at 04:57
  • 9
    Qix: it's the "classic" c++ parsing problem called Most Vexing Parse: http://en.wikipedia.org/wiki/Most_vexing_parse – fileoffset Feb 25 '14 at 01:15
  • I tried the above `content.assign( (std::istreambuf_iterator<char>(ifs) ),` command. It works and reads the whole file, but when I display the contents it shows an invalid character at the beginning. Why? Help. `string content; ifstream myfile("textFile.txt"); content.assign( (istreambuf_iterator<char>(myfile) ), (istreambuf_iterator<char>() ) ); cout<` – user1155921 Jun 06 '15 at 07:22
  • In VS2010, how to pass this "content" to a function definition for example AtoB(std::string); Any further example? – AskMe Jan 14 '17 at 08:50
  • @MartijnPieters why the second parameter is an empty iterator and not related to ifs? I guess you are using this: [basic_string& assign (InputIterator first, InputIterator last)](http://www.cplusplus.com/reference/string/basic_string/assign/) – aburbanol Mar 20 '17 at 13:34
  • @aburbanol: I'm not the author of this post; I only dealt with cleaning up a bad edit here. – Martijn Pieters Mar 20 '17 at 13:36
  • Is there any way to check the read was successful (or not) with this method? – j b Jul 25 '17 at 11:26
  • It seems \r\n is translated into \n, how to avoid this? – user1633272 Nov 07 '17 at 13:03
  • Why the extra parentheses around iterators? Makes the code more messy IMO :) – juzzlin Oct 18 '18 at 13:27
  • This compiles without parentheses at least on `GCC 5.5.0` and `GCC 7.3.0` on my Ubuntu 18.04. – juzzlin Oct 18 '18 at 13:48
  • I was wondering about `std::istreambuf_iterator()`, but this is the `end-of-stream iterator`. https://en.cppreference.com/w/cpp/iterator/istreambuf_iterator – Markus Dutschke Nov 13 '20 at 08:23
  • It will throw the exception `basic_filebuf:underflow` if the file is on a USB disk and the disk is pulled out. How to deal with it? – Shun Dec 22 '20 at 07:37
42

The most efficient way, though not the C++ way, would be:

   FILE* f = fopen(filename, "rb");   // binary mode, as used in the benchmark below

   // Determine file size
   fseek(f, 0, SEEK_END);
   size_t size = ftell(f);

   char* where = new char[size];

   rewind(f);
   fread(where, sizeof(char), size, f);
   fclose(f);

   delete[] where;

#EDIT - 2

Just tested the std::filebuf variant also. Looks like it can be called the best C++ approach, even though it's not quite a C++ approach, but more a wrapper. Anyway, here is the chunk of code that works almost as fast as plain C does.

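   std::ifstream file(filename, std::ios::binary);
   std::streambuf* raw_buffer = file.rdbuf();

   // Determine file size by seeking to the end of the stream buffer
   // (assumed; the original snippet did not show where size comes from)
   std::streamsize size = raw_buffer->pubseekoff(0, std::ios::end, std::ios::in);
   raw_buffer->pubseekpos(0, std::ios::in);

   char* block = new char[size];
   raw_buffer->sgetn(block, size);
   delete[] block;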

I've done a quick benchmark here and the results are as follows. The test read a 65536K binary file with the appropriate modes (std::ios::binary and rb).

[==========] Running 3 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 4 tests from IO
[ RUN      ] IO.C_Kotti
[       OK ] IO.C_Kotti (78 ms)
[ RUN      ] IO.CPP_Nikko
[       OK ] IO.CPP_Nikko (106 ms)
[ RUN      ] IO.CPP_Beckmann
[       OK ] IO.CPP_Beckmann (1891 ms)
[ RUN      ] IO.CPP_Neil
[       OK ] IO.CPP_Neil (234 ms)
[----------] 4 tests from IO (2309 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test case ran. (2309 ms total)
[  PASSED  ] 4 tests.
M. Williams
  • +1 this was going to be my answer. Although, you should add the allocation of `where` for the example to be more clear: `char *where = malloc(sizeof(char) * size);` – Felix May 26 '10 at 11:56
  • As far as I know, there is a chance that std::filebuf in the C++ library is more efficient – Nikko May 26 '10 at 11:57
  • 2
    Nice benchmark, I'm not surprised by the numbers. For max performance on plain ascii files using good old C io is the way to go. C++ streams are just no match. However, they are less error prone. As long as they are not showing up when profiling I'd prefer using them. – Maik Beckmann May 26 '10 at 12:40
  • You have to try with std::filebuf s :) – Nikko May 26 '10 at 12:56
  • 2
    Wow, it's actually cool. I don't know why, but at first I didn't trust you. Looks like the best way to combine `iostream` functionality and raw C file reading speed. – M. Williams May 26 '10 at 13:13
  • 1
    Heh.. plain C is STILL faster ;) – Felix May 26 '10 at 17:47
  • Can you compare using `getline(file, string, string::traits_type::to_char_type(string::traits_type::eof())`? – Steven Lu Jul 03 '13 at 00:04
  • Awesome speed with the C method! Thanks. – Raj Dec 03 '13 at 21:38
  • 1
    @Constantino, your method of determining file length is improper. Although the fseek/rewind combination works, the proper way is filling a stat struct and extracting the st_size member. It is better to be on the safe side. – Bulat M. Oct 27 '16 at 05:34
  • 3
    how do you find size here?! – Urvashi Gupta Jul 30 '18 at 23:44
13

The most efficient is to create a buffer of the correct size and then read the file into the buffer.

#include <fstream>
#include <vector>

int main()
{
    std::ifstream       file("Plop", std::ios::binary);  // binary mode: no newline translation to skew the byte count
    if (file)
    {
        /*
         * Get the size of the file
         */
        file.seekg(0,std::ios::end);
        std::streampos          length = file.tellg();
        file.seekg(0,std::ios::beg);

        /*
         * Use a vector as the buffer.
         * It is exception safe and will be tidied up correctly.
         * This constructor creates a buffer of the correct length;
         * its elements are value-initialized (zeroed).
         *
         * Then read the whole file into the buffer.
         */
        std::vector<char>       buffer(length);
        if (!buffer.empty())
            file.read(&buffer[0],length);
    }
}
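As the comments below point out, the value returned by `tellg` may not match the number of characters that can actually be read. One way to hedge against that, sketched here with a hypothetical helper `read_into_vector` (an illustration, not the answerer's code), is to treat the `tellg` result only as a size estimate and then shrink the buffer to what `gcount()` reports was really read.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads the file at `path` into `buffer`.
// Returns false if the file cannot be opened or its size cannot be determined.
bool read_into_vector(const char* path, std::vector<char>& buffer)
{
    std::ifstream file(path, std::ios::binary);
    if (!file)
        return false;

    file.seekg(0, std::ios::end);
    const std::streamoff length = file.tellg();   // -1 on failure
    if (length < 0)
        return false;
    file.seekg(0, std::ios::beg);

    // Allocate the estimated size, then read in one call.
    buffer.resize(static_cast<std::size_t>(length));
    if (length > 0)
        file.read(&buffer[0], static_cast<std::streamsize>(length));

    // Keep only the characters that were actually read.
    buffer.resize(static_cast<std::size_t>(file.gcount()));
    return !file.bad();
}
```

If the estimate is too large, the trailing bytes are dropped; if it is too small, this sketch silently reads less than the whole file, so it is still only as good as the estimate.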
Martin York
  • Benchmarks? Or even strace... (not that I don't believe this is the fastest, I do wonder whether it is actually any different from the iterator-based approach) – Tronic Feb 15 '11 at 21:31
  • 3
    This method is not guaranteed to work. `tellg` is not specified to return the offset into the file in bytes - it's just an opaque token. [See this answer](http://stackoverflow.com/a/22986486/1505939) for a more detailed explanation. – M.M Nov 26 '15 at 05:09
  • In text mode, on an operating system that performs file translations, it is likely that the result of `tellg` will not match the number of characters available to be read – M.M Nov 26 '15 at 05:10
8

This reads everything up to the first \0, and there should be no \0 in text files.

#include<iostream>
#include<fstream>
#include<string>

using namespace std;

int main(){
  fstream f(FILENAME, fstream::in );
  string s;
  getline( f, s, '\0');

  cout << s << endl;
  f.close();
}
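The disagreement in the comments below (whole file versus "only until \0") comes down to the delimiter. A small sketch, using a hypothetical function `read_until_nul` and in-memory streams instead of a real file, shows both behaviours: newlines are kept, but extraction stops at the first embedded `\0`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns everything std::getline extracts from `in`
// up to (not including) the first '\0', or up to end of input.
std::string read_until_nul(std::istream& in)
{
    std::string s;
    std::getline(in, s, '\0');
    return s;
}
```

For a stream holding `"line one\nline two\n"` the function returns the full text, newlines included. For a stream built from the five bytes `a b \0 c d` (written without the spaces) it returns only `"ab"`, which is why this approach is unsuitable for binary data.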
Draco Ater
  • 7
    The question didn't mention text files. –  May 26 '10 at 11:51
  • 3
    -1 This example only reads one line, I have to wonder about the moderations. – piotr May 26 '10 at 11:58
  • 4
    @piotr This example reads the whole text file, it is tested. – Draco Ater May 26 '10 at 12:03
  • I think everybody assume it would be a text file but it's true it may not be the case. As far as the code goes: maybe it's clearer to directly use ifstream( "filename" ). You don't need to close the file, it is done automatically. And it does read the text file. – Nikko May 26 '10 at 12:04
  • @Draco Ater I tested with a binary file, probably it did read only until \0. The point is that this example is going to process every character, I prefer iterator based solutions that may be more efficient. – piotr May 27 '10 at 05:10
  • This works beautifully. It's the most relevant answer in this entire thread in my opinion. Stores the whole shebang in a C++ string, not a char array. Thanks Draco! – Martyn Chamberlin Apr 23 '15 at 05:03
  • @MartynChamberlin the other solutions which store in a `vector` could be modified to store in a `string` by changing the text `vector` to `string` – M.M Nov 26 '15 at 05:15
  • This is the simplest (and best) solution here. +1 – UserX May 30 '19 at 19:55
3

Maybe not the most efficient, but it reads the data in one line:

#include<iostream>
#include<vector>
#include<iterator>

int main(int argc,char *argv[]){
  // read standard input into vector (istream_iterator skips whitespace);
  // the extra parentheses avoid the most vexing parse:
  std::vector<char>v((std::istream_iterator<char>(std::cin)),
                     std::istream_iterator<char>());
  std::cout << "read " << v.size() << " chars\n";
}
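One caveat with `std::istream_iterator<char>` is that it uses formatted extraction, so whitespace is skipped by default, whereas `std::istreambuf_iterator<char>` copies every character. The sketch below contrasts the two on the same input; the function names `read_formatted` and `read_raw` are made up for this illustration.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Formatted iteration: operator>> skips spaces, tabs and newlines by default.
std::vector<char> read_formatted(std::istream& in)
{
    return std::vector<char>((std::istream_iterator<char>(in)),
                             std::istream_iterator<char>());
}

// Unformatted iteration: every character in the stream buffer is copied.
std::vector<char> read_raw(std::istream& in)
{
    return std::vector<char>((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());
}
```

Given the five-character input `"a b\nc"`, `read_formatted` yields 3 characters and `read_raw` yields 5, so the formatted variant does not give a faithful copy of a file.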
catwalk
3

This depends on a lot of things, such as what is the size of the file, what is its type (text/binary) etc. Some time ago I benchmarked the following function against versions using streambuf iterators - it was about twice as fast:

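#include <fstream>
#include <string>
#include <vector>

// Fill buff from the stream and return the number of chars actually read.
unsigned int FileRead( std::istream & is, std::vector <char> & buff ) {
    is.read( &buff[0], buff.size() );
    return is.gcount();
}

// Append the whole stream to s, one chunk at a time.
void FileRead( std::ifstream & ifs, std::string & s ) {
    const unsigned int BUFSIZE = 64 * 1024; // reasonably sized buffer
    std::vector <char> buffer( BUFSIZE );

    while( unsigned int n = FileRead( ifs, buffer ) ) {
        s.append( &buffer[0], n );
    }
}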
2

Here's an iterator-based method.

// assumes <fstream>, <string>, <iterator>, <algorithm> and using namespace std
ifstream file("file", ios::binary);
string fileStr;

istreambuf_iterator<char> inputIt(file), emptyInputIt;
back_insert_iterator<string> stringInsert(fileStr);

copy(inputIt, emptyInputIt, stringInsert);
academicRobot