685

The contents of file.txt are:

5 3
6 4
7 1
10 5
11 6
12 3
12 4

Where 5 3 is a coordinate pair. How do I process this data line by line in C++?

I am able to get the first line, but how do I get the next line of the file?

ifstream myfile;
myfile.open ("file.txt");
MM1
dukevin

8 Answers

1018

First, make an ifstream:

#include <fstream>
std::ifstream infile("thefile.txt");

The two standard methods are:

  1. Assume that every line consists of two numbers and read token by token:

    int a, b;
    while (infile >> a >> b)
    {
        // process pair (a,b)
    }
    
  2. Line-based parsing, using string streams:

    #include <sstream>
    #include <string>
    
    std::string line;
    while (std::getline(infile, line))
    {
        std::istringstream iss(line);
        int a, b;
        if (!(iss >> a >> b)) { break; } // error
    
        // process pair (a,b)
    }
    

You shouldn't mix (1) and (2): token-based extraction doesn't consume the newline that ends a line, so if you call getline() after token-based extraction has already taken you to the end of a line, you get a spurious empty line.

Michael Mrozek
Kerrek SB
  • @EdwardKarak: I don't understand what "commas as the token" means. Commas don't represent integers. – Kerrek SB Oct 18 '14 at 14:22
  • The OP used a space to delimit the two integers. I wanted to know if while (infile >> a >> b) would work if the OP used a comma as a delimiter, because that is the scenario in my own program – Edward Karak Oct 18 '14 at 14:46
  • @EdwardKarak: Ah, so when you said "token" you meant "delimiter". Right. With a comma, you'd say: `int a, b; char c; while ((infile >> a >> c >> b) && (c == ','))` – Kerrek SB Oct 18 '14 at 15:25
  • @KerrekSB: That would only work if the comma was surrounded by spaces, i.e., "1 , 2". If the line contained "1,2", then your code would try to convert "1,2" into an integer (storing it in a) while c and b would get the tokens/delimiters on the next line. With anything besides whitespace delimiters, you really need to use std::getline() and parse the line. – Mark H Jan 06 '15 at 01:13
  • @KerrekSB: Huh. I was wrong. I didn't know it could do that. I might have some code of my own to rewrite. – Mark H Jan 06 '15 at 15:00
  • For an explanation of the `while(getline(f, line)) { }` construct and regarding error handling please have a look at this (my) article: http://gehrcke.de/2011/06/reading-files-in-c-using-ifstream-dealing-correctly-with-badbit-failbit-eofbit-and-perror/ (I think I do not need to have bad conscience posting this here, it even slightly pre-dates this answer). – Dr. Jan-Philip Gehrcke Jan 18 '15 at 14:15
  • @galois: thatched :-) – Kerrek SB Aug 31 '17 at 17:10
  • enjoy your cold one – galois Aug 31 '17 at 17:55
  • What's the best way to skip "#" commented lines use the first or second approach? Thanks. – elgnoh Oct 20 '17 at 16:24
  • @elgnoh: You can't do it in the first approach, which assumes you're parsing tokens and doesn't know what a "line" is. It's trivial in the second approach, where you just check the first character of the line string (potentially skipping whitespace). – Kerrek SB Oct 20 '17 at 21:42
  • @KerrekSB: One clarification please, in the first approach, as i understand, `>>` returns the reference to the stream object. So the question is when the stream reaches eof what will be returned that makes the while loop break. – Vivek Maran Feb 15 '18 at 21:50
  • @VivekMaran: the stream reaches EOF while reading digits to form the last element. It is in the next round that there are no digits left, and the attempt to read past the end of the stream makes the stream "fail", which is the exit condition for the loop. Here's a [demo](https://wandbox.org/permlink/GpHHJRPsqRT06McN). – Kerrek SB Feb 15 '18 at 22:08
  • @VivekMaran: If you're reading in a way that doesn't "read ahead", like just getting individual characters out, then you never trigger EOF until you actually step over the end of the stream: https://wandbox.org/permlink/oFaYFTFtnEfucaMv – Kerrek SB Feb 15 '18 at 22:10
  • @KerrekSB: Thanks got the EOF part, just figured out that directly using ifstream for a condition check will trigger the `bool` operator that will return `false` if eof is hit, and cause the loop to break. – Vivek Maran Feb 15 '18 at 22:17
  • @VivekMaran: No, the boolean conversion checks for `!fail()`, not `good()`, which differs in the treatment of EOF ([see here](http://en.cppreference.com/w/cpp/io/basic_ios/operator_bool)). – Kerrek SB Feb 15 '18 at 22:20
  • @KerrekSB, what if I need to read input one by one after reading the whole line using getline(). I mean in python language I can read the input from a file line by line in a list(a.k.a array) and then I can iterate over this list to pick the item one at a time and do whatever I want!. how should I handle such condition in c++? any suggestions? – Anu Jan 13 '19 at 05:55
  • @anu: Sure, you can store each line in a container (e.g. a `std::vector`) and then process that container after the loop has finished. This means of course that you need to be able to consume the entire file before proceeding, e.g. you can't be reading lines interactively. (But that's the same in Python I expect.) If performance is a concern, it might be better to read the entire *file* into memory in one step and then just store the location of the line breaks, e.g. as a `std::vector`, but I'd only do that if this is a performance bottleneck. – Kerrek SB Jan 13 '19 at 10:38
  • @anu: Maybe ask a new question? – Kerrek SB Jan 13 '19 at 16:22
  • @KerrekSB, here is the [question](https://stackoverflow.com/q/54174445/6484358), I am trying to solve? Using the above post, but didn't get a clue, how to do it? Any suggestions? – Anu Jan 14 '19 at 00:20
  • @KokHowTeh: can you be more precise? Do you want to parse out a string and an integer? That's indeed beyond iostreams' formatted input (at least in any nice and maintainable way); better to use a regular expression. – Kerrek SB May 19 '20 at 17:48
  • @KerrekSB, I was referring to this sample string: "HelloWorld, 123" – Kok How Teh May 20 '20 at 12:39
  • @KokHowTeh: Well, to parse that, the first variable needs to be a `std::string`: https://wandbox.org/permlink/LFdI2eyYF9klT9HN – Kerrek SB May 20 '20 at 13:32
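
The comment-skipping idea from the comments above (check the first non-whitespace character of each line) could look roughly like this with the line-based approach. This is a sketch, not part of the original answer; the function name `read_pairs_skipping_comments` is invented, and silently ignoring malformed lines is an assumption made to keep it short.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: line-based parsing that skips blank lines and lines
// whose first non-whitespace character is '#'.
std::vector<std::pair<int, int>> read_pairs_skipping_comments(std::istream& in)
{
    std::vector<std::pair<int, int>> pairs;
    for (std::string line; std::getline(in, line); ) {
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#') {
            continue; // blank line or comment
        }
        std::istringstream iss(line);
        int a, b;
        if (iss >> a >> b) {
            pairs.emplace_back(a, b);
        }
        // Lines that are neither comments nor two integers are ignored here
        // (an assumption); a real program might report them instead.
    }
    return pairs;
}
```

Because the pairs end up in a `std::vector`, they can be iterated over after the whole file has been consumed, which is the container-based processing mentioned in the comments.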
197

Use ifstream to read data from a file:

std::ifstream input( "filename.ext" );

If you really need to read line by line, then do this:

for( std::string line; getline( input, line ); )
{
    // ...for each line in input...
}

But you probably just need to extract coordinate pairs:

int x, y;
input >> x >> y;

Update:

In your code you use ofstream myfile;, however the o in ofstream stands for output. If you want to read from the file (input) use ifstream. If you want to both read and write use fstream.
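
To make the input/output distinction concrete, here is a small sketch that writes pairs with `std::ofstream` and reads them back with `std::ifstream`. The function name `round_trip` and the idea of writing a scratch file first are illustrative assumptions, not something from the question.

```cpp
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: writes the pairs to `path`, then reads them back.
std::vector<std::pair<int, int>> round_trip(
    const std::string& path,
    const std::vector<std::pair<int, int>>& pairs)
{
    {
        std::ofstream out(path); // 'o' = output: creates or truncates the file
        for (const auto& p : pairs) {
            out << p.first << ' ' << p.second << '\n';
        }
    } // out goes out of scope here, which flushes and closes the file

    std::vector<std::pair<int, int>> result;
    std::ifstream in(path);      // 'i' = input
    int x, y;
    while (in >> x >> y) {       // loop so that every pair is read, not just one
        result.emplace_back(x, y);
    }
    return result;
}
```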

K-ballo
  • Your solution is a slight improvement: your `line` variable is not visible after the file has been read, in contrast to Kerrek SB's second solution, which is a good and simple solution too. – DanielTuzes Jul 23 '13 at 14:24
  • `getline` is in `string` [see](http://www.cplusplus.com/reference/string/string/getline/), so don't forget the `#include <string>` – mxmlnkn Jul 12 '17 at 23:02
96

Reading a file line by line in C++ can be done in several ways.

[Fast] Loop with std::getline()

The simplest approach is to open an std::ifstream and loop using std::getline() calls. The code is clean and easy to understand.

#include <cstdio>   // printf
#include <fstream>
#include <string>

std::ifstream file(FILENAME);
if (file.is_open()) {
    std::string line;
    while (std::getline(file, line)) {
        // using printf() in all tests for consistency
        printf("%s", line.c_str());
    }
    file.close();
}

[Fast] Use Boost's file_descriptor_source

Another possibility is to use the Boost library, but the code gets a bit more verbose. The performance is quite similar to the code above (Loop with std::getline()).

#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/stream.hpp>
#include <fcntl.h>

namespace io = boost::iostreams;

void readLineByLineBoost() {
    int fdr = open(FILENAME, O_RDONLY);
    if (fdr >= 0) {
        io::file_descriptor_source fdDevice(fdr, io::file_descriptor_flags::close_handle);
        io::stream <io::file_descriptor_source> in(fdDevice);
        if (fdDevice.is_open()) {
            std::string line;
            while (std::getline(in, line)) {
                // using printf() in all tests for consistency
                printf("%s", line.c_str());
            }
            fdDevice.close();
        }
    }
}

[Fastest] Use C code

If performance is critical for your software, you may consider using C-style I/O. This code can be 4-5 times faster than the C++ versions above; see the benchmark below. Note that getline() as used here is a POSIX function, not part of standard C.

#include <stdio.h>   // fopen, getline (POSIX), printf, fclose
#include <stdlib.h>  // exit, free

FILE* fp = fopen(FILENAME, "r");
if (fp == NULL)
    exit(EXIT_FAILURE);

char* line = NULL;
size_t len = 0;
while ((getline(&line, &len, fp)) != -1) {
    // using printf() in all tests for consistency
    printf("%s", line);
}
fclose(fp);
if (line)
    free(line);

Benchmark -- Which one is faster?

I have done some performance benchmarks with the code above and the results are interesting. I have tested the code with ASCII files that contain 100,000 lines, 1,000,000 lines and 10,000,000 lines of text. Each line of text contains 10 words on average. The program is compiled with -O3 optimization and its output is forwarded to /dev/null in order to remove the logging time variable from the measurement. Last, but not least, each piece of code logs each line with the printf() function for consistency.

The results show the time (in ms) that each piece of code took to read the files.

The performance difference between the two C++ approaches is minimal and shouldn't make any difference in practice. The performance of the C code is what makes the benchmark impressive and can be a game changer in terms of speed.

                             10K lines     100K lines     1000K lines
Loop with std::getline()         105ms          894ms          9773ms
Boost code                       106ms          968ms          9561ms
C code                            23ms          243ms          2397ms


incarnadine
HugoTeixeira
  • What happens if you remove C++'s synchronization with C on the console outputs? You might be measuring a known disadvantage of the default behavior of `std::cout` vs `printf`. – user4581301 Jul 30 '18 at 20:41
  • Thanks for bringing this concern. I've redone the tests and the performance is still the same. I have edited the code to use the `printf()` function in all cases for consistency. I have also tried using `std::cout` in all cases and this made absolutely no difference. As I have just described in the text, the output of the program goes to `/dev/null` so the time to print the lines is not measured. – HugoTeixeira Jul 31 '18 at 02:11
  • Groovy. Thanks. Wonder where the slowdown is. – user4581301 Jul 31 '18 at 04:34
  • Hi @HugoTeixeira I know this is an old thread, I tried to replicate your results and could not see any significant difference between c and c++ https://github.com/simonsso/readfile_benchmarks – Simson Feb 03 '19 at 05:24
  • @Fareanor That's not correct. It only affects the *standard* C++ streams, `std::ifstream file` is not one of them. https://en.cppreference.com/w/cpp/io/ios_base/sync_with_stdio – user202729 Jun 15 '20 at 01:16
  • @HugoTeixeira Please, may you tell me what function/method used to calculate and time this. And what graphing plot system used? Any advice to re-create this graph and timing method would be helpful. – ABC Mar 17 '21 at 00:37
  • If reading from the file is not the bottleneck, then the C++ version will do fine. In my code the bottleneck is updating statistics. – StudySmarterNotHarder May 12 '21 at 02:49
14

Since your coordinates belong together as pairs, why not write a struct for them?

struct CoordinatePair
{
    int x;
    int y;
};

Then you can write an overloaded extraction operator for istreams:

std::istream& operator>>(std::istream& is, CoordinatePair& coordinates)
{
    is >> coordinates.x >> coordinates.y;

    return is;
}

And then you can read a file of coordinates straight into a vector like this:

#include <algorithm> // std::copy
#include <fstream>
#include <iostream>  // std::cerr
#include <iterator>
#include <vector>

int main()
{
    char filename[] = "coordinates.txt";
    std::vector<CoordinatePair> v;
    std::ifstream ifs(filename);
    if (ifs) {
        std::copy(std::istream_iterator<CoordinatePair>(ifs), 
                std::istream_iterator<CoordinatePair>(),
                std::back_inserter(v));
    }
    else {
        std::cerr << "Couldn't open " << filename << " for reading\n";
    }
    // Now you can work with the contents of v
}
Martin Broadhurst
  • What happens when it's not possible to read two `int` tokens from the stream in `operator>>`? How can one make it work with a backtracking parser (i.e. when `operator>>` fails, roll back the stream to previous position and return false or something like that)? – fferri Dec 01 '16 at 13:31
  • If it's not possible to read two `int` tokens, then the `is` stream will evaluate to `false` and the reading loop will terminate at that point. You can detect this within `operator>>` by checking the return value of the individual reads. If you want to roll back the stream, you would call `is.clear()`. – Martin Broadhurst Jan 07 '17 at 14:10
  • in the `operator>>` it is more correct to say `is >> std::ws >> coordinates.x >> std::ws >> coordinates.y >> std::ws;` since otherwise you are assuming that your input stream is in the whitespace-skipping mode. – Darko Veberic Mar 27 '17 at 17:55
8

Expanding on the accepted answer, if the input is:

1,NYC
2,ABQ
...

you will still be able to apply the same logic, like this:

#include <fstream>
#include <iostream>
#include <string>

std::ifstream infile("thefile.txt");
if (infile.is_open()) {
    int number;
    std::string str;
    char c;
    while (infile >> number >> c >> str && c == ',')
        std::cout << number << " " << str << "\n";
}
infile.close();
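
Note that `infile >> str` stops at the first whitespace, so a record such as `1,New York` would be cut short. A line-based variant that takes everything after the comma as text might look like the sketch below; the function name `read_records` is invented, and dropping malformed lines is an assumption.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses lines of the form "<int>,<text>", where the
// text may contain spaces.
std::vector<std::pair<int, std::string>> read_records(std::istream& in)
{
    std::vector<std::pair<int, std::string>> records;
    for (std::string line; std::getline(in, line); ) {
        std::istringstream iss(line);
        int number;
        char comma;
        std::string text;
        // Read the number and the comma, then take the rest of the line.
        // Lines that do not match, or that have nothing after the comma,
        // are skipped (an assumption).
        if (iss >> number >> comma && comma == ',' && std::getline(iss, text)) {
            records.emplace_back(number, text);
        }
    }
    return records;
}
```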
gsamaras
3

Although there is no need to close the file manually, it is a good idea to do so if the scope of the file variable is larger than the code that reads from it:

    ifstream infile(szFilePath);

    for (string line = ""; getline(infile, line); )
    {
        //do something with the line
    }

    if(infile.is_open())
        infile.close();
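
An alternative to the explicit close() is to keep the stream's scope small, since the `std::ifstream` destructor closes the file. A minimal sketch, with the hypothetical helper name `count_lines`:

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: the stream lives only inside this function, so its
// destructor closes the file on return without an explicit close().
// Returns 0 if the file cannot be opened.
std::size_t count_lines(const std::string& path)
{
    std::ifstream infile(path);
    std::size_t count = 0;
    for (std::string line; std::getline(infile, line); ) {
        ++count;
    }
    return count;
}
```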
Vijay Bansal
  • Not sure this deserved a down vote. OP asked for a way to get each line. This answer does that and gives a great tip of making sure the file closes. For a simple program it may not be needed but at minimum a GREAT habit to form. It could maybe be improved by adding in a few lines of code to process the individual lines it pulls but overall is the simplest answer to the OPs question. – Xandor Sep 18 '19 at 18:22
3

This answer is for Visual Studio 2017, for the case where you want to read from a text file whose location is relative to your compiled console application.

First, put your text file (test.txt in this case) into your solution folder. After compiling, keep the text file in the same folder as applicationName.exe.

C:\Users\"username"\source\repos\"solutionName"\"solutionName"

#include <cstdlib>   // exit
#include <fstream>
#include <iostream>
#include <string>

using namespace std;
int main()
{
    ifstream inFile;
    // open the file stream
    inFile.open(".\\test.txt");
    // check if opening a file failed
    if (inFile.fail()) {
        cerr << "Error opening the file" << endl;
        inFile.close();
        exit(1);
    }
    string line;
    while (getline(inFile, line))
    {
        cout << line << endl;
    }
    // close the file stream
    inFile.close();
}
Universus
1

This is a general solution to loading data into a C++ program, and uses the getline function. This could be modified for CSV files, but the delimiter is a space here.

// needs <cctype>, <fstream> and <string>; assumes using namespace std
const int n = 5, p = 2; // number of rows and columns expected in data.txt

double X[n][p];

ifstream myfile;

myfile.open("data.txt");

string line;
string temp = "";
int a = 0; // row index

while (a < n && getline(myfile, line)) { // while there is a line and room for it
     int b = 0; // column index
     for (size_t i = 0; i < line.size(); i++) { // for each character in the line
          if (!isblank(line[i])) { // if it is not blank, do this
              temp += line[i]; // append the character to the current token
          } else if (!temp.empty()) { // a blank ends a non-empty token
              if (b < p) {
                  X[a][b] = stod(temp);  // convert string to double
              }
              temp = ""; // reset the capture
              b++; // increment b because we have a new number
          }
     }

     if (!temp.empty() && b < p) { // last token on the line
          X[a][b] = stod(temp);
     }
     temp = "";
     a++; // on to the next row
}
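
When the number of rows and columns is not known in advance, the same idea can be expressed with `std::istringstream` and nested vectors instead of a fixed-size array and character-by-character splitting. This is a sketch of that alternative, not the code above; the name `read_matrix` is made up.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated numbers, one row per line.
// Empty lines are skipped; rows may have different lengths.
std::vector<std::vector<double>> read_matrix(std::istream& in)
{
    std::vector<std::vector<double>> rows;
    for (std::string line; std::getline(in, line); ) {
        std::istringstream iss(line);
        std::vector<double> row;
        for (double value; iss >> value; ) {
            row.push_back(value);
        }
        if (!row.empty()) {
            rows.push_back(row);
        }
    }
    return rows;
}
```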
mjr2000