12

I gave an answer which I wanted to check the validity of stream each time through a loop here.

My original code used good and looked similar to this:

ifstream foo("foo.txt");

while (foo.good()){
    string bar;
    getline(foo, bar);
    cout << bar << endl;
}

I was immediately pointed here and told to never test good. Clearly this is something I haven't understood but I want to be doing my file I/O correctly.

I tested my code out with several examples and couldn't make the good-testing code fail.

First (this printed correctly, ending with a new line):

bleck 1
blee 1 2
blah
ends in new line

Second (this printed correctly, ending in with the last line):

bleck 1
blee 1 2
blah
this doesn't end in a new line

Third was an empty file (this printed correctly, a single newline.)

Fourth was a missing file (this correctly printed nothing.)

Can someone help me with an example that demonstrates why good-testing shouldn't be done?

Community
  • 1
  • 1
Jonathan Mee
  • 35,107
  • 16
  • 95
  • 241
  • You should use `while (getline(foo, bar)) {..}` instead. – Neil Kirk Feb 03 '15 at 15:14
  • @NeilKirk That depends on preference really. Using `getline` for the loop condition will output nothing for an empty file, and chop the last trailing newline. So examples 2 and 3 would print differently. If *n* is the number of `\n`s in a file, then my personal preference is that the file be represented as having *n + 1* lines. The comments on [Nathan Oliver](http://stackoverflow.com/users/4342498/nathanoliver)'s answer address this in some depth. – Jonathan Mee Feb 03 '15 at 15:21
  • 1
    I assumed that you didn't want an extra new line output. If you do, it looks correct. – Neil Kirk Feb 03 '15 at 15:26

5 Answers5

6

They were wrong. The mantra is 'never test .eof()'.

Even that mantra is overboard, because both are useful to diagnose the state of the stream after an extraction failed.

So the mantra should be more like

Don't use good() or eof() to detect eof before you try to read any further

Same for fail(), and bad()

Of course stream.good can be usefully employed before using a stream (e.g. in case the stream is a filestream which has not been successfully opened)

However, both are very very very often abused to detect the end of input, and that's not how it works.


A canonical example of why you shouldn't use this method:

std::istringstream stream("a");
char ch;
if (stream >> ch) {
   std::cout << "At eof? " << std::boolalpha << stream.eof() << "\n";
   std::cout << "good? " << std::boolalpha << stream.good() << "\n";
}

Prints

false
true

See it Live On Coliru

Community
  • 1
  • 1
sehe
  • 328,274
  • 43
  • 416
  • 565
  • @doc Subtle minimalism. I agree, although people tend to think "obviously `good()` returns true here, it's brand new fresh stream". So, that's why I did one read. – sehe Feb 03 '15 at 13:55
2

This is already covered in other answers, but I'll go over it briefly for completeness. The only functional difference with

while(foo.good()) { // effectively same as while(foo) {
    getline(foo, bar);
    consume(bar); // consume() represents any operation that uses bar
}

And

while(getline(foo, bar)){
    consume(bar);
}

Is that the former will do an extra loop when there are no lines in the file, making that case indistinguishable from the case of one empty line. I would argue that this is not typically desired behaviour. But I suppose that's matter of opinion.

As sehe says, the mantra is overboard. It's a simplification. What really is the point is that you must not consume() the result of reading the stream before you test for failure or at least EOF (and any test before the read is irrelevant). Which is what people easily do when they test good() in the loop condition.

However, the thing about getline(), is that it tests EOF internally, for you and returns an empty string even if only EOF is read. Therefore, the former version could maybe be roughly the similar to following pseudo c++:

while(foo.good()) {
    // inside getline
    bar = "";               // Reset bar to empty
    string sentry;
    if(read_until_newline(foo, sentry)) {
        // The streams state is tested implicitly inside getline
        // after the value is read. Good
        bar = sentry        // The read value is used only if it's valid.
    // ...                  // Otherwise, bar is empty.
    consume(bar);
}

I hope that illustrates what I'm trying to say. One could say that there is a "correct" version of the read loop inside getline(). This is why the rule is at least partially satisfied by the use of readline even if the outer loop doesn't conform.

But, for other methods of reading, breaking the rule hurts more. Consider:

while(foo.good()) {
    int bar;
    foo >> bar;
    consume(bar);
}

Not only do you always get the extra iteration, the bar in that iteration is uninitialized!

So, in short, while(foo.good()) is OK in your case, because getline() unlike certain other reading functions, leaves the output in a valid state after reading EOF bit. and because you don't care or even do expect the extra iteration when the file is empty.

eerorika
  • 181,943
  • 10
  • 144
  • 256
  • 1
    "echo" always adds a newline so you aren't just printing what you input: http://stackoverflow.com/q/15728703/2642059 In my mind I want what would be printed to mirror what I'd see in a text editor. An empty file does show one empty line. That's what I want. A file that ends in a newline shows a final empty line. That's what I want. I don't get those when I use `getline` as my loop condition. Now as you say it's a matter of opinion. I think that since we know this we're safe from goofing. – Jonathan Mee Feb 03 '15 at 18:56
1

both good() and eof() will both give you an extra line in your code. If you have a blank file and run this:

std::ifstream foo1("foo1.txt");
std::string line;
int lineNum = 1;

std::cout << "foo1.txt Controlled With good():\n";
while (foo1.good())
{
    std::getline(foo1, line);
    std::cout << lineNum++ << line << std::endl;
}
foo1.close();
foo1.open("foo1.txt");
lineNum = 1;

std::cout << "\n\nfoo1.txt Controlled With getline():\n";
while (std::getline(foo1, line))
{
    std::cout << line << std::endl;
}

The output you will get is

foo1.txt Controlled With good():
1

foo1.txt Controlled With getline():

This proves that it isn't working correctly since a blank file should never be read. The only way to know that is to use a read condition since the stream will always be good the first time it reads.

NathanOliver
  • 150,499
  • 26
  • 240
  • 331
  • That's actually the behavior I was expecting though? You see that in my question: "Third was an empty file (this printed correctly, a single newline.)" But maybe that was the pitfall I was being warned of? – Jonathan Mee Feb 03 '15 at 13:32
  • 1
    But you don't want a single newline. If you were adding the data to a vector you would have a vector with a size of 1. You should not have that though with an empty file. – NathanOliver Feb 03 '15 at 13:33
  • I guess this is more of a, "What did you expect?"-thing. I expect a loop controlled with `good` to give me an empty line if there's nothing on the last line of the file. I expect the `getline` method to not give me an empty line for an empty file **or** a file that ends in a newline. That makes example 1 print differently when controlled by `getline`. – Jonathan Mee Feb 03 '15 at 13:39
  • But you will always get a line. If you have a file with 20 lines and you add them to a vector the vector will actually have 21 elements in it with the last one being blank when using `good()` – NathanOliver Feb 03 '15 at 13:56
  • That's not true. Example 2 shows a file with 4 lines the loop runs 4 times (This is true for both loops controlled with `good` and `getline`.) Example 1 shows a file with 5 lines loops controlled with `good` will run 5 times (loops controlled with `getline` will only run 4.) – Jonathan Mee Feb 03 '15 at 14:04
1

Using foo.good() just tells you that the previous read operation worked just fine and that the next one might as well work. .good() checks the state of the stream at a given point. It does not check if the end of the file is reached. Lets say something happened while the file was being read (network error, os error, ...) good will fail. That does not mean the end of the file was reached. Nevertheless .good() fails when end of file is reached because the stream is not able to read anymore.

On the other hand, .eof() checks if the end of file was truly reached.

So, .good() might fail while the end of file was not reached.

Hope this helps you understand why using .good() to check end of file is a bad habit.

7hi4g0
  • 3,239
  • 3
  • 18
  • 32
  • Ummm... Wouldn't I want to break out of the loop if there is a network error or something? Seems like you're answer is saying that using `good` is the right thing to do, but then you say: "using `.good()` to check end of file is a bad habit." Can you clarify? Also, `good` is set to false when the EOF is encountered: http://www.cplusplus.com/reference/ios/ios/good/ – Jonathan Mee Feb 03 '15 at 14:13
0

Let me clearly say that sehe's answer is the correct one.

But the option proposed by, Nathan Oliver, Neil Kirk, and user2079303 is to use readline as the loop condition rather than good. Needs to be addressed for the sake of posterity.

We will compare the loop in the question to the following loop:

string bar;

while (getline(foo, bar)){
    cout << bar << endl;
}

Because getline returns the istream passed as the first argument, and because when an istream is cast to bool it returns !(fail() || bad()), and since reading the EOF character will set both the failbit and the eofbit this makes getline a valid loop condition.

The behavior does change however when using getline as a condition because if a line containing only an EOF character is read the loop will exit preventing that line from being outputted. This doesn't occur in Examples 2 and 4. But Example 1:

bleck 1
blee 1 2
blah
ends in new line

Prints this with the good loop condition:

bleck 1
blee 1 2
blah
ends in new line

But chops the last line with the getline loop condition:

bleck 1
blee 1 2
blah
ends in new line

Example 3 is an empty file:

Prints this with the good condition:

Prints nothing with the getline condition.

Neither of these behaviors are wrong. But that last line can make a difference in code. Hopefully this answer will be helpful to you when deciding between the two for coding purposes.

Community
  • 1
  • 1
Jonathan Mee
  • 35,107
  • 16
  • 95
  • 241