Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

Question

I just found a comment in this answer saying that using iostream::eof in a loop condition is "almost certainly wrong". I generally use something like while(cin>>n) - which I guess implicitly checks for EOF.

Why is checking for eof explicitly using while (!cin.eof()) wrong?

How is it different from using scanf("...",...)!=EOF in C (which I often use with no problems)?

`scanf(...) != EOF` won't work in C either, because `scanf` returns the number of fields successfully parsed and assigned. The correct condition is `scanf(...) < n` where `n` is the number of fields in the format string. — Ben Voigt, Apr 05 '12 at 16:50
@Ben Voigt, it will return a negative number (which EOF usually is defined as such) in case EOF is reached — Sebastian, Nov 23 '12 at 23:44
@SebastianGodelet: Actually, it will return `EOF` if end of file is encountered before the first field conversion (successful or not). If end-of-file is reached between fields, it will return the number of fields succcessfully converted and stored. Which makes comparison to `EOF` wrong. — Ben Voigt, Nov 24 '12 at 15:06
@SebastianGodelet: No, not really. He errs when he says that "past the loop there is no (easy) way to distinguish a proper input from an improper one". In fact it's as easy as checking `.eof()` after the loop exits. — Ben Voigt, Nov 24 '12 at 16:52
@Ben Yes, for this case (reading a simple int). But one can easily come up with a scenario where `while(fail)` loop terminates with both an actual failure and an eof. Think about if you require 3 ints per iteration (say you are reading an x-y-z point or something), but there is, erroneously, only two ints in the stream. — sly, Nov 24 '12 at 19:47
This issue is analogous to and has the same answer as the C question: [Why `while(!feof(file))` is always wrong?](http://stackoverflow.com/questions/5431941/while-feof-file-is-always-wrong). Because the flag is set only _after_ hitting the EOF. — legends2k, Mar 21 '14 at 09:25
[Here's C++ FAQ's take](http://www.parashift.com/c++-faq/istream-and-eof.html) on the same question. — legends2k, Mar 21 '14 at 09:33
@sly: is that "3-ints" scenario not correctly handled by `while (in >> x) { if (in >> y >> z) use(x, y, z); else FATAL("got an int not followed by 2 more!"); } if (!eof()) FATAL("didn't get integer where expected");`? If not, for what stream content would that *not* work nicely? — Tony Delroy, Nov 07 '14 at 10:36
This question was posted without attribution on Quora: https://www.quora.com/unanswered/Why-is-iostream-eof-inside-a-loop-condition-considered-wrong (the Quora question title exactly matches the title this question had before it was edited yesterday). — Keith Thompson, May 06 '19 at 09:13

score 570 · Accepted Answer · edited Mar 15 '16 at 13:57

570

Because iostream::eof will only return true after reading the end of the stream. It does not indicate, that the next read will be the end of the stream.

Consider this (and assume then next read will be at the end of the stream):

while(!inStream.eof()){
  int data;
  // yay, not end of stream yet, now read ...
  inStream >> data;
  // oh crap, now we read the end and *only* now the eof bit will be set (as well as the fail bit)
  // do stuff with (now uninitialized) data
}

Against this:

int data;
while(inStream >> data){
  // when we land here, we can be sure that the read was successful.
  // if it wasn't, the returned stream from operator>> would be converted to false
  // and the loop wouldn't even be entered
  // do stuff with correctly initialized data (hopefully)
}

And on your second question: Because

if(scanf("...",...)!=EOF)

is the same as

if(!(inStream >> data).eof())

and not the same as

if(!inStream.eof())
    inFile >> data

edited Mar 15 '16 at 13:57

Some programmer dude

363,249
31
351
550

answered Apr 09 '11 at 12:58

Xeo

123,374
44
277
381

12

Worth mentioning is that if (!(inStream >> data).eof()) doesn't do anything useful either. Fallacy 1: It'll not enter the condition if there was no whitespace after the last piece of data (last datum will not be processed). Fallacy 2: It will enter the condition even if reading data failed, as long as EOF was not reached (infinite loop, processing the same old data over and over again). – Tronic Jan 20 '13 at 16:20
Slightly off-topic but let me put it. If one uses lazy evaluation, would this approach succeed without any problem? – Dilawar Feb 25 '13 at 03:58
4

I think it's worth pointing out that this answer is slightly misleading. When extracting `int`s or `std::string`s or similar, the EOF bit *is* set when you extract the one right before the end and the extraction hits the end. You do not need to read again. The reason it doesn't get set when reading from files is because there's an extra `\n` at the end. I've covered this in [another answer](http://stackoverflow.com/a/14615673/150634). Reading `char`s is a different matter because it only extracts one at a time and doesn't continue to hit the end. – Joseph Mansfield Apr 06 '13 at 16:59
83

The main problem is that **just because we haven't reached the EOF, doesn't mean the next read will succeed**. – Joseph Mansfield Apr 06 '13 at 17:03
1

@sftrabbit: all true but not very useful... even if there's no trailing '\n' it's reasonable to want other trailing whitespace to be handled consistently with other whitespace throughout the file (i.e. skipped). Further, a subtle consequence of "when you extract the one right before" is that `while (!eof())` won't "work" on `int`s or `std::string`s when the input is totally empty, so even knowing there's no trailing `\n` care is needed. – Tony Delroy Apr 23 '13 at 03:34
2

@TonyD Totally agree. The reason I'm saying it is because I think most people when they read this and similar answers will think that if the stream contains `"Hello"` (no trailing whitespace or `\n`) and a `std::string` is extracted, it will extract the letters from `H` to `o`, stop extracting, and then *not* set the EOF bit. In fact, it would set the EOF bit because it was the EOF that stopped the extraction. Just hoping to clear that up for people. – Joseph Mansfield Apr 23 '13 at 08:23
1

This answer should be wiki'd, not a day goes when someone doesn't post this antipattern. – user657267 Jul 27 '14 at 12:27
1

Also worth mentioning. If there is invalid data on the input the read may cause the stream to go into a bad state. This will prevent further reading (unless explicitly cleared). The test `while(!inStream.eof())` will then go into an infinite loop (as no data read and never reaches eof). – Martin York May 13 '15 at 20:26
1

`// do stuff with (now uninitialized) data` That's no longer true as of C++11, see http://stackoverflow.com/a/13379073/3002139 – Baum mit Augen Aug 21 '16 at 18:20
1

@BaummitAugen It's still left uninitialized if there were no more characters on the stream for example (see my comment on the linked answer). The setting of zero only happens when characters are successfully extracted and then cannot be parsed according to the expected format – M.M Nov 19 '19 at 00:04
Perhaps I was doing something wrong, but on MSVC, if I do `while (std::cin >> word)`, my program gets stuck in an infinite loop. But it's probably better that I just use `std::getline()`. convert the input to a string stream then do `while (myStringStream >> word)`, which doesn't get stuck in an infinite loop. – apokaliptis Dec 17 '19 at 22:13
since C++11 one of your comments is wrong/misleading: `// do stuff with (now uninitialized) data`.Since C++11, if extraction fails then `0` will be written to `data`. It doesn't alter the point of the question, but readers might get confused when they get `0` and expect "garbage" (not saying that this is a meaningful expectation, but such confusion does happen) – 463035818_is_not_a_number Nov 17 '20 at 12:50

sly · Answer 2 · 2016-08-13T19:54:54.867

Bottom-line top: With proper handling of white-space, the following is how eof can be used (and even, be more reliable than fail() for error checking):

while( !(in>>std::ws).eof() ) {  
   int data;
   in >> data;
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}

(Thanks Tony D for the suggestion to highlight the answer. See his comment below for an example to why this is more robust.)

The main argument against using eof() seems to be missing an important subtlety about the role of white space. My proposition is that, checking eof() explicitly is not only not "always wrong" -- which seems to be an overriding opinion in this and similar SO threads --, but with proper handling of white-space, it provides for a cleaner and more reliable error handling, and is the always correct solution (although, not necessarily the tersest).

To summarize what is being suggested as the "proper" termination and read order is the following:

int data;
while(in >> data) {  /* ... */ }

// which is equivalent to 
while( !(in >> data).fail() )  {  /* ... */ }

The failure due to read attempt beyond eof is taken as the termination condition. This means is that there is no easy way to distinguish between a successful stream and one that really fails for reasons other than eof. Take the following streams:

1 2 3 4 5<eof>
1 2 a 3 4 5<eof>
a<eof>

while(in>>data) terminates with a set failbit for all three input. In the first and third, eofbit is also set. So past the loop one needs very ugly extra logic to distinguish a proper input (1st) from improper ones (2nd and 3rd).

Whereas, take the following:

while( !in.eof() ) 
{  
   int data;
   in >> data;
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}

Here, in.fail() verifies that as long as there is something to read, it is the correct one. It's purpose is not a mere while-loop terminator.

So far so good, but what happens if there is trailing space in the stream -- what sounds like the major concern against eof() as terminator?

We don't need to surrender our error handling; just eat up the white-space:

while( !in.eof() ) 
{  
   int data;
   in >> data >> ws; // eat whitespace with std::ws
   if ( in.fail() ) /* handle with break or throw */; 
   // now use data
}

std::ws skips any potential (zero or more) trailing space in the stream while setting the eofbit, and not the failbit. So, in.fail() works as expected, as long as there is at least one data to read. If all-blank streams are also acceptable, then the correct form is:

while( !(in>>ws).eof() ) 
{  
   int data;
   in >> data; 
   if ( in.fail() ) /* handle with break or throw */; 
   /* this will never fire if the eof is reached cleanly */
   // now use data
}

Summary: A properly constructed while(!eof) is not only possible and not wrong, but allows data to be localized within scope, and provides a cleaner separation of error checking from business as usual. That being said, while(!fail) is inarguably a more common and terse idiom, and may be preferred in simple (single data per read type of) scenarios.

"_So past the loop there is no (easy) way to distinguish a proper input from an improper one._" Except that in one case both `eofbit` and `failbit` are set, in the other only `failbit` is set. You only need to test that _once_ after the loop has terminated, not on every iteration; it will only leave the loop once, so you only need to check _why_ it left the loop once. `while (in >> data)` works fine for all blank streams. — Jonathan Wakely, Feb 25 '13 at 14:09
What you are saying (and a point made earlier) is that a bad formatted stream can be identified as `!eof & fail` past loop. There are cases in which one can not rely on this. See above comment (http://goo.gl/9mXYX). Eitherway, I am not proposing `eof`-check as *the-always-better* alternative. I am merely saying, it *is* a possible and (in some cases more appropriate) way of doing this, rather than "most certainly wrong!" as it tends to be claimed around here in SO. — sly, Feb 25 '13 at 15:58
*"As an example, consider how you'd check for errors where the data is a struct with overloaded operator>> reading multiple fields at once"* - a much simpler case supporting your point is `stream >> my_int` where the stream contains e.g. "-": `eofbit` and `failbit` are set. That's worse than the `operator>>` scenario, where the user-provided overload at least has the option of clearing `eofbit` before returning to help support `while (s >> x)` usage. More generally, this answer could use a clean-up - only the final `while( !(in>>ws).eof() )` is generally robust, and it's buried at the end. — Tony Delroy, Feb 25 '15 at 06:09

score 75 · Answer 3 · edited Oct 20 '16 at 17:58

Because if programmers don't write while(stream >> n), they possibly write this:

while(!stream.eof())
{
    stream >> n;
    //some work on n;
}

Here the problem is, you cannot do some work on n without first checking if the stream read was successful, because if it was unsuccessful, your some work on n would produce undesired result.

The whole point is that, eofbit, badbit, or failbit are set after an attempt is made to read from the stream. So if stream >> n fails, then eofbit, badbit, or failbit is set immediately, so its more idiomatic if you write while (stream >> n), because the returned object stream converts to false if there was some failure in reading from the stream and consequently the loop stops. And it converts to true if the read was successful and the loop continues.

Apart from the mentioned "undesired result" with doing work on the undefined value of `n`, the program might also fall into an **infinite loop**, if the failing stream operation doesn't consume any input. — mastov, Apr 27 '18 at 13:33

melpomene · Answer 4 · 2019-08-20T14:09:16.840

The other answers have explained why the logic is wrong in while (!stream.eof()) and how to fix it. I want to focus on something different:

why is checking for eof explicitly using iostream::eof wrong?

In general terms, checking for eof only is wrong because stream extraction (>>) can fail without hitting the end of the file. If you have e.g. int n; cin >> n; and the stream contains hello, then h is not a valid digit, so extraction will fail without reaching the end of the input.

This issue, combined with the general logic error of checking the stream state before attempting to read from it, which means for N input items the loop will run N+1 times, leads to the following symptoms:

If the stream is empty, the loop will run once. >> will fail (there is no input to be read) and all variables that were supposed to be set (by stream >> x) are actually uninitialized. This leads to garbage data being processed, which can manifest as nonsensical results (often huge numbers).

(If your standard library conforms to C++11, things are a bit different now: A failed >> now sets numeric variables to 0 instead of leaving them uninitialized (except for chars).)
If the stream is not empty, the loop will run again after the last valid input. Since in the last iteration all >> operations fail, variables are likely to keep their value from the previous iteration. This can manifest as "the last line is printed twice" or "the last input record is processed twice".

(This should manifest a bit differently since C++11 (see above): Now you get a "phantom record" of zeroes instead of a repeated last line.)
If the stream contains malformed data but you only check for .eof, you end up with an infinite loop. >> will fail to extract any data from the stream, so the loop spins in place without ever reaching the end.

To recap: The solution is to test the success of the >> operation itself, not to use a separate .eof() method: while (stream >> n >> m) { ... }, just as in C you test the success of the scanf call itself: while (scanf("%d%d", &n, &m) == 2) { ... }.

this is the most accurate answer, although as of c++11, i dont believe the variables are uninitialized anymore(the first bullet pt) — csguy, Aug 20 '19 at 13:52

score 0 · Answer 5 · answered Apr 25 '21 at 05:36

The iostream::eof in a loop is considered as wrong because we haven’t reached the EOF. So it does not mean that the next read will succeed.

I'll explain my statement by two sample codes, that will definitely help you understanding the concept in better manners. Let's say, when we want to read a file using file streams in C++. And when we use a loop to write in a file, if we check the end of file using stream.eof(), we are actually checking whether the file has reached end or not.

Example Code

#include<iostream>
#include<fstream>
using namespace std;
int main() {
   ifstream myFile("myfile.txt");
   string x;
   while(!myFile.eof()) {
      myFile >> x;
     // Need to check again if x is valid or eof
     if(x) {
        // Do something with x
     }
   }
}

When we are using the stream directly in a loop, We will not check the condition again.

Example Code

#include<iostream>
#include<fstream>
using namespace std;
int main() {
   ifstream myFile("myfile.txt");
   string x;
   while(myFile >> x) {
      // Do something with x
      // No checks needed!
   }
}

Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?

5 Answers5

Linked

Related