
In the POSIX API, read() returns 0 to indicate that the end-of-file has been reached. Why isn't there a separate function that tells you that read() would return zero -- without requiring you to actually call read()?


Reason for asking: Since you have to call read() to discover that it will return zero, file-reading algorithms become more complicated and perhaps slightly less efficient, because they have to allocate a target buffer that may turn out not to be needed.

What we might like to do...

while ( !eof )
   {
   allocate buffer
   read to buffer
   process buffer
   }

What we have to do instead...

while ( true )
   {
   allocate buffer
   read to buffer
   if ( eof ) release buffer, break;
   process buffer
   }
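
For concreteness, the second loop might look like this in C. This is only a minimal sketch: `count_bytes` is a hypothetical helper, the "process buffer" step is reduced to counting bytes, and the buffer is allocated once outside the loop (an assumption that the buffer can be reused between iterations).

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: reads fd until read() reports end-of-file.
   Returns the number of bytes seen, or -1 on a read error. */
long long count_bytes(int fd)
{
    char buf[4096];              /* allocated once, reused for every read() */
    long long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;          /* "process buffer": here we only count */
        } else if (n == 0) {
            return total;        /* end-of-file: this read() got nothing */
        } else if (errno != EINTR) {
            return -1;           /* real error; on EINTR we simply retry */
        }
    }
}
```

With a reusable buffer, the cost of discovering end-of-file is one extra read() call rather than a wasted allocation.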

Additionally, this behavior seems to propagate into higher-level APIs such as fread() and feof() in C, where it creates a lot of confusion about how to use feof() correctly.
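
To illustrate the usual advice for the stdio case, here is a sketch that loops on the return value of fread() and consults ferror() only after the loop ends. `sum_stream` is a hypothetical helper invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: adds up the byte values in a stream.
   Returns the sum, or -1 if the stream reported a read error. */
long sum_stream(FILE *fp)
{
    unsigned char buf[256];
    long sum = 0;
    size_t n;

    /* Let the result of fread() control the loop, not feof():
       feof() only becomes true after a read has already hit the end. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++)
            sum += buf[i];
    }

    /* After the loop, ferror()/feof() explain why fread() returned 0. */
    return ferror(fp) ? -1 : sum;
}
```

Writing `while (!feof(fp))` instead would run the loop body one extra time, because the end-of-file indicator is set by the failed read, not before it.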

Brent Bradburn
  • The most interesting answer that I have seen (to a similar question): http://stackoverflow.com/a/5605161/86967. Need to think about whether or not similar logic is applicable to the POSIX API. – Brent Bradburn Nov 20 '14 at 19:10
  • The case where the input stream size is an exact multiple of the buffer size is generally rare. Therefore, end-of-stream isn't particularly helpful under typical scenarios. I don't recall exactly, but at the time I asked this question, I may have been working with a case in which the stream was made up of packets/chunks of a very specific size (which was also large enough to be significant). – Brent Bradburn Sep 06 '18 at 14:54

1 Answer


To gain perspective on why this might be the case, understand that end-of-stream is not inherently a permanent situation. A file's read pointer could be at the end, but if more data is subsequently appended by a write operation, then subsequent reads will succeed.
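
A small sketch of that situation, assuming a regular file on a POSIX system. `eof_is_transient` is a hypothetical function, and the caller supplies a path for a scratch file that the function creates and then removes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if read() returned 0 at end-of-file and then
   succeeded on the same descriptor after more data was written; 0 otherwise.
   The file at `path` is created, truncated, and removed again. */
int eof_is_transient(const char *path)
{
    char buf[16];
    int ok = 0;

    int wfd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (wfd < 0)
        return 0;

    int rfd = open(path, O_RDONLY);                   /* separate read offset */
    if (rfd >= 0) {
        ssize_t first  = read(rfd, buf, sizeof buf);  /* empty file: returns 0 */
        ssize_t wrote  = write(wfd, "more", 4);       /* data arrives later */
        ssize_t second = read(rfd, buf, sizeof buf);  /* same descriptor: returns 4 */
        ok = (first == 0 && wrote == 4 && second == 4);
        close(rfd);
    }

    close(wfd);
    unlink(path);
    return ok;
}
```

The first read() reports end-of-file, yet the second one on the same descriptor returns data, so a zero return describes only the moment of the call.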

Example: On Linux, when reading from a terminal, typing ^D at the start of a line (for example, right after a newline) causes read() to return zero, indicating "end of file". However, if the program does not exit, it can keep reading and will receive any lines typed afterward.

Since end-of-stream is not a permanent situation, perhaps it makes sense to not even have an is_at_end() function (POSIX does not). Unfortunately, this does put some additional burden on the programmer (and/or a wrapper library) to elegantly and efficiently deal with this complexity.
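
As one possible shape for such a wrapper, here is a sketch that keeps a single byte of lookahead. All names (`peek_reader`, `peek_at_end`, `peek_read`) are hypothetical and not part of POSIX. The answer from `peek_at_end` is only a snapshot, and on a pipe or terminal the call blocks until data or end-of-file arrives.

```c
#include <unistd.h>

/* Hypothetical wrapper state: a descriptor plus one byte of lookahead. */
struct peek_reader {
    int fd;
    int has_peek;            /* 1 if `peek` holds a byte not yet handed out */
    unsigned char peek;
};

/* Returns 1 if a read() right now hits end-of-file, 0 if data is
   available, -1 on error. A result of 1 may stop being true later. */
int peek_at_end(struct peek_reader *r)
{
    if (r->has_peek)
        return 0;
    ssize_t n = read(r->fd, &r->peek, 1);
    if (n < 0)
        return -1;
    r->has_peek = (n == 1);
    return n == 0;
}

/* Like read(), but hands out the lookahead byte first. Returning a
   single byte is a legal short read and avoids blocking for more. */
ssize_t peek_read(struct peek_reader *r, void *buf, size_t len)
{
    if (len == 0)
        return 0;
    if (r->has_peek) {
        *(unsigned char *)buf = r->peek;
        r->has_peek = 0;
        return 1;
    }
    return read(r->fd, buf, len);
}
```

The trade-off is an extra one-byte read() for each check, which is part of the burden described above: the wrapper pays with a system call for what the API does not report directly.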

Brent Bradburn