14

I need to read numbers which are listed in a file from bottom up. How can I do that using C?

The file is like:

4.32
5.32
1.234
0.123
9.3
6.56
8.77

For example, I want to read the last three numbers. They have to be float type.

8.77
6.56
9.3

P.S. I actually need a solution that manipulates the file position using fseek, etc.

Peter Mortensen
Erol Guzoğlu

5 Answers

27

It's important to understand that no modern operating system tracks the position of line breaks within a file. (VMS could, and I'm pretty sure so could some IBM mainframe operating systems, but you're probably not using any of those.) So it's not possible to seek directly to a line boundary. There is also no way to read a stream in reverse order; you can only seek backward and then read forward.

Therefore, the simplest way to read the last three numbers in the file, in reverse, is to read the entire file in forward order, keeping the most recently seen three numbers in a buffer. When you hit EOF, just process that buffer backward.
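
As a rough illustration of that first approach, here is a minimal sketch. The function name `last_numbers` and the constant `KEEP` are made up for this example, lines are assumed to be shorter than 128 bytes, and error checking on the conversion is deliberately light.

```c
#include <stdio.h>
#include <stdlib.h>

#define KEEP 3  /* how many trailing numbers to remember (assumption) */

/* Reads every line of fp in forward order and remembers the last KEEP
 * numbers in a small ring buffer. Stores them newest-first in out[]
 * and returns how many were stored (fewer than KEEP for short files). */
static size_t last_numbers(FILE *fp, float out[KEEP])
{
    float ring[KEEP];
    size_t seen = 0;
    char line[128];

    while (fgets(line, sizeof line, fp) != NULL) {
        char *end;
        float f = strtof(line, &end);
        if (end == line)
            continue;               /* skip lines that hold no number */
        ring[seen % KEEP] = f;      /* overwrite the oldest slot */
        seen++;
    }

    size_t count = seen < KEEP ? seen : KEEP;
    for (size_t k = 0; k < count; k++)
        out[k] = ring[(seen - 1 - k) % KEEP];   /* walk the ring backward */
    return count;
}
```

Called on the sample file from the question, this should fill `out` with 8.77, 6.56, 9.3 in that order.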

A more efficient, but significantly more complicated, technique is to guess a position close to but before the last three numbers in the file; seek to that position, then discard characters till you hit a line break; and use the technique in the previous paragraph from that point. If you guessed wrong and the buffer winds up with fewer than three numbers in it, guess again.
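
One possible shape for this guess-and-retry technique is sketched below. The name `last_by_guessing`, the constant `WANT`, and the choice to double the guess on each retry are all assumptions of this example; it also assumes the stream is seekable, opened in binary mode so byte offsets are exact, and that lines are shorter than 128 bytes.

```c
#include <stdio.h>
#include <stdlib.h>

#define WANT 3  /* how many trailing numbers are wanted (assumption) */

/* Seeks `guess` bytes before the end, discards the partial line found
 * there, then scans forward keeping the last WANT numbers. If too few
 * numbers turn up, the guess is doubled and the scan repeated.
 * Results are stored newest-first; the return value is the count. */
static size_t last_by_guessing(FILE *fp, float out[WANT], long guess)
{
    if (guess <= 0 || fseek(fp, 0, SEEK_END) != 0)
        return 0;
    long size = ftell(fp);
    if (size < 0)
        return 0;

    for (;;) {
        long start = guess < size ? size - guess : 0;
        if (fseek(fp, start, SEEK_SET) != 0)
            return 0;
        if (start > 0) {            /* probably mid-line: skip to the next line */
            int c;
            while ((c = fgetc(fp)) != EOF && c != '\n')
                ;
        }

        float ring[WANT];
        size_t seen = 0;
        char line[128];
        while (fgets(line, sizeof line, fp) != NULL) {
            char *end;
            float f = strtof(line, &end);
            if (end != line)
                ring[seen++ % WANT] = f;
        }

        if (seen >= WANT || start == 0) {   /* enough, or whole file scanned */
            size_t count = seen < WANT ? seen : WANT;
            for (size_t k = 0; k < count; k++)
                out[k] = ring[(seen - 1 - k) % WANT];
            return count;
        }
        guess *= 2;                 /* guessed too close to the end: try again */
    }
}
```

If the guessed position happens to land exactly on a line start, this sketch still throws that line away, which costs at most one extra retry.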

And a third approach would be to use fseek (with SEEK_END) and fread to read the last 1024 or so bytes of the file, set a pointer to the end of the block, and parse it backward. This would be quite efficient, but would have even more headache-inducing corner cases to get right than the previous suggestion. (What exactly do you do if the last three lines of the file, collectively, are more than 1024 bytes long?)
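
A minimal sketch of this third approach might look like the following. The name `tail_numbers` and the constant `TAIL_BYTES` are invented for the example. It sidesteps the corner case mentioned above by simply stopping when a line may be cut off at the start of the block, and it assumes the file was opened in binary mode ("rb") so that byte offsets are exact.

```c
#include <stdio.h>
#include <stdlib.h>

#define TAIL_BYTES 1024  /* size of the block read from the end (assumption) */

/* Reads the last TAIL_BYTES bytes of fp and parses that block line by
 * line from the end. Stores up to `want` numbers newest-first in out[]
 * and returns how many were found inside the block. */
static size_t tail_numbers(FILE *fp, float *out, size_t want)
{
    char buf[TAIL_BYTES + 1];

    if (fseek(fp, 0, SEEK_END) != 0)
        return 0;
    long size = ftell(fp);
    if (size < 0)
        return 0;
    long start = size > TAIL_BYTES ? size - TAIL_BYTES : 0;
    if (fseek(fp, start, SEEK_SET) != 0)
        return 0;
    size_t len = fread(buf, 1, (size_t)(size - start), fp);
    buf[len] = '\0';

    size_t got = 0;
    size_t end = len;                   /* one past the line being examined */
    while (got < want && end > 0) {
        if (buf[end - 1] == '\n')
            buf[--end] = '\0';          /* cut off the line terminator */
        size_t begin = end;
        while (begin > 0 && buf[begin - 1] != '\n')
            begin--;                    /* walk back to the start of the line */
        if (begin == 0 && start > 0)
            break;                      /* line may be truncated by the block edge */
        char *stop;
        float f = strtof(buf + begin, &stop);
        if (stop != buf + begin)
            out[got++] = f;             /* blank or non-numeric lines are skipped */
        end = begin;
    }
    return got;
}
```

A caller that gets back fewer numbers than it asked for could retry with a larger block, or fall back to reading the whole file forward.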

FYI, the correct way to read floating-point numbers in C is to use fgets and strtod. DO NOT use either atof or scanf for this; atof doesn't tell you about syntax errors, and scanf triggers undefined behavior on overflow.
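
To make the strtod advice concrete, here is a small sketch of parsing one line (as returned by fgets) into a float. The helper name `parse_float_line` and its return codes are hypothetical; the checks shown are clearing errno before the call, rejecting empty input, rejecting junk after the number, and rejecting values that do not fit in a float.

```c
#include <ctype.h>
#include <errno.h>
#include <float.h>
#include <stdlib.h>

/* Hypothetical helper: parse one text line as a float.
 * Returns 0 on success, -1 on a syntax error, -2 if out of range. */
static int parse_float_line(const char *line, float *out)
{
    char *end;

    errno = 0;                      /* strtod only sets errno on failure */
    double d = strtod(line, &end);
    if (end == line)
        return -1;                  /* nothing that looks like a number */
    while (isspace((unsigned char)*end))
        end++;                      /* allow the trailing newline and blanks */
    if (*end != '\0')
        return -1;                  /* junk after the number */
    if (errno == ERANGE)
        return -2;                  /* overflow or underflow as a double */
    if (d > FLT_MAX || d < -FLT_MAX)
        return -2;                  /* fits in double but not in float */

    *out = (float)d;
    return 0;
}
```

For example, "8.77\n" is accepted, while "1.5x\n" and "abc\n" are reported as syntax errors and "1e999\n" as out of range.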

P.S. If you have the shell utility tac (which is a GNUism), the easiest option of all would be to write your program to process the first three numbers on standard input, and then invoke it as tac < input.file | ./a.out. Skimming the code leads me to believe that tac implements my "third approach", with some additional cleverness.

zwol
  • Would you mind expanding (or give a relevant link) on the undefined behavior of `scanf()`? I tried searching on SO and on Google but couldn't find any. – user12205 Apr 26 '15 at 23:09
  • 3
    @ace My canned rant on the subject is here: https://stackoverflow.com/questions/24302160/scanf-on-an-istream-object/24318630#24318630 (Read all of the comments!) The offending sentence of the standard is C99 7.19.6.2p10: "If [the object that will receive the result of conversion] does not have an appropriate type, *or if the result of the conversion cannot be represented in the object*, the behavior is undefined." Emphasis mine. – zwol Apr 26 '15 at 23:12
  • `tac`? Not `tail -n 3` – kay Apr 26 '15 at 23:14
  • 2
    @kay OP did say they wanted the numbers in reverse order, but your idea is good too (can always reverse it themselves). – zwol Apr 26 '15 at 23:14
6

Well, the obvious way is to read them all, put them into an array and then get the last three.

Peter Mortensen
Joshua Byer
3

C's standard I/O library has no notion of reading a file backwards.

One solution is to read all the numbers and store only the last three read.

// Needs <errno.h>, <stdio.h> and <stdlib.h>.
float numbers[3];
char line[100]; // Make it large enough
int i = 0;
char* end;
for ( ; ; ++i )
{
    i %= 3; // Make it modulo 3.
    if ( fgets(line, sizeof line, stdin) == NULL )
    {
       // No more input.
       break;
    }

    errno = 0; // strtof only sets errno on failure, so clear it first.
    float n = strtof(line, &end);
    if ( line == end )
    {
       // Problem converting the string to a float.
       // Deal with error
       break;
    }

    if ( errno == ERANGE )
    {
       // Problem converting the string to a float within range.
       // Deal with error
       break;
    }

    numbers[i] = n;
}

If there are at least three numbers in the file, then once the loop ends at end of input, the last three numbers, newest first, are numbers[(i+2)%3], numbers[(i+1)%3], and numbers[i].

R Sahu
  • 3
    −1 Never use `scanf` for anything. – zwol Apr 26 '15 at 22:58
  • I am aware of the dangers of using `scanf` with `%s`. Do you have any reason to believe there are dangers in using `scanf` with `%f`? – R Sahu Apr 26 '15 at 23:18
  • Yes: as I said in my answer, input overflow triggers undefined behavior. – zwol Apr 26 '15 at 23:19
  • My canned rant on the subject is here: https://stackoverflow.com/questions/24302160/scanf-on-an-istream-object/24318630#24318630 (Read all of the comments!) The offending sentence of the standard is C99 7.19.6.2p10: "If [the object that will receive the result of conversion] does not have an appropriate type, *or if the result of the conversion cannot be represented in the object*, the behavior is undefined." Emphasis mine. – zwol Apr 26 '15 at 23:25
  • Actually, I need a solution that manipulates the file pointer position using fseek, etc., because I am trying to write a multi-threading example. – Erol Guzoğlu Apr 26 '15 at 23:31
  • @zwol, R Sahu: with a quick Google search I also found this in the C FAQ: [Why does everyone say not to use scanf? What should I use instead?](http://www.c-faq.com/stdio/scanfprobs.html) I believe this is also a good reference. – Grijesh Chauhan Apr 27 '15 at 07:02
  • @ErolGuzoğlu How is this a good problem for a multi-threading example? What are you trying to parallelize here? – BlackJack Apr 27 '15 at 09:45
  • @zwol Thanks for the link. Learned something today. I updated the answer to address the problems of scanf. – R Sahu Apr 27 '15 at 15:11
  • 1
    @GrijeshChauhan, thanks for the link. I updated the answer to address the problems of scanf. – R Sahu Apr 27 '15 at 15:11
  • I will only retract my downvote if you completely remove the `scanf`-using code samples and correct your explanation of how to check for errors from `strtof` (see under EXAMPLES at http://www.openbsd.org/cgi-bin/man.cgi/OpenBSD-current/man3/strtoimax.3?query=strtol -- you are not checking for either overflow or junk on the line after the number). – zwol Apr 27 '15 at 16:33
  • You're still not checking for errors comprehensively enough. You have to set errno to 0 before each call to any `strto*` function, and you have to check for junk on the line after the number. – zwol Apr 27 '15 at 22:04
  • @zwol, I like your perseverance w.r.t. writing robust code. This use case doesn't call for such rigor, not that I can see. I'm going to leave my answer where it is now. – R Sahu Apr 27 '15 at 22:15
1

First, open the file:

FILE* fp = fopen(..., "r");

Then, skip to EOF:

fseek(fp, 0, SEEK_END);

Now, go back X lines:

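int l = X; long ofs = 1;
// Step back one byte at a time (negative offset from SEEK_END), counting newlines.
while (l && fseek(fp, -ofs, SEEK_END) == 0) {
    // Ignore the newline that terminates the last line (ofs == 1).
    if (fgetc(fp) == '\n' && ofs > 1) l--;
    ofs++;
}
if (l) rewind(fp); // Fewer than X newlines: the lines start at the beginning.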

And finally, read X numbers from the current position:

float numbers[X];
for(int p = 0; p < X; p++) fscanf(fp, "%f", &numbers[p]);
rodrigovr
0

I solved my problem with the following code. I read the second half of the file.

  FILE *fp = fopen("sample.txt","r");

  if( fp == NULL )
  {
    perror("Error while opening the file");
    exit(EXIT_FAILURE);
  }

  int size=0;
  int ch; // int, not char, so that EOF can be told apart from real bytes

  //Count lines of file
  while(( ch = fgetc(fp) ) != EOF )
  {
    if (ch=='\n') { size++; }
  }

  int i;
  float value;

  //Move the pointer to the end of the file and calculate the size of the file.
  fseek(fp, 0, SEEK_END);
  long size_of_file = ftell(fp);

  for (i=1; i<=size/2; i++)
  {
    //Set pointer to previous line for each i value.
    //Note: the step of 5 assumes every line is exactly 5 bytes long (e.g. "4.32\n").
    fseek(fp, (size_of_file-1)-i*5, SEEK_SET);
    fscanf(fp, "%f", &value);
  }
Peter Mortensen
Erol Guzoğlu
  • 2
    This reads the _whole file_ to determine the size and seeks and reads the second half. This is a "worst of both worlds" approach and you gain no benefit at all from the additional complexity. I would suggest you revert to the [first approach here](http://stackoverflow.com/a/29884555/2071828) as it will not only be simpler, but also faster. – Boris the Spider Apr 27 '15 at 07:17
  • 1
    One day, you will want to read data from a pipe/socket. Unfortunately, they don't have sizes and can't be seeked in. – Joker_vD Apr 27 '15 at 08:41
  • @Joker_vD, the question is explicit : "I need a solution which is playing with the file pointer position using fseek" – Guillaume Apr 27 '15 at 12:15