K&R answer book - exercise 1.16 - getline function

Question

"Revise the main routine of the longest-line program so it will correctly print the length of arbitrarily long input lines, and as much as possible of the text."

Here is the full code in the K&R answer book for exercise 1.16:

#include <stdio.h>
#define MAXLINE 1000

int getline (char line[], int maxline);
void copy (char to[], char from[]);

main()
{
    int len;
    int max;
    char line[MAXLINE];
    char longest[MAXLINE];

    max = 0;
    while ((len = getline(line, MAXLINE)) > 0 ){
        printf("%d %s", len, line);
        if (len > max){
            max = len;
            copy (longest, line);
        }
    }
    if (max > 0)
        printf("%s", longest);

    getchar();
    return 0;
}

int getline(char s[], int lim)
{
    int c, i, j;

    j = 0;
    for (i = 0; (c = getchar()) != EOF && c != '\n'; ++i)
        if (i < lim-2){
            s[j] = c;
            ++j;
        }
    if (c == '\n'){
        s[j] = c;
        ++j;
        ++i;
    }
    s[j] = '\0';
    return i;
}

void copy (char to[], char from[])
{
    int i;

    i = 0;
    while ((to[i] = from[i]) != '\0')
        ++i;
}

Looking at the getline function, what is the purpose of having lim-2 instead of lim-1 in the following line:

if (i < lim-2){

This doesn't seem to serve any purpose at all. Shouldn't the '\0' character marker occur at s[lim] and not s[lim-1] or s[lim-2]?

Also, the function skips whatever the latest character was, if it goes over the character limit, until it finds a newline character, and it adds the newline character to the char string, but skips whatever the next "fittable" character coming from the input stream is in the process. What is the purpose of this?

I'm not really sure what the author was intending here, and the book offers no explanation.

edit: Using an array such as line[5], I was under the impression that the null character ('\0') in a character array goes in line[5], and we get to put something in the array subscripts 0-4. Is this assumption false?

Array lengths are specified as '`limit` units (bytes) in total'. Indexes go from `array[0]` to `array[limit-1]`. In this context, `s[lim-1]` needs to hold the `'\0'` byte. Note that POSIX now has a [`getline()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html) function with a completely different prototype from the `getline()` in K&R. — Jonathan Leffler, Dec 28 '13 at 17:20
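To make the indexing in the comment above concrete, here is a minimal sketch. The helper name `store_truncated` is hypothetical (it is not from the book); it only shows that a buffer of `size` bytes has valid indexes 0 through `size-1`, so the terminator can never go at `buf[size]`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy as much of `src` as fits into `buf`, which has
   `size` bytes in total, and return the index where '\0' was written. */
size_t store_truncated(char buf[], size_t size, const char *src)
{
    size_t i;

    assert(size >= 1);                  /* need room for the terminator */
    for (i = 0; i < size - 1 && src[i] != '\0'; ++i)
        buf[i] = src[i];                /* data occupies buf[0] .. buf[size-2] */
    buf[i] = '\0';                      /* at most buf[size-1], never buf[size] */
    return i;
}
```

With `char line[5]`, calling `store_truncated(line, 5, "hello world")` stores four characters in `line[0]` through `line[3]` and puts the terminator in `line[4]`.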

Giuseppe Pes · Answer 1 · 2013-12-29T10:40:21.957

This doesn't seem to serve any purpose at all. Shouldn't the '\0' character marker occur at s[lim] and not s[lim-1] or s[lim-2]?

No. lim is the size of the buffer, so the last index of the array is lim-1. The code uses lim-2 because it reserves space for both the end-of-string character ('\0') and the newline ('\n').
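The layout described above can be checked with a small sketch. The function `getline_from` is a hypothetical variant of the answer book's getline that reads from a string instead of stdin (the end of the string stands in for EOF), purely so the buffer contents are easy to inspect; the loop logic is otherwise the same.

```c
#include <assert.h>

/* Hypothetical helper: same logic as the answer book's getline, but it
   reads from the string `src` instead of stdin. Returns the full line
   length, while storing at most lim-1 characters plus the terminator. */
int getline_from(const char *src, char s[], int lim)
{
    int c, i, j;

    assert(lim >= 2);           /* room for '\n' and '\0' */
    j = 0;
    for (i = 0; (c = src[i]) != '\0' && c != '\n'; ++i)
        if (i < lim - 2) {      /* data goes in s[0] .. s[lim-3] */
            s[j] = c;
            ++j;
        }
    if (c == '\n') {            /* newline lands at s[lim-2] at the latest */
        s[j] = c;
        ++j;
        ++i;
    }
    s[j] = '\0';                /* terminator lands at s[lim-1] at the latest */
    return i;
}
```

For a 6-byte buffer and the input "abcdefgh\n", this returns 9 and leaves "abcd\n" in the buffer: four data characters, the newline in `s[4]` (which is `s[lim-2]`), and the terminator in `s[5]` (which is `s[lim-1]`).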

Also, the function skips whatever the latest character was, if it goes over the character limit, until it finds a newline character, and it adds the newline character to the char string, but skips whatever the next "fittable" character coming from the input stream is in the process. What is the purpose of this?

The getline function stores only the first chunk of the input line (at most MAXLINE-1 characters), and that chunk is what gets copied into the longest buffer. Since the longest buffer has a fixed size, the length of the string that can be printed is bounded by MAXLINE (i.e. only the first chunk). Therefore, the size of the longest buffer is the real limit on what you can print, even if a longer string is given as input.

The exercise consists of making the longest buffer dynamic so that the application is able to read and print an arbitrarily long string. You must use dynamic memory because you don't know the size of the input string. A possible solution is to save each chunk into a temporary buffer and, once all the chunks are in memory, compute the size of the input string and copy all the chunks into a new buffer of that size.
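One way the dynamic-buffer idea could look is sketched below. This is an assumption-laden illustration, not the answer book's solution: the name `read_line_dynamic` is hypothetical, it grows a single buffer with `realloc` rather than collecting separate chunks, and `malloc`/`realloc` are not tools the book has introduced by chapter 1.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line of any length from `fp` into a heap
   buffer that doubles when full. Returns NULL at EOF with nothing read or
   on allocation failure; the caller frees the result. */
char *read_line_dynamic(FILE *fp, size_t *len)
{
    size_t cap = 16, n = 0;
    char *buf, *tmp;
    int c;

    assert(fp != NULL && len != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    while ((c = getc(fp)) != EOF) {
        if (n + 1 >= cap) {             /* keep one byte free for '\0' */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)c;
        if (c == '\n')                  /* the newline is kept, as in getline */
            break;
    }
    if (n == 0) {                       /* EOF before any character */
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *len = n;
    return buf;
}
```

A caller would loop on `read_line_dynamic(stdin, &len)`, keep the longest result, and `free` the others. Doubling the capacity keeps the number of reallocations small for long lines.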

I'm sorry, this is kind of late, but what do you mean when you say that "The exercise consists of making the longest buffer dynamic so that the application is able to read and print an arbitrarily long string"? I'm guessing that if the buffer were dynamic, i.e. not fixed, I wouldn't have to worry about buffer size, but that's not what the exercise is doing. You also said: "You must use dynamic memory because you don't know the size of the input string." Are you saying that my options are limited given the tools I have to work with so far, and that I need dynamic memory allocation? — Spellbinder2050, Jan 02 '14 at 14:24

score 0 · Answer 2 · edited Jun 20 '20 at 09:12

Yeah, I just happened to read the same page and was also confused. I think it's only buggy when both of the following hold:

1. input line exceeds `lim`;

2. input line ends directly with EOF without new-line character `\n`

In that case it saves meaningful characters only up to `s[lim-3]`, puts `\0` in `s[lim-2]`, and leaves `s[lim-1]` unused.
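The arithmetic behind that claim can be written out as a tiny sketch. The helper `terminator_index` is hypothetical; it just computes where the answer book's getline would write `\0`, assuming `text_len` counts the characters before the newline or EOF.

```c
#include <assert.h>

/* Hypothetical helper: index at which the answer book's getline writes '\0'.
   `text_len` counts the characters before the newline or EOF, `lim` is the
   buffer size, `has_newline` is 1 if the line ended in '\n' and 0 for EOF. */
int terminator_index(int text_len, int lim, int has_newline)
{
    int stored;

    assert(text_len >= 0 && lim >= 2);
    /* The i < lim-2 test stores at most lim-2 data characters. */
    stored = text_len < lim - 2 ? text_len : lim - 2;
    /* A trailing newline takes one more slot before the terminator. */
    return stored + (has_newline ? 1 : 0);
}
```

With `lim` equal to 10 and a 100-character line, a line ending in a newline puts the terminator at index 9 (`lim-1`), while a line cut off by EOF puts it at index 8 (`lim-2`), leaving the last byte unused.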

However, in practice (I'm using OS X), EOF can only be triggered by typing Control-D at the beginning of a line, meaning that all lines end with \n (especially the last line). So whether or not the line exceeds lim, the last two characters saved in s[] are always \n and \0.

When the input comes from a file rather than from typing at a terminal, see "Why should text files end with a newline?", which I also found today; I think the two questions are closely related.

Hope my answer helps! :)
