Two children of the same parent do not communicate through a pipe if the parent does not call wait()

Question

Please see the code below:

#include<stdio.h>

main(){
        int pid, fds[2], pid1;
        char buf[200];
        pipe(fds);
        pid = fork();

        if(pid==0)
        {
                close(fds[0]);
                scanf("%s", &buf);
                write(fds[1], buf, sizeof(buf)+1);
        }
        else
        {
                pid1 = fork();

                if(pid1==0)
                {
                        close(fds[1]);
                        read(fds[0], buf, sizeof(buf)+1);
                        printf("%s\n", buf);
                }
                else
                {
       Line1:              wait();
                }
        }
}

If I do not comment out Line1, it works fine. Please see below:

hduser@pc4:~/codes/c/os$ ./a.out
hello //*Entry from keyboard*
hello //Output
hduser@pc4:~/codes/c/os$

But if I comment out Line1, two child processes are not communicating:

hduser@pc4:~/codes/c/os$ ./a.out
hduser@pc4:~/codes/c/os$ 
hi //*Entry from keyboard*
hi: command not found
hduser@pc4:~/codes/c/os$

I cannot understand the significance of wait() here.

`read(fds[0], buf, sizeof(buf)+1);` is undefined behavior. It reads 201 bytes into a 200-byte array. — Andrew Henle, Sep 08 '15 at 14:17
The first step in debugging would be to test every system call to see what's failing. Write to stderr. I'd write progress reports too. You should write in semi-modern C (C99 at least); that requires `` and a proper declaration for [`main()`](http://stackoverflow.com/questions/204476/what-should-main-return-in-c-and-c/18721336#18721336). — Jonathan Leffler, Sep 08 '15 at 14:23
to start, a 'pid' is defined as a 'pid_t' which is defined in the `unistd.h` header file, which the posted code is missing. — user3629249, Sep 08 '15 at 17:48
these two lines: `scanf("%s", &buf); write(fds[1], buf, sizeof(buf)+1);` have a few problems: 1) always check the value returned by scanf() (not the parameter value) to ensure the operation was successful. 2) when using the %s conversion, always include a maximum field width (sizeof(buf)-1) so the user cannot overflow the buffer. 3) the number of bytes to send is never longer than the buffer (and is usually shorter); suggest: `write(fds[1], buf, strlen(buf)+1);` — user3629249, Sep 08 '15 at 17:54
the posted code has a significant logic problem. the `fork()` function can return 3 kinds of values: -1 when an error occurs, 0 when executing in the child and some positive number when executing in the parent. The posted code fails to check for the error condition on both the calls to fork() — user3629249, Sep 08 '15 at 17:56
there are two valid ways to declare the main() function (and one optional way) All three ways have a 'int' return type. — user3629249, Sep 08 '15 at 17:59
The posted code needs to `wait()` for each child before exiting. suggest using `waitpid()` rather than wait() with one call to waitpid() for each child. — user3629249, Sep 08 '15 at 18:01
regarding this line: `read(fds[0], buf, sizeof(buf)+1);` read() does not append a NUL byte to the incoming message, so you need to allow room for one (and append it in the code so printf can output the message via %s). The length passed to read() therefore needs to be 1 LESS than sizeof(buf) rather than 1 greater. Appending the NUL byte in the correct location means saving the value returned by read() (you should also check that value for <= 0), then `buf[returnedValue] = '\0';` — user3629249, Sep 08 '15 at 18:06
reading (and understanding) the man page for each system function that you use is mandatory. The main/parent function should close both ends of the pipe after the two calls to fork(). Each child process, before exiting, should close the end of the pipe that they have not already closed — user3629249, Sep 08 '15 at 18:11

Ghazanfar · Answer 1 · 2015-09-08T15:05:08.880

What's happening here is that the parent process finishes executing before the child processes do, which causes the children to lose access to the terminal.

Let us have a closer look at all this.

What does wait() do ?

The wait() system call suspends execution of the calling process until one of its children terminates.

Your program is like this

Your main Process forks 2 child processes. The first one writes to a pipe while the other one reads from a pipe. All this happens while the main process continues to execute.

What happens when the main process has executed its code? It terminates. When it terminates, it gives up its control of the terminal, which causes the children to lose access to the terminal.

This explains why you get command not found -- what you have typed is not on the stdin of your program but on the shell prompt itself.

There were a couple of other issues with your code too,

1) In this part of your code,

            scanf("%s", &buf);

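This is wrong, although it usually appears to work: &buf has type char (*)[200] rather than the char * that %s expects, so the call is a type mismatch even though both expressions refer to the same address. Since buf already converts to a pointer to its first element, this should have been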

            scanf("%s", buf);

2) Notice this,

            read(fds[0], buf, sizeof(buf)+1);

This is undefined behavior, as was pointed out in the comments section. You are asking read() to store up to 201 bytes in a 200-byte buffer. This should have been,

            read(fds[0], buf, sizeof(buf));

3) Calling wait(). You have created two child processes, you should wait for both of them to finish, so you should call wait() twice.

Jonathan Leffler · Answer 2 · 2015-09-08T15:04:22.487

After fixing some infelicities in the code, I came up with a semi-instrumented version of your program like this:

#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    int pid, fds[2], pid1;
    char buf[200];
    pipe(fds);
    pid = fork();

    if (pid == 0)
    {
        close(fds[0]);
        printf("Prompt: "); fflush(0);
        if (scanf("%199s", buf) != 1)
            fprintf(stderr, "scanf() failed\n");
        else
            write(fds[1], buf, strlen(buf) + 1);
    }
    else
    {
        pid1 = fork();

        if (pid1 == 0)
        {
            close(fds[1]);
            if (read(fds[0], buf, sizeof(buf)) > 0)
                printf("%s\n", buf);
            else
                fprintf(stderr, "read() failed\n");
        }
        else
        {
/*Line1:              wait();*/
        }
    }
    return 0;
}

That compiles cleanly under stringent options (GCC 5.1.0 on Mac OS X 10.10.5):

gcc -O3 -g -std=c11 -Wall -Wextra -Werror p11.c -o p11

When I run it, the output is:

$ ./p11
Prompt: scanf() failed
read() failed
$

The problem is clear; the scanf() fails. At issue: why?

The wait() version needs an extra header #include <sys/wait.h> and the correct calling sequence. I used the paragraph:

        else
        {
            printf("Kids are %d and %d\n", pid, pid1);
            int status;
            int corpse = wait(&status);
            printf("Parent gets PID %d status 0x%.4X\n", corpse, status);
        }

When compiled and run, the output is now:

$ ./p11
Kids are 20461 and 20462
Prompt: Albatross
Albatross
Parent gets PID 20461 status 0x0000
$

So, the question becomes: how or why is the standard input of the child process closed when the parent doesn't wait? It is Bash doing some job control that wreaks havoc.

I upgraded the program once more, using int main(int argc, char **argv) and testing whether the command was passed any arguments:

        else if (argc > 1 && argv != 0) // Avoid compilation warning for unused argv
        {
            printf("Kids are %d and %d\n", pid, pid1);
            int status;
            int corpse = wait(&status);
            printf("Parent gets PID %d status 0x%.4X\n", corpse, status);
        }

I've got an Heirloom Shell, which is close to an original Bourne shell. I ran the program under that, and it behaved as I would expect:

$ ./p11
Prompt: $ Albatross
Albatross

$ ./p11 1
Kids are 20483 and 20484
Prompt: Albatross
Albatross
Parent gets PID 20483 status 0x0000
$

Note the $ after the Prompt: in the first run; that's the shell prompt, but when I type Albatross, it is (fortunately) read by the child of the p11 process. That's not guaranteed; it could have been the shell that read the input. In the second run, we get to see the parent's output, then the children at work, then the parent's exit message.

So, under classic shells, your code would work as expected. Bash is somehow interfering with the normal operation of child processes. Korn shell behaves like Bash. So does C shell (tcsh). Attempting dash, I got interesting behaviour (3 runs):

$ ./p11
Prompt: $ Albatross
scanf() failed
read() failed
dash: 2: Albatross: not found
$ ./p11
Prompt: $ Albatross
scanf() failed
dash: 4: Albatross: not found
$ read() failed

$ ./p11
Prompt: scanf() failed
$ read() failed

$

Note that the first two runs show dash reading the input, but the children did not detect problems until after I hit return after typing Albatross. The last time, the children detected problems before I typed anything.

And, back with Bash, redirecting standard input works 'sanely':

$ ./p11 <<< Albatross
Prompt: Albatross
$ ./p11 1 <<< Albatross
Kids are 20555 and 20556
Prompt: Albatross
Parent gets PID 20555 status 0x0000
$

The output Albatross comes from the second child, of course.

The answer is going to be lurking somewhere in behaviour of job control shells, but it's enough to make me want to go back to life before that.
