38

I have a program with a parent and a child process. Before the fork(), the parent process called malloc() and filled in an array with some data. After the fork(), the child needs that data. I know that I could use a pipe, but the following code appears to work:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main( int argc, char *argv[] ) {
    char *array;
    array = malloc( 20 );
    strcpy( array, "Hello" );
    switch( fork() ) {
    case 0:
        printf( "Child array: %s\n", array );
        strcpy( array, "Goodbye" );
        printf( "Child array: %s\n", array );
        free( array );
        break;
    case -1:
        printf( "Error with fork()\n" );
        break;
    default:
        printf( "Parent array: %s\n", array );
        sleep(1);
        printf( "Parent array: %s\n", array );
        free( array );
    }
    return 0;
}

The output is:

Parent array: Hello
Child array: Hello
Child array: Goodbye
Parent array: Hello

I know that data allocated on the stack is available in the child, but it appears that data allocated on the heap is also available to the child. And similarly, the child cannot modify the parent's data on the stack, the child cannot modify the parent's data on the heap. So I assume the child gets its own copy of both stack and heap data.

Is this always the case in Linux? If so, where the is the documentation that supports this? I checked the fork() man page, but it didn't specifically mention dynamically allocated memory on the heap.

syntagma
  • 20,485
  • 13
  • 66
  • 121
pcd6623
  • 656
  • 2
  • 9
  • 15

4 Answers4

39

Each page that is allocated for the process (be it a virtual memory page that has the stack on it or the heap) is copied for the forked process to be able to access it.

Actually, it is not copied right at the start, it is set to Copy-on-Write, meaning once one of the processes (parent or child) try to modify a page it is copied so that they will not harm one-another, and still have all the data from the point of fork() accessible to them.

For example, the code pages, those the actual executable was mapped to in memory, are usually read-only and thus are reused among all the forked processes - they will not be copied again, since no one writes there, only read, and so copy-on-write will never be needed.

More information is available here and here.

abyx
  • 61,118
  • 16
  • 86
  • 113
  • Just to be clear, there is *no* race condition in the OP's code, correct? – SiegeX Jan 04 '11 at 20:11
  • 3
    There can't be - the processes can not communicate using this memory any more. – abyx Jan 04 '11 at 20:13
  • 1
    Nitpick: consider `mmap(MAP_SHARED)` and `shmat`. Which is outside the scope of C's "stack" and "heap", but you did say "each page"... – ephemient Jan 04 '11 at 22:17
  • @ephemient - Well, there always has to be a nitpick. You are correct, but I didn't want to start documenting kernel-code edge cases :) – abyx Jan 05 '11 at 04:26
  • Thanks! This makes sense to me. As far as the nitpick is concerned, shared memory is a special case since it has to be read and write accessible across multiple processes. In my program, I only need read access, which is why regular malloc works for me, even if I'm reading from a copied page. – pcd6623 Jan 05 '11 at 20:58
8

After a fork the child is completely independent from the parent, but may inherit certain things that are copies of the parent. In the case of the heap, the child will conceptually have a copy of the parents heap at the time of the fork. However, modifications to the head in the child's address space will only modify the child's copy (e.g. through copy-on-write).

As for the documentation: I've noticed that documentation will usually state that everything is copied, except for blah, blah blah.

Noah Watkins
  • 5,046
  • 3
  • 29
  • 42
3

The short answer is 'dirty on write' - the longer answer is .. a lot longer.

But for all intends and purposes - the working model which at C level is safe to assume is that just after the fork() the two processes are absolutely identical -- i.e. the child gets a 100% exact copy -- (but for a wee bit around the return value of fork()) - and then start to diverge as each side modifies its memory, stack and heaps.

So your conclusion is slightly off - child starts off with the same data as parent copied into its own space - then modifies it - and see s it as modified - while the parent continues with its own copy.

In reality things are bit more complex - as it tries to avoid a complete copy by doing something dirty; avoiding to copy until it has to.

Dw.

Dirk-Willem van Gulik
  • 7,005
  • 2
  • 31
  • 39
0

The example doesn't work because the parent is not updated by the child. Here's a solution:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

typedef struct
{
  int id;
  size_t size;
} shm_t;

shm_t *shm_new(size_t size)
{
  shm_t *shm = calloc(1, sizeof *shm);
  shm->size = size;

  if ((shm->id = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | S_IRUSR | S_IWUSR)) < 0)
  {
    perror("shmget");
    free(shm);
    return NULL;
  }

  return shm;
}

void shm_write(shm_t *shm, void *data)
{
  void *shm_data;

  if ((shm_data = shmat(shm->id, NULL, 0)) == (void *) -1)
  {
    perror("write");
    return;
  }

  memcpy(shm_data, data, shm->size);
  shmdt(shm_data);
}

void shm_read(void *data, shm_t *shm)
{
  void *shm_data;

  if ((shm_data = shmat(shm->id, NULL, 0)) == (void *) -1)
  {
    perror("read");
    return;
  }
  memcpy(data, shm_data, shm->size);
  shmdt(shm_data);
}

void shm_del(shm_t *shm)
{
  shmctl(shm->id, IPC_RMID, 0);
  free(shm);
}

int main()
{
  void *array = malloc(20);
  strcpy((char *) array, "Hello");
  printf("parent: %s\n", (char *) array);
  shm_t *shm = shm_new(sizeof array);

  int pid;
  if ((pid = fork()) == 0)
  { /* child */
    strcpy((char *) array, "Goodbye");
    shm_write(shm, array);
    printf("child: %s\n", (char *) array);
    return 0;
  }
  /* Wait for child to return */
  int status;
  while (wait(&status) != pid);
  /* */
  shm_read(array, shm);
  /* Parent is updated by child */
  printf("parent: %s\n", (char *) array);
  shm_del(shm);
  free(array);
  return 0;
}
OS2
  • 373
  • 2
  • 10