Heap Copying from Parent to Child

We recently had a problem in a pre-lab that tried to answer the question of whether dynamically allocated data would be shared between parent and child processes. The actual heart of the question is whether, after a fork, a parent and its child processes would share a single heap. If this were true, any changes one process would make to malloc’ed data would become immediately visible to all other processes.

To verify what happens in Linux, one could put together a program like this:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[]) {

  int status;
  int *p;

  /* Allocate one int on the heap before forking. */
  p = malloc(sizeof(int));
  *p = 123;
  printf("parent - pre-fork: %016p\n", (void*) p);

  if (0 == fork()) {
    /* Child: p holds the same pointer value as in the parent. */
    printf("child: %016p\n", (void*) p);
    printf("child: p = %d\n", *p);
    *p = 4;
    sleep(1);
    printf("child: p = %d\n", *p);
    printf("child terminating\n");
  } else {
    /* Parent: overwrite the value long after the child has stored 4. */
    printf("parent - post-fork: %016p\n", (void*) p);
    printf("parent: p = %d\n", *p);
    sleep(4);
    *p = 0;
    printf("parent: p = %d\n", *p);
    wait(&status);
    printf("parent terminating\n");
  }
  return 0;
}

Running this program, we see output like this:

parent - pre-fork: 0x00000001dfe010
parent - post-fork: 0x00000001dfe010
parent: p = 123
child: 0x00000001dfe010
child: p = 123
child: p = 4
child terminating
parent: p = 0
parent terminating

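From the start of the output, we learn that the child inherited “something” from the parent: it got a heap with the same original content. The open question is whether this heap is a copy or shared memory. The values printed after parent and child change the contents of *p are consistent with disjoint areas of memory: the child stores 4 several seconds before the parent touches the value again, and the two processes go on to print different contents for the same pointer. Note that each process prints only the value it wrote itself, so a stricter test would have the parent read *p after the child’s write and before its own. The conclusion holds either way: the heap is copied at fork, and from then on parent and child have their own separate heaps.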

Now, that “conclusion” gets muddied when we look at the address of the malloc’ed integer, that is, the address stored in pointer p: it is exactly the same for both parent and child. It’s fair to ask how, with identical addresses, the two heaps can behave as different memory locations. One thing we can be sure of: this doesn’t happen by magic. The answer involves copy-on-write semantics and virtual memory, as we see in this StackOverflow post. The addresses we print are virtual addresses, and each process has its own mapping from virtual addresses to physical memory; after fork, the first write to a page gives the writing process a private copy of that page, while the virtual address stays the same. This goes back to what we should understand as the “context” of a running process.

The calls to sleep are a bit overzealous, but they are there to show the observer that the child has a chance to store the value 4 in the supposedly shared memory area well before the parent has a chance to set it to 0.
