Elliot Wasem - Software Engineer

TITLE

Multiprocessing and Basic Process Management in C

DESCRIPTION

A working knowledge of the C programming language is very helpful before reading this. It also helps to have read through the man pages for fork(2) and wait(2).

A process is a running set of instructions that you might commonly call a "program". However, a program (loosely defined as a piece of software created to solve a problem) can consist of many processes, each of which must be uniquely identified. To achieve this, every process, whether it is the only process in a program or one among many, has a process identifier (PID).

Each process consists of its own memory space, which means it has its own stack, heap, and all other segments of memory. One process does not have access to another process's address space. This is important for what comes later.

When you run a program, the operating system creates a process and starts your program at the main() function. Now, if the operating system can create new processes, why can't you? Well, it turns out you can! This can be achieved with the system call fork(). fork() takes no parameters, and its functionality is simple: it creates a copy of the process from which it was called. This includes the stack, the heap, and any additional associated data. Modern implementations of fork() don't eagerly create a full copy, but rather use a technique called copy-on-write; that is beyond the scope of this document. For all intents and purposes, you can consider the new process as having its own complete copy of the address space.

When a process is successfully created, the new process is called the child process, and the original process is called the parent process.

A call to pid_t pid = fork(); can have the following outcomes:

pid < 0: An error has occurred, and no process has been created.
pid >= 0: A process has been created.

    If pid > 0, then the current process is the parent process, and the PID of the child process is stored in pid.
    If pid == 0, then the current process is the child process.

The following snippet of code describes the behavior:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();

    if (pid < 0) {
        fprintf(stderr, "An error occurred!\n");
        return EXIT_FAILURE;
    } else if (pid > 0) {
        printf("This is the parent process, and the child process's PID is %d.\n", (int)pid);
    } else {
        // pid == 0
        printf("This is the child process.\n");
    }

    return EXIT_SUCCESS;
}

If the above code is run, either it will print "An error occurred!", or it will print one of the following:

This is the parent process, and the child process's PID is <child PID>.
This is the child process.

This is the child process.
This is the parent process, and the child process's PID is <child PID>.

Huh? Why could it be either order? That's because, once the child process is created, there is no guarantee which will run first, the parent or the child. Don't believe me? Try it! Compile it to some program. Say the program is called fork_test. Execute the following in Bash:

for _ in {1..100}; do
    echo "---------------------------------------------";
    ./fork_test;
done

Chances are, you will see each of the two orders at least once, and if nothing goes wrong, you will never see the error message.

Now, there is one thing that remains shared between the parent and the child, and that is open file descriptors. If the parent opens a file before calling fork(), the child has access to the same open file after the call, including the same file offset. This can lead to some fun synchronization issues, and must be taken into account. What happens if the parent and child processes both read from the file at the same time? Write yourself a test program to see what happens.

But wait, there's more! There's a bit of a problem in the above program snippet: the parent never finds out what became of the child. When a parent process exits, its child processes are not killed; they are orphaned, adopted by another process (traditionally init, PID 1), and keep running. So, if the parent prints its piece and exits before the child, the child may print its piece after the shell has already returned to its prompt, and the parent never learns the child's exit status. To avoid this, we can have the parent process wait() on the child process to complete. We can do this as follows. Much of the code is duplicated from above:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();

    if (pid < 0) {
        fprintf(stderr, "An error occurred!\n");
        return EXIT_FAILURE;
    } else if (pid > 0) {
        printf("This is the parent process, and the child process's PID is %d.\n", (int)pid);
    } else {
        // pid == 0
        printf("This is the child process.\n");
        return EXIT_SUCCESS;
    }

    // Only the parent reaches this point.
    int status;
    if (wait(&status) < 0) {
        fprintf(stderr, "Failed to wait on the child process.\n");
        return EXIT_FAILURE;
    }

    // The child always exits normally here, so WEXITSTATUS is meaningful.
    printf("Child process exited with status code %d.\n", WEXITSTATUS(status));

    return EXIT_SUCCESS;
}

The following changes were made:

The child process explicitly exits with a successful status after it prints its piece.
The parent process continues execution outside of the if-statement block.
The parent waits on the child process to return, capturing its exit status.
The parent prints the exit status of the child.

There are a few more pieces here. wait() takes a pointer to an integer, which it fills with the status of the child process that exited. We then must use the macro WEXITSTATUS(status) to extract the actual numeric exit code from that integer. The result is meaningful only if the child exited normally, which WIFEXITED(status) reports. Furthermore, if the call to wait() fails (for example, because the process has no children), it returns -1. Otherwise, the parent process blocks until there is a child process to collect. Blocking in this case means that it will pause execution and, well, wait.

There are numerous ways in which a software developer might want to use fork() and wait(), and the only limits on these uses are your imagination and creativity!

AUTHOR

Elliot Wasem, <elliotbielwasem@gmail.com>