CodingBison - POSIX Threads: Pthread Basics

POSIX threads (Pthreads) is a widely used standard for writing multi-threaded programs. Although technically not a part of the C language, POSIX threads are commonly used in both C and C++ applications. Hence, we include a discussion on Pthreads.

An operating system process can have multiple threads. Multiple threads allow the process to do tasks concurrently. Even better, if there are multiple processors, then these threads can run on different processors and thereby reach (almost) true parallelism. Hence, an application with tasks that can be done independently, or with little (bounded) dependency on each other, can greatly benefit from using threads.

Let us consider an example of a web-server that handles multiple HTTP requests. The web-server can spawn multiple threads and each thread can handle a single request. At the same time, another thread can listen for incoming HTTP requests. Thus, threads can enable a web-server to not only listen for incoming requests, but also to process existing requests -- and all of these happen concurrently.

Threads belonging to a process share the address space, file descriptors, etc. of that process with each other. However, each thread does maintain its own stack so that it can execute common lines of code independently of other threads. Since threads do not have to maintain their own address space, they are also referred to as lightweight processes!

One naive approach may be to run concurrent tasks as multiple processes. But switching between different processes (context switching) has a lot more overhead than switching between threads within one process. No less important, communication between different processes would require some form of inter-process communication (IPC), which again has a lot more overhead, not to mention more complexity. By comparison, communication between threads is simpler since they share the address space and file descriptors of the process.

Of course, when we have multiple threads working together, there is always a risk of them overwriting a common memory location (or critical data). Therefore, isolating the critical data in memory and making threads access it in a synchronized manner goes hand in hand with creating threads. Pthreads offer mutexes -- short for mutual exclusion -- to provide synchronization for multiple threads. Lastly, threads also need to communicate with each other. To facilitate inter-thread communication, Pthreads provide condition variables (or condvars).

Getting started

Pthreads provide a new variable type for threads: pthread_t. Thus, if we have to define a Pthread variable, x, then we can do it as "pthread_t x". On many platforms, pthread_t is simply an integer that stores an identifier value of the thread, but on others it can be a pointer or a structure. More importantly, pthread_t is an opaque value, so we should not make any assumption about its internals, else the resulting code may cease to be portable.

Let us get started by providing the basic APIs that create threads and allow the parent threads to wait for them till they are done executing their tasks. Here are the signatures of these APIs:

 int pthread_create(pthread_t *t, 
                    const pthread_attr_t *attr, 
                    void *(*target_func)(void *), 
                    void *arg);
 int pthread_join(pthread_t t, void **value_ptr);

The pthread_create() API is truly the mother of all Pthread APIs, since it is this API that brings a thread into existence! More specifically, pthread_create() creates a new (child) thread and puts the thread identifier into the first parameter, which is a pointer to a pthread_t variable. The function also takes three additional parameters. The attr parameter allows us to specify attributes for the new thread -- if we choose default attributes, then we can keep this parameter as NULL. The target_func parameter specifies a target function and once created, the newly minted thread's job will be to run this function.

The last parameter of pthread_create() takes a pointer to a value that can be passed as an argument to the target function. This works easily if we need to pass a single value to the target_func -- basically, pass a pointer to the value (aka variable) and, of course, cast it as a void pointer. However, when we have more than one value to pass to the target_func, this approach does not work directly. In that case, one solution is to define a data structure, use it as a container to hold multiple values, and then pass the (void-casted) pointer to that data structure.

Function pthread_create() returns 0 if it is successful, and a non-zero error code if not. It is a good idea to always check the return value and, if an error happens, to print the returned value for debugging purposes -- multi-threaded programs are notoriously difficult to debug, so we should go the extra mile to get all the error information logged.

While the child thread works on its task, the caller thread can (and often should) wait for the child thread to complete it. In most cases, completing the task means returning from the target function. POSIX provides the pthread_join() API that suspends the caller thread till the child thread is done; when the child thread returns from the target function, it is put in a terminated state. If the main thread returns from main() without waiting for the (child) threads, then the process exits and the child threads (along with other resources of the process) are destroyed immediately -- so, in many cases, the polite thing would be for the caller thread to simply wait.

The second API in the above set, pthread_join(), takes the pthread_t identifier of the thread for which we wish to wait. It returns 0 if the thread is done with its task and the join succeeds. If not, then it returns an error. Once again, the good thing to do is to report the status if it is not zero. One common reason why pthread_join() can fail is that the child thread is no longer joinable. This error can happen if the child thread detaches itself (we will talk about detaching threads in a moment).

Function pthread_join() also takes a second argument, value_ptr. When the child thread terminates by calling pthread_exit(), the value it passes to pthread_exit() gets stored in the location that value_ptr points to; returning a value from the target function has the same effect. This is one way to provide communication between threads, albeit a rather limited one.

Lastly, only one thread should wait for the termination of a given child thread. If multiple threads were to call pthread_join() for the same thread, then the behavior is undefined and can lead to an error.

Let us now get our feet wet and write our first complete threaded program:

 #include <stdio.h>
 #include <pthread.h>
 #include <time.h>
 #include <unistd.h>    /* for sleep() */

 char *arrPaintings[] = {"The Last Supper", "Mona Lisa", "Potato Eaters",
                         "Cypresses", "Starry Night", "Water Lilies"};

 void *selectPainting (void *arg) {
     int index = *(int *)arg;

     printf("\tPassed index is %d\n", index);
     printf("\tGoing to sleep..\n");
     sleep(10);
     printf("\tWoke up\n");
     printf("\tPainting is %s\n", arrPaintings[index]);
     return NULL;    /* the target function must return a void pointer */
 }

 int main () {
     pthread_t t;
     int status;
     int index = 2;

     printf("Starting the child thread..\n");
     status = pthread_create(&t, NULL, selectPainting, (void*)&index);
     if (status != 0) {
         fprintf(stderr, "pthread_create() failed [status: %d]\n", status);
         return 1;
     }

     printf("Waiting for the child thread..\n");
     status = pthread_join(t, NULL);
     if (status != 0) {
         fprintf(stderr, "pthread_join() failed [status: %d]\n", status);
     }
     printf("Child thread is done\n");
     return 0;
 }

Let us now understand the above program. The program begins by including the <pthread.h> header file (besides the standard headers such as <stdio.h>) since <pthread.h> contains definitions of Pthread APIs and the related constants. This header file also includes definitions for Pthread mutexes and condition variables.

The main() function calls pthread_create() to create a new thread, identified by the variable t. We choose to use default attributes for the thread and pass the second argument as NULL. The third argument is the target function, selectPainting(), and we pass an index as the parameter to this target function. Since index is an integer variable, we pass its address after casting it as (void *). Accordingly, selectPainting() casts it back to an (int *) to retrieve its value.

We have kept the logic of this program simple so as to illustrate the concepts more clearly. The program has a global array, arrPaintings that contains strings for some well-known paintings. The main() function passes an index and the target function uses that index to locate the element in the array. To demonstrate the waiting period for pthread_join(), we add a sleep() statement in the target function.

Also note that the original process runs the main() function as a thread as well. So, in the above example, we actually have two threads. And since main thread is also a thread, it can also use Pthread APIs, just like every other thread!

However, there are differences between the main thread and a child thread. First, if the main thread returns from main(), then all the threads that are still running would be terminated and their resources would be freed. Second, the arguments passed to main() are typically "int argc" and "char *argv[]", whereas the child thread's argument is only a "void *arg".

When we compile and run the program, the output (provided below) shows that the main thread stays suspended as long as the child thread does not return from the target function. And when it does, the pthread_join() wait is over and the main() function returns as well. Please note that we use the -pthread option with gcc to add support for the Pthreads library.

 [user@codingbison]$ gcc -pthread single_thread.c -o single-thread
 [user@codingbison]$ 
 [user@codingbison]$ ./single-thread 
 Starting the child thread..
 Waiting for the child thread..
     Passed index is 2
     Going to sleep..
     Woke up
     Painting is Potato Eaters
 Child thread is done
 [user@codingbison]$

While the above program is a good start, real-life threaded programs are rarely single-threaded! So, let us crank it up a notch and rewrite the above example to spawn multiple threads.

Our new program is the same as before, but with a few differences. First, it uses an array to hold multiple Pthreads. Second, it uses another array to hold various index values. Third, it generates a random number as an index for each thread -- so that each thread can pick a random array element.

We need to hold index values in an array because, that way, we can have different values and thus different pointers when passing them to the target function. If we were to use index as a single variable and change its value in the loop, then different threads might end up reading the same index since the pointer would be the same. This is possible because we cannot guarantee the order of execution of different threads: the main thread may overwrite index for the next thread before an earlier thread has had a chance to read it, so multiple threads could end up using the same value of index!

The program uses a for loop both for creating new threads and for waiting till they return. In the second loop, the main thread first waits for the thread identified by t[0], and then for the thread identified by t[1], and so on. Thus, even if t[1] finishes first, the main thread would continue to wait for t[0] to finish! Needless to say, we can wait for the threads in the reverse order as well.

 #include <stdio.h>
 #include <stdlib.h>    /* for srand() and rand() */
 #include <pthread.h>
 #include <time.h>
 #include <unistd.h>    /* for sleep() */

 #define MAX_THREADS 2

 char *arrPaintings[] = {"The Last Supper", "Mona Lisa", "Potato Eaters",
                         "Cypresses", "Starry Night", "Water Lilies"};

 void *selectPainting (void *arg) {
     int index = *(int *)arg;

     printf("\t[Array Index: %d] Going to sleep..\n", index);
     sleep(10);
     printf("\t[Array Index: %d] Woke up. Painting: %s\n", 
             index, arrPaintings[index]);
     return NULL;
 }

 int main () {
     pthread_t t[MAX_THREADS];
     int index[MAX_THREADS];
     int status, arrLen, i;

     arrLen = sizeof(arrPaintings)/sizeof(arrPaintings[0]);

     srand((unsigned int)time(NULL));    /* initialize random seed */
     for (i = 0; i < MAX_THREADS; i++) {
         index[i] = rand() % arrLen;     /* Generate a random number less than arrLen */

         printf("[Array Index: %d] Starting the child thread..\n", index[i]);
         status = pthread_create(&t[i], NULL, 
                             selectPainting, (void*)&index[i]);
         if (status != 0) {
             fprintf(stderr, "pthread_create() failed [status: %d]\n", status);
             return 1;
         }
     }

     for (i = 0; i < MAX_THREADS; i++) {
         printf("[Array Index: %d] Waiting for the child thread..\n", index[i]);
         status = pthread_join(t[i], NULL);
         if (status != 0) {
             fprintf(stderr, "pthread_join() failed [status: %d]\n", status);
         }
         printf("[Array Index: %d] Child thread is done\n", index[i]);
     }
     return 0;
 }

We provide the output below. We see that the main thread chooses two random array indices (2 and 5). Here is the sequence in this run: the main thread starts both child threads, the first thread (working on array index 2) begins to run, the main thread starts to wait for it, and then the second thread begins to run. Once both threads have finished and been joined, the main thread returns. Since the scheduler decides the order, the exact interleaving can differ from run to run.

 [Array Index: 2] Starting the child thread..
 [Array Index: 5] Starting the child thread..
     [Array Index: 2] Going to sleep..
 [Array Index: 2] Waiting for the child thread..
     [Array Index: 5] Going to sleep..
     [Array Index: 2] Woke up. Painting: Potato Eaters
     [Array Index: 5] Woke up. Painting: Water Lilies
 [Array Index: 2] Child thread is done
 [Array Index: 5] Waiting for the child thread..
 [Array Index: 5] Child thread is done

An important point worth mentioning is that in the above program, even though we have multiple threads sleeping for 10 seconds each (we call the sleep() function in the target function), we would see the main thread waiting (approximately) for 10 seconds and not a multiple of 10 seconds since the threads are running concurrently. You are free to take a stopwatch and verify it!

Miscellaneous APIs

Let us now look at the next set of Pthread APIs that provide miscellaneous functions. As usual, we start with the signatures of these functions:

 void pthread_exit(void *value_ptr);
 int pthread_detach(pthread_t t);
 pthread_t pthread_self(void);
 int pthread_equal(pthread_t t1, pthread_t t2);
 int sched_yield(void);

The first API in the above set, pthread_exit(), terminates the current thread without terminating the process. This is helpful if the current thread has child threads that still need to do some work. Of course, the process would automatically terminate when the last running thread terminates. Note that pthread_exit() is different from an exit() call since the exit() call is more extreme and terminates the process along with all of its threads. An important caveat is that if the main thread passes a pointer to one of its local variables (as the (void *) argument) to the target function and then exits using pthread_exit(), the pointer may end up referring to dangling data. So, for such cases, using pthread_join() to suspend the main thread is a safer bet.

It is worth mentioning that when a joinable child thread terminates (let us say, by returning from the target function), its resources are not released until some thread performs pthread_join() on it. More specifically, the pthread_join() call transitions the child thread from the terminated state to the detached state. It is in the detached state that we recover all the resources held by that thread. Therefore, calling pthread_join() is also good practice to avoid leaking resources from the spawned threads. Of course, if the main thread returns from main(), then the process terminates and all of the resources are automatically recovered, and that is a whole different story.

A child thread can also pass a value to pthread_exit() that gets copied to the pthread_join() parameter. This is one way for a child thread to pass a return value back to the parent thread.

The next API, pthread_detach() allows us to decouple a child thread from the thread that created it. There could be use cases, where once a thread is created, it can do independent things and the creator thread does not need to communicate with it or manage (read pthread_join()) it. Once a detached thread completes its task, its resources are automatically freed.

Since pthread_join() moves a thread to a detached state, we cannot call pthread_join() on a thread that is already detached; if we do, pthread_join() is likely to greet us with an error! Depending on the implementation, the error may be EINVAL (the thread is not joinable) or ESRCH (pthread_join() could not find the thread), and POSIX does not guarantee either. Thus, we should avoid calling pthread_join() twice on the same thread!

Yet another way to detach a thread is to set its detach state attribute to PTHREAD_CREATE_DETACHED; we can do so by passing an attribute object to pthread_create(). Here is a small snippet that updates the pthread_create() call from the earlier examples to include a Pthread attribute indicating that the thread is detached.

 pthread_t t;
 int status;
 pthread_attr_t attr;

 status = pthread_attr_init(&attr);
 if (status != 0) {
     fprintf(stderr, "pthread_attr_init() failed [status: %d]\n", status);
     return 0;
 }

 status = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
 if (status != 0) {
     fprintf(stderr, "pthread_attr_setdetachstate() failed [status: %d]\n", status);
     return 0;
 }

 status = pthread_create(&t, &attr, selectPainting, (void*)&index);
 ...
 ...
 /* In the end, call pthread_attr_destroy() to destroy the attr object */
 status = pthread_attr_destroy(&attr); 
 if (status != 0) {
     fprintf(stderr, "pthread_attr_destroy() failed [status: %d]\n", status);
     return 0;
 }

The next three functions in the above list are pthread_self(), pthread_equal(), and sched_yield().

The pthread_self() function returns the thread identifier (a pthread_t value) of the current thread. In other words, if the current thread needs a handle to itself, then it calls pthread_self().

The next function, pthread_equal(), compares two Pthread identifiers (pthread_t values); if they belong to the same thread, it returns a non-zero value, otherwise it returns 0. Since pthread_t is opaque, we should use pthread_equal() rather than the == operator to compare identifiers.

The last function, sched_yield() allows us to yield the processor to another thread/process. This provides other threads/processes a chance to run when (heavily) contended resources (e.g., mutexes) have been released by the caller. For example, if the current thread takes too long to finish its task, then other threads/processes can potentially starve. Needless to say, the threshold of deciding when it is too long is application specific. Note that if the current thread is the only thread running, then sched_yield() would continue to run the current thread since there is nobody else to yield to!

If we look at all the Pthread APIs described on this page, then we might notice that all of the APIs accept pthread_t variables by value; the only exception is pthread_create(), where we pass a pointer to a pthread_t variable. The reason is that pthread_create() needs the thread identifier of the new thread and one way to do that is by making the caller thread pass a pointer to the pthread_t variable.

Now that we have understood the above APIs, let us look at two examples that help us understand the behavior of pthread_exit().

The first example demonstrates that if we call pthread_exit() from the main thread, then the child thread continues to run even though the main thread does not call pthread_join(). This is because pthread_exit() terminates the thread (in this case, the main thread), but does not terminate the process. If we were to remove the pthread_exit() call in the following example, then main() would return, the process would exit, and the child thread would terminate immediately.

 #include <stdio.h>
 #include <pthread.h>
 #include <time.h>
 #include <unistd.h>    /* for sleep() */

 char *arrPaintings[] = {"The Last Supper", "Mona Lisa", "Potato Eaters",
                         "Cypresses", "Starry Night", "Water Lilies"};

 void *selectPainting (void *arg) {
     int index = *(int *)arg;

     printf("\tPassed index is %d\n", index);
     printf("\tGoing to sleep..\n");
     sleep(10);
     printf("\tWoke up\n");
     printf("\tPainting is %s\n", arrPaintings[index]);
     return NULL;
 }

 int main () {
     pthread_t t;
     int status;
     /* static, so the pointer stays valid after the main thread exits */
     static int index = 2;

     printf("Starting the child thread..\n");
     status = pthread_create(&t, NULL, selectPainting, (void*)&index);
     if (status != 0) {
         fprintf(stderr, "pthread_create() failed [status: %d]\n", status);
         return 1;
     }

     printf("Calling pthread_exit() from the main thread\n");
     pthread_exit(NULL);
     printf("This should not be printed\n");
     return 0;
 }

The output (provided below) confirms that even though the main thread calls pthread_exit() and terminates, the child thread continues to run! In fact, it would not matter whether we make the child thread detached or not since, in either case, the child thread would run to completion. Note that the printf statement after pthread_exit() does not run since the main thread exits from the pthread_exit() call itself.

 Starting the child thread..
 Calling pthread_exit() from the main thread
     Passed index is 2
     Going to sleep..
     Woke up
     Painting is Potato Eaters

In the second example, we show communication between the child thread and the main thread using the pthread_exit() and pthread_join() combination. This time, we call pthread_exit() from the child thread and pass a pointer to a global variable. To "catch" the return value, we use a pthread_join() call in the main thread. Once the child thread returns, the pthread_join() call from the main thread unblocks and accesses the global data.

 #include <stdio.h>
 #include <pthread.h>
 #include <time.h>
 #include <string.h>
 #include <unistd.h>    /* for sleep() */

 char *arrPaintings[] = {"The Last Supper", "Mona Lisa", "Potato Eaters",
                         "Cypresses", "Starry Night", "Water Lilies"};

 char globalStr[100];

 void *selectPainting (void *arg) {
     int index = *(int *)arg;

     printf("\tPassed index is %d\n", index);
     printf("\tGoing to sleep..\n");
     sleep(10);
     printf("\tWoke up\n");
     printf("\tPainting is %s\n", arrPaintings[index]);

     /* Copy the title along with its terminating null character */
     snprintf(globalStr, sizeof(globalStr), "%s", arrPaintings[index]);
     pthread_exit(globalStr);
 }

 int main () {
     pthread_t t;
     int status;
     int index = 2;
     void *ret = NULL;    /* receives the value passed to pthread_exit() */
     char *str;

     printf("Starting the child thread..\n");
     status = pthread_create(&t, NULL, selectPainting, (void*)&index);
     if (status != 0) {
         fprintf(stderr, "pthread_create() failed [status: %d]\n", status);
         return 1;
     }

     printf("Waiting for the child thread..\n");
     status = pthread_join(t, &ret);
     if (status != 0) {
         fprintf(stderr, "pthread_join() failed [status: %d]\n", status);
         return 1;
     }
     str = ret;
     printf("Child thread is done\n");
     printf("Returned value from the Child thread: %s \n", str);
     return 0;
 }

While the above example shows the case of the thread returning a value to the main thread using pthread_exit() and pthread_join(), we should emphasize that this approach is a trivial and lightweight form of communication among threads. For many applications, we are more likely to use global data synchronized by a mutex.

Here is the output:

 Starting the child thread..
 Waiting for the child thread..
     Passed index is 2
     Going to sleep..
     Woke up
     Painting is Potato Eaters
 Child thread is done
 Returned value from the Child thread: Potato Eaters

States of a Thread

Before we move on, let us talk briefly about various states in the life of a thread. A thread spends its entire lifetime in (mainly) four states: ready, running, blocked, and terminated.

When a thread is created using pthread_create(), it starts its life-cycle in the ready state and would start running as and when the processor becomes available. Once the processor gives the green light, the thread runs the target function provided by the pthread_create() call.

When running, a thread can move to a blocked state if it needs to wait for a resource (e.g. trying to acquire a mutex or waiting for an I/O that is not available immediately). Once the resource becomes available, the thread once again moves to the ready state -- this means it is all set to run as soon as the processor becomes available.

When a thread returns from its target job or has been canceled, it sits in the terminated state. A thread can finish the target function either by returning from it or by calling pthread_exit().

In the terminated state, a thread still holds the resources that were allocated to it. Once another thread calls pthread_join() on it, its resources are recovered. Alternatively, if the thread was already detached before finishing its job or before being canceled, then its resources are recovered immediately after it reaches the terminated state. Thus, we should make sure to call pthread_join() on a thread that has not been detached.