What Are Threads?

A program is a set of instructions that the CPU executes one at a time. A programming language defines where a program starts (often a function called main) so that the first instruction can be found. The program then keeps running until it reaches an instruction that terminates it, such as an exit.

When you run a program, the OS schedules it to run on a particular CPU. By default, the programmer does not choose which one. A program typically gets only a very short time on a CPU before it is paused. This is how multiple programs can run at once on a single-CPU machine: each program runs for a short time slice, yields to another program, and is rescheduled later.

The OS saves all of the context needed to resume a program from where it left off, such as which instruction is to be executed next, where its memory resides, and the values of the registers. Because these time slices are so small, and OS scheduling has come a long way, your program appears to run smoothly and without interruption.

We can think of the program context as containing an execution context: the information associated with instruction execution, such as the next instruction to run and a pointer to the top of the stack. With it, the CPU has what it needs to continue executing code. We may think of this contextual information, together with the sequence of instructions that will be run, as a thread of execution: a string of instructions with a past, a present, and a future.

Modern computers often come with more than one CPU (also called a core). This means the OS can schedule one program on one CPU and another program on a different CPU, and the two will genuinely execute at the same time. Often, though, a single program wants to split its own work across several cores to take advantage of this parallelism.

This is where threads come in. Think of the execution context as holding the information needed to run some code, and the larger program context as encapsulating that execution context along with additional information, such as where its heap and global memory reside. With this model in mind, we can think of the program context as representing the Program and the execution context as representing a Thread within the program.

In a normal program whose execution begins at the main function, the default "main" thread is the one responsible for executing that function. Execution continues until the main thread terminates, at which point the program terminates as well.

In a multi-threaded program, we can tell the main thread to create other threads. These threads need somewhere to start, so they are typically given a function to begin execution from.

The OS is responsible for scheduling program execution, as mentioned above. And program execution is really encapsulated by a thread. This means if your program has two threads and there are two CPUs available, the OS will likely schedule one thread on one core and the other thread on the other. They belong to the same program, but they get to run on two separate CPUs simultaneously.

Let's take an example of a single-threaded program that will perform three "jobs". To simulate actual work, we use functions that just sleep for a certain amount of time:

#include <unistd.h>


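/* Each job simulates work by sleeping for a fixed number of seconds. */
static void job1(void) {
	sleep(3);
}

static void job2(void) {
	sleep(4);
}

static void job3(void) {
	sleep(5);
}

int main(void) {
	/* Run the jobs one after another on the main thread. */
	job1();
	job2();
	job3();
	return 0;
}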

Assuming the above code is in a file named work.c, we can compile it with: gcc -Wall -o work work.c. Running it, we see that it takes about 3+4+5=12 seconds:

$ time ./work
real 0m12.004s
user 0m0.003s
sys 0m0.000s

Now let's create three threads and assign each thread to one of the three jobs. To do this, we use the pthread_create function in the pthread library to create and start a "p" (POSIX) thread:

#include <unistd.h>
#include <pthread.h>


static void *job1(void *arg) {
	sleep(3);
	return NULL;
}

static void *job2(void *arg) {
	sleep(4);
	return NULL;
}

static void *job3(void *arg) {
	sleep(5);
	return NULL;
}

int main(void) {
	pthread_t thread1, thread2, thread3;

	pthread_create(&thread1, NULL, &job1, NULL);
	pthread_create(&thread2, NULL, &job2, NULL);
	pthread_create(&thread3, NULL, &job3, NULL);

	pthread_join(thread1, NULL);
	pthread_join(thread2, NULL);
	pthread_join(thread3, NULL);

	return 0;
}

I've skipped error handling so we can focus on the logic. Let's analyze what's going on before running the program.

On lines 23-25 we launch the three threads with pthread_create. The first argument is a pointer to a pthread_t that receives the new thread's ID. The second argument specifies advanced attributes for the thread; we pass NULL to accept the defaults. The third argument is the thread's start function, which is the function where the thread will start its execution. The final argument is a pointer that will be passed into the start function.

The start function needs to be tweaked slightly from what we had before. It must return void *, so that a thread can return an arbitrary pointer when it finishes execution. It must also take a single void * argument, which is the pointer passed into the last argument of pthread_create.

Once line 25 has run, all three threads have been started (each one may begin running as soon as it is created). We need to prevent our main thread from returning, because returning from main ends the whole program even if other threads are still running. We want to wait for the three threads to finish their jobs before returning. The pthread_join function does exactly that: it blocks the calling thread, here the main thread, until the given thread exits.

Assuming this code is in a file named work.c, we can compile it with: gcc -Wall -o work work.c -lpthread. Running it, we see the program now takes about 5 seconds:

$ time ./work
real 0m5.004s
user 0m0.000s
sys 0m0.004s

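This makes sense. Our three threads all run at the same time: the job1 thread finishes after 3 seconds, the job2 thread after 4, and the job3 thread after 5, so all the work is done after 5 seconds. Because these jobs only sleep, they overlap this way even on a single CPU; jobs that kept the CPU busy would need separate cores to see a similar speedup.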
