You are on page 1of 11

Pthreads on Linux Mark Veltzer mark.veltzer@gmail.

com

Thread identity

Linux id of threads

Linux uses the pid space to give ids both to threads and processes. The operating system itself does know which ids belong to threads and which to processes. The first threads thread id also doubles as a process id. This does create some confusion. getpid() is used to get a process id. gettid() is used to get a thread id. These may return the same value depending on whether you are the primary thread or not.

pthread_t

To complicate matters pthread has it's own notion of thread ids. This is represented by an opaque pthread_t structure. This structure could be gotten when creating a thread using pthread_create() or from within the thread using pthread_self()

pthread_t

The idea is that pthread should be a cross operating system threading API and as such such should not expose the programmer to a particular operating systems concept of thread id. There is no simple tid hiding in that structure and the pthread_t is not cast able to it. In Linux pthread_t is just an unsigned long int (see /usr/include/bits/pthreadtypes.h ).

gettid()

Returns the current thread id Glibc does not have a wrapper for this one. Call it using syscall(2). Make your wrapper efficient (cache the result since tid never changes...).

CPU sets

In order to adjust affinity for threads you will need to be able to understand and build CPU sets. A CPU set is a data structure that one can pass to/from the kernel which represents a collection of CPUs. The size of CPU sets in recent Linux kernels is 1024. This means that unless you're running on machines that have more than 1024 CPUs you're good to go.

Stopping threads

Stopping threads - theory

Stopping threads is one of the most difficult parts of correct multi-threading programming. This sounds funny since it sounds like one should be able to stop a thread simply by asking the operating system to stop the thread. In that case one would expect the operating system to signal the CPU that is currently running the thread (if indeed that thread is currently running) and then kill it's own representation of the thread. This is wrong!

Stopping threads - problems

The idea of stopping a thread by having the operating system kill it is wrong because of many reasons:

On UNIX systems some of the threading issues (like allocating a stack for the thread) are handled in user space so killing the thread may leak resources. Locks acquired by the thread would possibly not be released resulting in deadlocking other threads. The thread does not get a chance to release other resources (memory, files etc) that it allocated. The thread may leave data structures in unusable state because it may be killed while manipulating them. Threads that are waiting on locks held by the thread that was killed really should be notified that this thread was killed because this may affect the way they tread data integrity.

Stopping threads - problem

As a result of the above the operating system does not support killing threads. The kill(1) command line and the underlying kill(2) system call can only be used to target processes and not threads. If you try to use kill(1) or kill(2) with a thread pid you will wind up killing the entire process the thread is a member of.

You might also like