CP: Concurrency and parallelism
We often want our computers to do many tasks at the same time (or at least appear to do them at the same time). The reasons for doing so vary (e.g., waiting for many events using only a single processor, processing many data streams simultaneously, or utilizing many hardware facilities) and so do the basic facilities for expressing concurrency and parallelism. Here, we articulate principles and rules for using the ISO standard C++ facilities for expressing basic concurrency and parallelism.
Threads are the machine-level foundation for concurrent and parallel programming.
Threads allow running multiple sections of a program independently, while sharing
the same memory. Concurrent programming is tricky,
because protecting shared data between threads is easier said than done.
Making existing single-threaded code execute concurrently can be
as trivial as adding std::async
or std::thread
strategically, or it can
necessitate a full rewrite, depending on whether the original code was written
in a thread-friendly way.
The concurrency/parallelism rules in this document are designed with three goals in mind:
- To help in writing code that is amenable to being used in a threaded environment
- To show clean, safe ways to use the threading primitives offered by the standard library
- To offer guidance on what to do when concurrency and parallelism aren't giving the performance gains needed
It is also important to note that concurrency in C++ is an unfinished story. C++11 introduced many core concurrency primitives, C++14 and C++17 improved on them, and there is much interest in making the writing of concurrent programs in C++ even easier. We expect some of the library-related guidance here to change significantly over time.
This section needs a lot of work (obviously). Please note that we start with rules for relative non-experts. Real experts must wait a bit; contributions are welcome, but please think about the majority of programmers who are struggling to get their concurrent programs correct and performant.
Concurrency and parallelism rule summary:
- (CP.1: Assume that your code will run as part of a multi-threaded program)
- (CP.2: Avoid data races)
- (CP.3: Minimize explicit sharing of writable data)
- (CP.4: Think in terms of tasks, rather than threads)
- (CP.8: Don't try to use volatile for synchronization)
- (CP.9: Whenever feasible use tools to validate your concurrent code)
See also:
- (CP.con: Concurrency)
- (CP.coro: Coroutines)
- (CP.par: Parallelism)
- (CP.mess: Message passing)
- (CP.vec: Vectorization)
- (CP.free: Lock-free programming)
- (CP.etc: Etc. concurrency rules)
CP.1: Assume that your code will run as part of a multi-threaded program
Reason
It's hard to be certain that concurrency isn't used now or won't be used sometime in the future. Code gets reused. Libraries not using threads might be used from some other part of a program that does use threads. Note that this rule applies most urgently to library code and least urgently to stand-alone applications. However, over time, code fragments can turn up in unexpected places.
Example, bad
```
double cached_computation(int x)
{
    // bad: these statics cause data races in multi-threaded usage
    static int cached_x = 0;
    static double cached_result = COMPUTATION_OF_ZERO;

    if (cached_x != x) {
        cached_x = x;
        cached_result = computation(x);
    }
    return cached_result;
}
```
Although cached_computation
works perfectly in a single-threaded environment, in a multi-threaded environment the two static
variables result in data races and thus undefined behavior.
Example, good
```
struct ComputationCache {
    int cached_x = 0;
    double cached_result = COMPUTATION_OF_ZERO;

    double compute(int x)
    {
        if (cached_x != x) {
            cached_x = x;
            cached_result = computation(x);
        }
        return cached_result;
    }
};
```
Here the cache is stored as member data of a ComputationCache
object, rather than as shared static state.
This refactoring essentially delegates the concern upward to the caller: a single-threaded program might still choose to have one global ComputationCache, while a multi-threaded program might have one ComputationCache instance per thread, or one per "context" for any definition of "context."
The refactored function no longer attempts to manage the allocation of cached_x. In that sense, this is an application of the Single Responsibility Principle.
In this specific example, refactoring for thread-safety also improved reusability in single-threaded
programs. It's not hard to imagine that a single-threaded program might want two ComputationCache
instances
for use in different parts of the program, without having them overwrite each other's cached data.
There are several other ways one might add thread-safety to code written for a standard multi-threaded environment (that is, one where the only form of concurrency is std::thread):
- Mark the state variables as thread_local instead of static.
- Implement concurrency control, for example, protecting access to the two static variables with a static std::mutex (see the sketch after this list).
- Refuse to build and/or run in a multi-threaded environment.
- Provide two implementations: one for single-threaded environments and another for multi-threaded environments.
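For example, a minimal sketch of the concurrency-control option, guarding the two statics with a mutex (std::scoped_lock is C++17; computation and COMPUTATION_OF_ZERO are from the example above):

```
#include <mutex>

double cached_computation(int x)
{
    static std::mutex cache_mutex;
    static int cached_x = 0;
    static double cached_result = COMPUTATION_OF_ZERO;

    std::scoped_lock lock(cache_mutex);    // serialize all access to the statics
    if (cached_x != x) {
        cached_x = x;
        cached_result = computation(x);
    }
    return cached_result;
}
```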
Exception
Code that is never run in a multi-threaded environment.
Be careful: there are many examples where code that was "known" to never run in a multi-threaded program was run as part of a multi-threaded program, often years later. Typically, such programs lead to a painful effort to remove data races. Therefore, code that is never intended to run in a multi-threaded environment should be clearly labeled as such and ideally come with compile or run-time enforcement mechanisms to catch those usage bugs early.
CP.2: Avoid data races
Reason
Unless you do, nothing is guaranteed to work and subtle errors will persist.
Note
In a nutshell, if two threads can access the same object concurrently (without synchronization), and at least one is a writer (performing a non-const
operation), you have a data race.
For further information on how to use synchronization well to eliminate data races, please consult a good book about concurrency (See (Carefully study the literature)).
Example, bad
There are many examples of data races that exist, some of which are running in production software at this very moment. One very simple example:
```
int get_id()
{
    static int id = 1;
    return id++;
}
```
The increment here is an example of a data race. This can go wrong in many ways, including:
- Thread A loads the value of id, the OS context switches A out for some period, during which other threads create hundreds of IDs. Thread A is then allowed to run again, and id is written back to that location as A's read of id plus one.
- Thread A and B load id and increment it simultaneously. They both get the same ID.
Local static variables are a common source of data races.
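A minimal race-free variant of get_id, assuming an atomic counter is acceptable here: the increment becomes a single atomic read-modify-write, so no two threads can be handed the same value:

```
#include <atomic>

int get_id()
{
    static std::atomic<int> id {1};    // C++11 guarantees thread-safe initialization
    return id++;                       // atomic fetch-and-increment: unique IDs
}
```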
Example, bad:
```
void f(fstream& fs, regex pattern)
{
    array<double, max> buf;
    int sz = read_vec(fs, buf, max);    // read from fs into buf
    gsl::span<double> s {buf};
    // ...
    auto h1 = async([&] { sort(std::execution::par, s); });        // spawn a task to sort
    // ...
    auto h2 = async([&] { return find_all(buf, sz, pattern); });   // spawn a task to find matches
    // ...
}
```
Here, we have a (nasty) data race on the elements of buf
(sort
will both read and write).
All data races are nasty.
Here, we managed to get a data race on data on the stack.
Not all data races are as easy to spot as this one.
Example, bad:
```
// code not controlled by a lock
unsigned val;

if (val < 5) {
    // ... other thread can change val here ...
    switch (val) {
    case 0: // ...
    case 1: // ...
    case 2: // ...
    case 3: // ...
    case 4: // ...
    }
}
```
Now, a compiler that does not know that val
can change will most likely implement that switch
using a jump table with five entries.
Then, a val
outside the [0..4]
range will cause a jump to an address that could be anywhere in the program, and execution would proceed there.
Really, "all bets are off" if you get a data race.
Actually, it can be worse still: by looking at the generated code you might be able to determine where the stray jump will go for a given value;
this can be a security risk.
Enforcement
Some is possible, do at least something. There are commercial and open-source tools that try to address this problem, but be aware that solutions have costs and blind spots. Static tools often have many false positives and run-time tools often have a significant cost. We hope for better tools. Using multiple tools can catch more problems than a single one.
There are other ways you can mitigate the chance of data races:
- Avoid global data
- Avoid static variables
- More use of concrete types on the stack (and don't pass pointers around too much)
- More use of immutable data (literals, constexpr, and const)
CP.3: Minimize explicit sharing of writable data
Reason
If you don't share writable data, you can't have a data race. The less sharing you do, the less chance you have to forget to synchronize access (and get data races). The less sharing you do, the less chance you have to wait on a lock (so performance can improve).
Example
```
bool validate(const vector<Reading>&);
Graph<Temp_node> temperature_gradients(const vector<Reading>&);
Image altitude_map(const vector<Reading>&);
// ...

void process_readings(const vector<Reading>& surface_readings)
{
    auto h1 = async([&] { if (!validate(surface_readings)) throw Invalid_data{}; });
    auto h2 = async([&] { return temperature_gradients(surface_readings); });
    auto h3 = async([&] { return altitude_map(surface_readings); });
    // ...
    h1.get();
    auto v2 = h2.get();
    auto v3 = h3.get();
    // ...
}
```
Without those consts, we would have to review every asynchronously invoked function for potential data races on surface_readings.
Making surface_readings const (with respect to this function) allows reasoning using only the function body.
Note
Immutable data can be safely and efficiently shared. No locking is needed: You can't have a data race on a constant. See also (CP.mess: Message passing) and (CP.31: Pass small amounts of data between threads by value).
Enforcement
???
CP.4: Think in terms of tasks, rather than threads
Reason
A thread
is an implementation concept, a way of thinking about the machine.
A task is an application notion, something you'd like to do, preferably concurrently with other tasks.
Application concepts are easier to reason about.
Example
```
void some_fun(const std::string& msg)
{
    std::thread publisher([=] { std::cout << msg; });      // bad: less expressive
                                                           //      and more error-prone
    auto pubtask = std::async([=] { std::cout << msg; });  // OK
    // ...
    publisher.join();
}
```
Note
With the exception of async(), the standard-library facilities are low-level, machine-oriented, threads-and-lock level.
This is a necessary foundation, but we have to try to raise the level of abstraction: for productivity, for reliability, and for performance.
This is a potent argument for using higher level, more applications-oriented libraries (if possible, built on top of standard-library facilities).
Enforcement
???
CP.8: Don't try to use volatile for synchronization
Reason
In C++, unlike some other languages, volatile
does not provide atomicity, does not synchronize between threads,
and does not prevent instruction reordering (neither compiler nor hardware).
It simply has nothing to do with concurrency.
Example, bad:
```
int free_slots = max_slots; // current source of memory for objects

Pool* use()
{
    if (int n = free_slots--)
        return &pool[n];
}
```
Here we have a problem: this is perfectly good code in a single-threaded program, but if two threads execute it, there is a race condition on free_slots: the read-decrement-write sequence is not atomic, so two threads might read the same value, hand out the same slot, and leave free_slots decremented only once.
That's (obviously) a bad data race, so people trained in other languages might try to fix it like this:
```
volatile int free_slots = max_slots; // current source of memory for objects

Pool* use()
{
    if (int n = free_slots--)
        return &pool[n];
}
```
This has no effect on synchronization: The data race is still there!
The C++ mechanism for this is atomic
types:
```
atomic<int> free_slots = max_slots; // current source of memory for objects

Pool* use()
{
    if (int n = free_slots--)
        return &pool[n];
}
```
Now the --
operation is atomic,
rather than a read-increment-write sequence where another thread might get in-between the individual operations.
Alternative
Use atomic
types where you might have used volatile
in some other language.
Use a mutex
for more complicated examples.
See also
((rare) proper uses of volatile)
CP.9: Whenever feasible use tools to validate your concurrent code
Experience shows that concurrent code is exceptionally hard to get right and that compile-time checking, run-time checks, and testing are less effective at finding concurrency errors than they are at finding errors in sequential code. Subtle concurrency errors can have dramatically bad effects, including memory corruption, deadlocks, and security vulnerabilities.
Example
???
Note
Thread safety is challenging, often getting the better of experienced programmers: tooling is an important strategy to mitigate those risks. There are many tools "out there", both commercial and open-source tools, both research and production tools. Unfortunately people's needs and constraints differ so dramatically that we cannot make specific recommendations, but we can mention:
- Static enforcement tools: both clang and some older versions of GCC have some support for static annotation of thread safety properties. Consistent use of this technique turns many classes of thread-safety errors into compile-time errors. The annotations are generally local (marking a particular member variable as guarded by a particular mutex), and are usually easy to learn. However, as with many static tools, it can often present false negatives: cases that should have been caught but were allowed.
- Dynamic enforcement tools: Clang's Thread Sanitizer (aka TSAN) is a powerful example of dynamic tools: it changes the build and execution of your program to add bookkeeping on memory access, absolutely identifying data races in a given execution of your binary. The cost for this is both memory (5-10x in most cases) and CPU slowdown (2-20x). Dynamic tools like this are best when applied to integration tests, canary pushes, or unit tests that operate on multiple threads. Workload matters: when TSAN identifies a problem, it is effectively always an actual data race, but it can only identify races seen in a given execution. A sketch of typical use follows this list.
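For instance, a minimal sketch of dynamic-tool use; the program contains a deliberate two-writer race, and the build flags are the usual clang/GCC spelling (consult your toolchain's documentation):

```
// race.cpp -- deliberate data race for a dynamic tool to find
// build and run, e.g.:  clang++ -g -fsanitize=thread race.cpp && ./a.out
#include <thread>

int counter = 0;    // shared and unsynchronized

int main()
{
    std::thread t([] { ++counter; });    // writer 1
    ++counter;                           // writer 2: races with the thread above
    t.join();
}
```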
Enforcement
It is up to an application builder to choose which support tools are valuable for a particular application.
CP.con: Concurrency
This section focuses on relatively ad-hoc uses of multiple threads communicating through shared data.
- For parallel algorithms, see (parallelism)
- For inter-task communication without explicit sharing, see (messaging)
- For vector parallel code, see (vectorization)
- For lock-free programming, see (lock free)
Concurrency rule summary:
- (CP.20: Use RAII, never plain lock()/unlock())
- (CP.21: Use std::lock() or std::scoped_lock to acquire multiple mutexes)
- (CP.22: Never call unknown code while holding a lock (e.g., a callback))
- (CP.23: Think of a joining thread as a scoped container)
- (CP.24: Think of a thread as a global container)
- (CP.25: Prefer gsl::joining_thread over std::thread)
- (CP.26: Don't detach() a thread)
- (CP.31: Pass small amounts of data between threads by value, rather than by reference or pointer)
- (CP.32: To share ownership between unrelated threads use shared_ptr)
- (CP.40: Minimize context switching)
- (CP.41: Minimize thread creation and destruction)
- (CP.42: Don't wait without a condition)
- (CP.43: Minimize time spent in a critical section)
- (CP.44: Remember to name your lock_guards and unique_locks)
- (CP.50: Define a mutex together with the data it guards. Use synchronized_value<T> where possible)
- ??? when to use a spinlock
- ??? when to use try_lock()
- ??? when to prefer lock_guard over unique_lock
- ??? Time multiplexing
- ??? when/how to use new thread
CP.20: Use RAII, never plain lock()/unlock()
Reason
Avoids nasty errors from unreleased locks.
Example, bad
```
mutex mtx;

void do_stuff()
{
    mtx.lock();
    // ... do stuff ...
    mtx.unlock();
}
```
Sooner or later, someone will forget the mtx.unlock(), place a return in the ... do stuff ..., throw an exception, or something.
```
mutex mtx;

void do_stuff()
{
    unique_lock<mutex> lck {mtx};
    // ... do stuff ...
}
```
Enforcement
Flag calls of member lock() and unlock(). ???
CP.21: Use std::lock() or std::scoped_lock to acquire multiple mutexes
Reason
To avoid deadlocks on multiple mutexes.
Example
This is asking for deadlock:
```
// thread 1
lock_guard<mutex> lck1(m1);
lock_guard<mutex> lck2(m2);

// thread 2
lock_guard<mutex> lck2(m2);
lock_guard<mutex> lck1(m1);
```
Instead, use lock()
:
```
// thread 1
lock(m1, m2);
lock_guard<mutex> lck1(m1, adopt_lock);
lock_guard<mutex> lck2(m2, adopt_lock);

// thread 2
lock(m2, m1);
lock_guard<mutex> lck2(m2, adopt_lock);
lock_guard<mutex> lck1(m1, adopt_lock);
```
or (better, but C++17 only):
```
// thread 1
scoped_lock<mutex, mutex> lck1(m1, m2);

// thread 2
scoped_lock<mutex, mutex> lck2(m2, m1);
```
Here, the writers of thread1 and thread2 are still not agreeing on the order of the mutexes, but order no longer matters.
Note
In real code, mutexes are rarely named to conveniently remind the programmer of an intended relation and intended order of acquisition. In real code, mutexes are not always conveniently acquired on consecutive lines.
Note
In C++17 it's possible to write plain

```
lock_guard lck1(m1, adopt_lock);
```

and have the mutex type deduced.
Enforcement
Detect the acquisition of multiple mutexes. This is undecidable in general, but catching common simple examples (like the one above) is easy.
CP.22: Never call unknown code while holding a lock (e.g., a callback)
Reason
If you don't know what a piece of code does, you are risking deadlock.
Example
```
void do_this(Foo* p)
{
    lock_guard<mutex> lck {my_mutex};
    // ... do something ...
    p->act(my_data);
    // ...
}
```
If you don't know what Foo::act does (maybe it is a virtual function invoking a derived class member of a class not yet written), it might call do_this (recursively) and cause a deadlock on my_mutex. Maybe it will lock on a different mutex and not return in a reasonable time, causing delays to any code calling do_this.
Example
A common example of the "calling unknown code" problem is a call to a function that tries to gain locked access to the same object.
Such a problem can often be solved by using a recursive_mutex. For example:
```
recursive_mutex my_mutex;

template<typename Action>
void do_something(Action f)    // assumed to be a member function, so that 'this' is available
{
    unique_lock<recursive_mutex> lck {my_mutex};
    // ... do something ...
    f(this);    // f will do something to *this
    // ...
}
```
If, as is likely, f() invokes operations on *this, we must make sure that the object's invariant holds before the call.
Enforcement
- Flag calling a virtual function with a non-recursive mutex held
- Flag calling a callback with a non-recursive mutex held
CP.23: Think of a joining thread as a scoped container
Reason
To maintain pointer safety and avoid leaks, we need to consider what pointers are used by a thread. If a thread joins, we can safely pass pointers to objects in the scope of the thread and its enclosing scopes.
Example
```
void f(int* p)
{
    // ...
    *p = 99;
    // ...
}

int glob = 33;

void some_fct(int* p)
{
    int x = 77;
    joining_thread t0(f, &x);        // OK
    joining_thread t1(f, p);         // OK
    joining_thread t2(f, &glob);     // OK
    auto q = make_unique<int>(99);
    joining_thread t3(f, q.get());   // OK
    // ...
}
```
A gsl::joining_thread is a std::thread with a destructor that joins and that cannot be detached().
By "OK" we mean that the object will be in scope ("live") for as long as a thread
can use the pointer to it.
The fact that threads run concurrently doesn't affect the lifetime or ownership issues here; these threads can be seen as just a function object called from some_fct.
Enforcement
Ensure that joining_threads don't detach().
After that, the usual lifetime and ownership (for local objects) enforcement applies.
CP.24: Think of a thread as a global container
Reason
To maintain pointer safety and avoid leaks, we need to consider what pointers are used by a thread. If a thread is detached, we can safely pass pointers to static and free store objects (only).
Example
```
void f(int* p)
{
    // ...
    *p = 99;
    // ...
}

int glob = 33;

void some_fct(int* p)
{
    int x = 77;
    std::thread t0(f, &x);           // bad
    std::thread t1(f, p);            // bad
    std::thread t2(f, &glob);        // OK
    auto q = make_unique<int>(99);
    std::thread t3(f, q.get());      // bad
    // ...
    t0.detach();
    t1.detach();
    t2.detach();
    t3.detach();
    // ...
}
```
By "OK" we mean that the object will be in scope ("live") for as long as a thread
can use the pointers to it.
By "bad" we mean that a thread
might use a pointer after the pointed-to object is destroyed.
The fact that threads run concurrently doesn't affect the lifetime or ownership issues here; these threads can be seen as just a function object called from some_fct.
Note
Even objects with static storage duration can be problematic if used from detached threads: if the thread continues until the end of the program, it might be running concurrently with the destruction of objects with static storage duration, and thus accesses to such objects might race.
Note
This rule is redundant if you don't detach() and use gsl::joining_thread.
However, converting code to follow those guidelines could be difficult and even impossible for third-party libraries.
In such cases, the rule becomes essential for lifetime safety and type safety.
In general, it is undecidable whether a detach() is executed for a thread, but simple common cases are easily detected.
If we cannot prove that a thread does not detach(), we must assume that it does and that it outlives the scope in which it was constructed; after that, the usual lifetime and ownership (for global objects) enforcement applies.
Enforcement
Flag attempts to pass local variables to a thread that might detach().
CP.25: Prefer gsl::joining_thread over std::thread
Reason
A joining_thread
is a thread that joins at the end of its scope.
Detached threads are hard to monitor.
It is harder to ensure the absence of errors in detached threads (and in threads that might be detached).
Example, bad
```
void f() { std::cout << "Hello "; }

struct F {
    void operator()() const { std::cout << "parallel world "; }
};

int main()
{
    std::thread t1{f};      // f() executes in separate thread
    std::thread t2{F()};    // F()() executes in separate thread
}  // spot the bugs
```
Example
```
void f() { std::cout << "Hello "; }

struct F {
    void operator()() const { std::cout << "parallel world "; }
};

int main()
{
    std::thread t1{f};      // f() executes in separate thread
    std::thread t2{F()};    // F()() executes in separate thread

    t1.join();
    t2.join();
}  // one bad bug left
```
Note
Make "immortal threads" globals, put them in an enclosing scope, or put them on the free store rather than detach()
.
Don't detach
.
Note
Because of old code and third party libraries using std::thread
, this rule can be hard to introduce.
Enforcement
Flag uses of std::thread:
- Suggest use of gsl::joining_thread or C++20 std::jthread.
- Suggest "exporting ownership" to an enclosing scope if it detaches.
- Warn if it is not obvious whether a thread joins or detaches.
CP.26: Don't detach() a thread
Reason
Often, the need to outlive the scope of its creation is inherent in the thread's task, but implementing that idea by detach makes it harder to monitor and communicate with the detached thread. In particular, it is harder (though not impossible) to ensure that the thread completed as expected or lives for as long as expected.
Example
```
void heartbeat();

void use()
{
    std::thread t(heartbeat);    // don't join; heartbeat is meant to run forever
    t.detach();
    // ...
}
```
This is a reasonable use of a thread, for which detach()
is commonly used.
There are problems, though.
How do we monitor the detached thread to see if it is alive?
Something might go wrong with the heartbeat, and losing a heartbeat can be very serious in a system for which it is needed.
So, we need to communicate with the heartbeat thread (e.g., through a stream of messages or notification events using a condition_variable).
An alternative, and usually superior, solution is to control its lifetime by placing it in a scope outside its point of creation (or activation). For example:
```
void heartbeat();

gsl::joining_thread t(heartbeat);    // heartbeat is meant to run "forever"
```
This heartbeat will (barring error, hardware problems, etc.) run for as long as the program does.
Sometimes, we need to separate the point of creation from the point of ownership:
```
void heartbeat();

unique_ptr<gsl::joining_thread> tick_tock {nullptr};

void use()
{
    // heartbeat is meant to run as long as tick_tock lives
    tick_tock = make_unique<gsl::joining_thread>(heartbeat);
    // ...
}
```
Enforcement
Flag detach().
CP.31: Pass small amounts of data between threads by value, rather than by reference or pointer
Reason
A small amount of data is cheaper to copy and access than to share it using some locking mechanism. Copying naturally gives unique ownership (simplifies code) and eliminates the possibility of data races.
Note
Defining "small amount" precisely is impossible.
Example
```
string modify1(string);
void modify2(string&);

void fct(string& s)
{
    auto res = async(modify1, s);
    async(modify2, s);
}
```
The call of modify1
involves copying two string
values; the call of modify2
does not.
On the other hand, the implementation of modify1
is exactly as we would have written it for single-threaded code,
whereas the implementation of modify2
will need some form of locking to avoid data races.
If the string is short (say 10 characters), the call of modify1
can be surprisingly fast;
essentially all the cost is in the thread
switch. If the string is long (say 1,000,000 characters), copying it twice
is probably not a good idea.
Note that this argument has nothing to do with async
as such. It applies equally to considerations about whether to use
message passing or shared memory.
Enforcement
???
CP.32: To share ownership between unrelated threads use shared_ptr
Reason
If threads are unrelated (that is, not known to be in the same scope or one within the lifetime of the other)
and they need to share free store memory that needs to be deleted, a shared_ptr
(or equivalent) is the only
safe way to ensure proper deletion.
Example
???
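A minimal sketch of the idea (the worker functions are hypothetical): two threads whose lifetimes are unrelated each hold their own shared_ptr copy, and whichever releases the last copy deletes the data:

```
#include <memory>
#include <thread>
#include <vector>

// hypothetical workers; each keeps the data alive through its shared_ptr copy
void consume(std::shared_ptr<std::vector<int>> data)  { /* ... use *data ... */ }
void log_size(std::shared_ptr<std::vector<int>> data) { /* ... use data->size() ... */ }

void start_unrelated_tasks()
{
    auto data = std::make_shared<std::vector<int>>(1000);
    std::thread t1(consume, data);    // the shared_ptr is copied into each thread
    std::thread t2(log_size, data);
    t1.detach();    // detach() only to make the lifetimes unrelated for illustration;
    t2.detach();    // CP.26 discourages it in real code
}
```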
Note
- A static object (e.g. a global) can be shared because it is not owned in the sense that some thread is responsible for its deletion.
- An object on free store that is never to be deleted can be shared.
- An object owned by one thread can be safely shared with another as long as that second thread doesn't outlive the owner.
Enforcement
???
CP.40: Minimize context switching
Reason
Context switches are expensive.
Example
???
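One hedged illustration of the principle: with a mutex-protected queue, draining everything available under a single wakeup amortizes one context switch over many items instead of paying one switch per item (the globals and process() here are placeholders):

```
#include <condition_variable>
#include <deque>
#include <mutex>

std::mutex mtx;
std::condition_variable cond;
std::deque<int> q;                  // placeholder message type

void process(int m) { /* ... */ }   // hypothetical handler

void consumer()
{
    std::unique_lock<std::mutex> lck(mtx);
    while (true) {
        cond.wait(lck, [] { return !q.empty(); });    // one wakeup ...
        while (!q.empty()) {                          // ... drains all pending items
            int m = q.front();
            q.pop_front();
            lck.unlock();     // don't hold the lock while processing (see CP.43)
            process(m);
            lck.lock();
        }
    }
}
```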
Enforcement
???
CP.41: Minimize thread creation and destruction
Reason
Thread creation is expensive.
Example
```
void worker(Message m)
{
    // process
}

void dispatcher(istream& is)
{
    for (Message m; is >> m; )
        run_list.push_back(new thread(worker, m));
}
```
This spawns a thread
per message, and the run_list
is presumably managed to destroy those tasks once they are finished.
Instead, we could have a set of pre-created worker threads processing the messages:

```
Sync_queue<Message> work;

void dispatcher(istream& is)
{
    for (Message m; is >> m; )
        work.put(m);
}

void worker()
{
    for (Message m; m = work.get(); ) {
        // process
    }
}

void workers()  // set up worker threads (specifically 4 worker threads)
{
    joining_thread w1 {worker};
    joining_thread w2 {worker};
    joining_thread w3 {worker};
    joining_thread w4 {worker};
}
```
Note
If your system has a good thread pool, use it. If your system has a good message queue, use it.
Enforcement
???
CP.42: Don't wait without a condition
Reason
A wait
without a condition can miss a wakeup or wake up simply to find that there is no work to do.
Example, bad
```
std::condition_variable cv;
std::mutex mx;

void thread1()
{
    while (true) {
        // do some work ...
        std::unique_lock<std::mutex> lock(mx);
        cv.notify_one();    // wake other thread
    }
}

void thread2()
{
    while (true) {
        std::unique_lock<std::mutex> lock(mx);
        cv.wait(lock);    // might block forever
        // do work ...
    }
}
```
Here, if some other thread consumes thread1's notification, thread2 can wait forever.
Example
```
template<typename T>
class Sync_queue {
public:
    void put(const T& val);
    void put(T&& val);
    void get(T& val);
private:
    mutex mtx;
    condition_variable cond;    // this controls access
    list<T> q;
};

template<typename T>
void Sync_queue<T>::put(const T& val)
{
    lock_guard<mutex> lck(mtx);
    q.push_back(val);
    cond.notify_one();
}

template<typename T>
void Sync_queue<T>::get(T& val)
{
    unique_lock<mutex> lck(mtx);
    cond.wait(lck, [this] { return !q.empty(); });    // prevent spurious wakeup
    val = q.front();
    q.pop_front();
}
```
Now if the queue is empty when a thread executing get()
wakes up (e.g., because another thread has gotten to get()
before it),
it will immediately go back to sleep, waiting.
Enforcement
Flag all waits without conditions.
CP.43: Minimize time spent in a critical section
Reason
The less time is spent with a mutex
taken, the less chance that another thread
has to wait,
and thread
suspension and resumption are expensive.
Example
```
void do_something() // bad
{
    unique_lock<mutex> lck(my_lock);
    do0();  // preparation: does not need lock
    do1();  // transaction: needs locking
    do2();  // cleanup: does not need locking
}
```
Here, we are holding the lock for longer than necessary: We should not have taken the lock before we needed it and should have released it again before starting the cleanup. We could rewrite this to
```
void do_something() // bad
{
    do0();  // preparation: does not need lock
    my_lock.lock();
    do1();  // transaction: needs locking
    my_lock.unlock();
    do2();  // cleanup: does not need locking
}
```
But that compromises safety and violates the (use RAII) rule. Instead, add a block for the critical section:
```
void do_something() // OK
{
    do0();  // preparation: does not need lock
    {
        unique_lock<mutex> lck(my_lock);
        do1();  // transaction: needs locking
    }
    do2();  // cleanup: does not need locking
}
```
Enforcement
Impossible in general.
Flag "naked" lock()
and unlock()
.
CP.44: Remember to name your lock_guards and unique_locks
Reason
An unnamed local object is a temporary that immediately goes out of scope.
Example
```
unique_lock<mutex>(m1);
lock_guard<mutex> {m2};
lock(m1, m2);
```
This looks innocent enough, but it isn't: the first statement declares a default-constructed unique_lock variable named m1 (shadowing the mutex and locking nothing), the second locks m2 only for the lifetime of a temporary that ends at the semicolon, and the third locks both mutexes with no guard to ever release them.
Enforcement
Flag all unnamed lock_guards and unique_locks.
CP.50: Define a mutex together with the data it guards. Use synchronized_value<T> where possible
Reason
It should be obvious to a reader that the data is to be guarded and how. This decreases the chance of the wrong mutex being locked, or the mutex not being locked.
Using a synchronized_value<T>
ensures that the data has a mutex, and the right mutex is locked when the data is accessed.
See the WG21 proposal to add synchronized_value
to a future TS or revision of the C++ standard.
Example
```
struct Record {
    std::mutex m;   // take this mutex before accessing other members
    // ...
};

class MyClass {
    struct DataRecord {
        // ...
    };
    synchronized_value<DataRecord> data; // Protect the data with a mutex
};
```
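A usage sketch, assuming the interface proposed in the WG21 paper (P0290); neither synchronized_value nor its apply() is in the standard yet:

```
// hypothetical: relies on the proposed synchronized_value<T> and apply() (WG21 P0290)
synchronized_value<int> counter{0};

void bump()
{
    // apply() locks the value's internal mutex, runs the callable, then unlocks
    apply([](int& c) { ++c; }, counter);
}
```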
Enforcement
??? Possible?
CP.coro: Coroutines
This section focuses on uses of coroutines.
Coroutine rule summary:
- (CP.51: Do not use capturing lambdas that are coroutines)
- (CP.52: Do not hold locks or other synchronization primitives across suspension points)
- (CP.53: Parameters to coroutines should not be passed by reference)
CP.51: Do not use capturing lambdas that are coroutines
Reason
Usage patterns that are correct with normal lambdas are hazardous with coroutine lambdas. The obvious pattern of capturing variables will result in accessing freed memory after the first suspension point, even for refcounted smart pointers and copyable types.
A lambda results in a closure object with storage, often on the stack, that will go out of scope at some point. When the closure object goes out of scope the captures will also go out of scope. Normal lambdas will have finished executing by this time so it is not a problem. Coroutine lambdas may resume from suspension after the closure object has destructed and at that point all captures will be use-after-free memory access.
Example, Bad
```
int value = get_value();
std::shared_ptr<Foo> sharedFoo = get_foo();
{
    const auto lambda = [value, sharedFoo]() -> std::future<void>
    {
        co_await something();
        // "sharedFoo" and "value" have already been destroyed
        // the "shared" pointer didn't accomplish anything
    };
    lambda();
} // the lambda closure object has now gone out of scope
```
Example, Better
```
int value = get_value();
std::shared_ptr<Foo> sharedFoo = get_foo();
{
    // take as by-value parameter instead of as a capture
    const auto lambda = [](auto sharedFoo, auto value) -> std::future<void>
    {
        co_await something();
        // sharedFoo and value are still valid at this point
    };
    lambda(sharedFoo, value);
} // the lambda closure object has now gone out of scope
```
Example, Best
Use a function for coroutines.
```
std::future<void> Class::do_something(int value, std::shared_ptr<Foo> sharedFoo)
{
    co_await something();
    // sharedFoo and value are still valid at this point
}

void SomeOtherFunction()
{
    int value = get_value();
    std::shared_ptr<Foo> sharedFoo = get_foo();
    do_something(value, sharedFoo);
}
```
Enforcement
Flag a lambda that is a coroutine and has a non-empty capture list.
CP.52: Do not hold locks or other synchronization primitives across suspension points
Reason
This pattern creates a significant risk of deadlocks. Some types of waits will allow the current thread to perform additional work until the asynchronous operation has completed. If the thread holding the lock performs work that requires the same lock then it will deadlock because it is trying to acquire a lock that it is already holding.
If the coroutine completes on a different thread from the thread that acquired the lock, then that is undefined behavior. Even with an explicit return to the original thread, an exception might be thrown before the coroutine resumes, and the result will be that the lock guard is not destructed.
Example, Bad
```
std::mutex g_lock;

std::future<void> Class::do_something()
{
    std::lock_guard<std::mutex> guard(g_lock);
    co_await something();       // DANGER: coroutine has suspended execution while holding a lock
    co_await somethingElse();
}
```
Example, Good
```
std::mutex g_lock;

std::future<void> Class::do_something()
{
    {
        std::lock_guard<std::mutex> guard(g_lock);
        // modify data protected by lock
    }
    co_await something();       // OK: lock has been released before coroutine suspends
    co_await somethingElse();
}
```
Note
This pattern is also bad for performance. When a suspension point is reached, such as co_await, execution of the current function stops and other code begins to run. It may be a long period of time before the coroutine resumes. For that entire duration the lock will be held and cannot be acquired by other threads to perform work.
Enforcement
Flag all lock guards that are not destructed before a coroutine suspends.
CP.53: Parameters to coroutines should not be passed by reference
Reason
Once a coroutine reaches the first suspension point, such as a co_await, the synchronous portion returns. After that point any parameters passed by reference are dangling. Any usage beyond that is undefined behavior which may include writing to freed memory.
Example, Bad
```
std::future<int> Class::do_something(const std::shared_ptr<int>& input)
{
    co_await something();

    // DANGER: the reference to input may no longer be valid and may be freed memory
    co_return *input + 1;
}
```
Example, Good
```
std::future<int> Class::do_something(std::shared_ptr<int> input)
{
    co_await something();

    co_return *input + 1;   // input is a copy that is still valid here
}
```
Note
This problem does not apply to reference parameters that are only accessed before the first suspension point. Subsequent changes to the function may add or move suspension points which would reintroduce this class of bug. Some types of coroutines have the suspension point before the first line of code in the coroutine executes, in which case reference parameters are always unsafe. It is safer to always pass by value because the copied parameter will live in the coroutine frame that is safe to access throughout the coroutine.
Note
The same danger applies to output parameters. (F.20: For "out" output values, prefer return values to output parameters) discourages output parameters. Coroutines should avoid them entirely.
Enforcement
Flag all reference parameters to a coroutine.
CP.par: Parallelism
By "parallelism" we refer to performing a task (more or less) simultaneously ("in parallel with") on many data items.
Parallelism rule summary:
- ???
- ???
- Where appropriate, prefer the standard-library parallel algorithms (see the sketch after this list)
- Use algorithms that are designed for parallelism, not algorithms with unnecessary dependency on linear evaluation
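A hedged sketch of the third bullet: C++17's parallel algorithm overloads take an execution policy; std::for_each and std::sort here are standard, and the data is a placeholder:

```
#include <algorithm>
#include <execution>
#include <vector>

void scale_and_sort(std::vector<double>& v)
{
    // apply an operation to each element, allowing parallel execution
    std::for_each(std::execution::par, v.begin(), v.end(),
                  [](double& d) { d *= 2.0; });

    // request a parallel sort; implementations may fall back to sequential
    std::sort(std::execution::par, v.begin(), v.end());
}
```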
CP.mess: Message passing
The standard-library facilities are quite low-level, focused on the needs of close-to-the-hardware critical programming using threads, mutexes, atomic types, etc.
Most people shouldn't work at this level: it's error-prone and development is slow.
If possible, use a higher level facility: messaging libraries, parallel algorithms, and vectorization.
This section looks at passing messages so that a programmer doesn't have to do explicit synchronization.
Message passing rules summary:
- (CP.60: Use a future to return a value from a concurrent task)
- (CP.61: Use async() to spawn concurrent tasks)
- message queues
- messaging libraries
???? should there be a "use X rather than std::async
" where X is something that would use a better specified thread pool?
??? Is std::async
worth using in light of future (and even existing, as libraries) parallelism facilities? What should the guidelines recommend if someone wants to parallelize, e.g., std::accumulate
(with the additional precondition of commutativity), or merge sort?
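For the std::accumulate question, C++17 already offers a partial answer: std::reduce with a parallel execution policy computes the same result when the combining operation is associative and commutative; a minimal sketch:

```
#include <execution>
#include <numeric>
#include <vector>

double parallel_sum(const std::vector<double>& v)
{
    // std::reduce may reorder and regroup operations, unlike std::accumulate,
    // so the combining operation must be associative and commutative
    return std::reduce(std::execution::par, v.begin(), v.end(), 0.0);
}
```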
CP.60: Use a future to return a value from a concurrent task
Reason
A future
preserves the usual function call return semantics for asynchronous tasks.
There is no explicit locking and both correct (value) return and error (exception) return are handled simply.
Example
???
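A minimal sketch of the intended pattern: the task's value (or exception) comes back through the future exactly as an ordinary function return would:

```
#include <future>
#include <iostream>

int compute() { return 42; }    // stand-in for real work that returns a value or throws

void user()
{
    std::future<int> f = std::async(compute);    // run the task concurrently
    try {
        std::cout << f.get() << '\n';    // the value; a task exception is rethrown here
    }
    catch (...) {
        // handle the error exactly as if compute() had been called directly
    }
}
```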
Note
???
Enforcement
???
CP.61: Use async() to spawn concurrent tasks
Reason
Similar to (R.12), which tells you to avoid raw owning pointers, you should
also avoid raw threads and raw promises where possible. Use a factory function such as std::async
,
which handles spawning or reusing a thread without exposing raw threads to your own code.
Example
```
int read_value(const std::string& filename)
{
    std::ifstream in(filename);
    in.exceptions(std::ifstream::failbit);
    int value;
    in >> value;
    return value;
}

void async_example()
{
    try {
        std::future<int> f1 = std::async(read_value, "v1.txt");
        std::future<int> f2 = std::async(read_value, "v2.txt");
        std::cout << f1.get() + f2.get() << '\n';
    } catch (const std::ios_base::failure& fail) {
        // handle exception here
    }
}
```
Note
Unfortunately, std::async is not perfect. For example, it doesn't use a thread pool, which means that it might fail due to resource exhaustion, rather than queuing up your tasks to be executed later. However, even if you cannot use std::async, you should prefer to write your own future-returning factory function, rather than using raw promises.
Example (bad)
This example shows two different ways to succeed at using std::future, but to fail at avoiding raw std::thread management.
```
void async_example()
{
    std::promise<int> p1;
    std::future<int> f1 = p1.get_future();
    std::thread t1([p1 = std::move(p1)]() mutable {
        p1.set_value(read_value("v1.txt"));
    });
    t1.detach(); // evil

    // packaged_task's constructor takes just a callable, so bind the argument in a lambda
    std::packaged_task<int()> pt2([] { return read_value("v2.txt"); });
    std::future<int> f2 = pt2.get_future();
    std::thread(std::move(pt2)).detach();

    std::cout << f1.get() + f2.get() << '\n';
}
```
Example (good)
This example shows one way you could follow the general pattern set by std::async, in a context where std::async itself was unacceptable for use in production.
```
void async_example(WorkQueue& wq)
{
    std::future<int> f1 = wq.enqueue([] { return read_value("v1.txt"); });
    std::future<int> f2 = wq.enqueue([] { return read_value("v2.txt"); });
    std::cout << f1.get() + f2.get() << '\n';
}
```
Any threads spawned to execute the code of read_value are hidden behind the call to WorkQueue::enqueue. The user code deals only with future objects, never with raw thread, promise, or packaged_task objects.
Enforcement
???
CP.vec: Vectorization
Vectorization is a technique for executing a number of tasks concurrently without introducing explicit synchronization. An operation is simply applied to elements of a data structure (a vector, an array, etc.) in parallel. Vectorization has the interesting property of often requiring no non-local changes to a program. However, vectorization works best with simple data structures and with algorithms specifically crafted to enable it.
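A hedged sketch of the idea: a simple, dependence-free loop over contiguous data that a compiler can typically vectorize with no source changes (the signature is a placeholder):

```
// a, b, c point to contiguous arrays of n floats
void add_arrays(const float* a, const float* b, float* c, int n)
{
    // each iteration is independent of the others, so the compiler is free
    // to process several elements per instruction (SIMD)
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}
```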
Vectorization rule summary:
- ???
- ???
CP.free: Lock-free programming
Synchronization using mutexes and condition_variables can be relatively expensive. Furthermore, it can lead to deadlock. For performance and to eliminate the possibility of deadlock, we sometimes have to use the tricky low-level "lock-free" facilities that rely on briefly gaining exclusive ("atomic") access to memory. Lock-free programming is also used to implement higher-level concurrency mechanisms, such as threads and mutexes.
Lock-free programming rule summary:
- (CP.100: Don't use lock-free programming unless you absolutely have to)
- (CP.101: Distrust your hardware/compiler combination)
- (CP.102: Carefully study the literature)
- how/when to use atomics
- avoid starvation
- use a lock-free data structure rather than hand-crafting specific lock-free access
- (CP.110: Do not write your own double-checked locking for initialization)
- (CP.111: Use a conventional pattern if you really need double-checked locking)
- how/when to compare and swap
CP.100: Don't use lock-free programming unless you absolutely have to
Reason
It's error-prone and requires expert level knowledge of language features, machine architecture, and data structures.
Example, bad
```
extern atomic<Link*> head;             // the shared head of a linked list

Link* nh = new Link(data, nullptr);    // make a link ready for insertion
Link* h = head.load();                 // read the shared head of the list

do {
    if (h->data <= data) break;        // if so, insert elsewhere
    nh->next = h;                      // next element is the previous head
} while (!head.compare_exchange_weak(h, nh));    // write nh to head or to h
```
Spot the bug. It would be really hard to find through testing. Read up on the ABA problem.
Exception
Atomic variables can be used simply and safely, as long as you are using the sequentially consistent memory model (memory_order_seq_cst), which is the default.
Note
Higher-level concurrency mechanisms, such as threads and mutexes, are implemented using lock-free programming.
Alternative: Use lock-free data structures implemented by others as part of some library.
CP.101: Distrust your hardware/compiler combination
Reason
The low-level hardware interfaces used by lock-free programming are among the hardest to implement well and among the areas where the most subtle portability problems occur. If you are doing lock-free programming for performance, you need to check for regressions.
Note
Instruction reordering (static and dynamic) makes it hard for us to think effectively at this level (especially if you use relaxed memory models). Experience, (semi)formal models and model checking can be useful. Testing - often to an extreme extent - is essential. "Don't fly too close to the sun."
Enforcement
Have strong rules for re-testing in place that cover any change in hardware, operating system, compiler, and libraries.
CP.102: Carefully study the literature
Reason
With the exception of atomics and a few other standard patterns, lock-free programming is really an expert-only topic. Become an expert before shipping lock-free code for others to use.
References
- Anthony Williams: C++ Concurrency in Action. Manning Publications.
- Boehm, Adve, "You Don't Know Jack About Shared Variables or Memory Models", Communications of the ACM, Feb 2012.
- Boehm, "Threads Basics", HPL TR 2009-259.
- Adve, Boehm, "Memory Models: A Case for Rethinking Parallel Languages and Hardware", Communications of the ACM, August 2010.
- Boehm, Adve, "Foundations of the C++ Concurrency Memory Model", PLDI 08.
- Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber, "Mathematizing C++ Concurrency", POPL 2011.
- Damian Dechev, Peter Pirkelbauer, and Bjarne Stroustrup: "Understanding and Effectively Preventing the ABA Problem in Descriptor-based Lock-free Designs", 13th IEEE Computer Society ISORC 2010 Symposium, May 2010.
- Damian Dechev and Bjarne Stroustrup: "Scalable Non-blocking Concurrent Objects for Mission Critical Code", ACM OOPSLA'09, October 2009.
- Damian Dechev, Peter Pirkelbauer, Nicolas Rouquette, and Bjarne Stroustrup: "Semantically Enhanced Containers for Concurrent Real-Time Systems", Proc. 16th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (IEEE ECBS), April 2009.
- Maurice Herlihy, Nir Shavit, Victor Luchangco, Michael Spear: "The Art of Multiprocessor Programming", 2nd ed., September 2020.
CP.110: Do not write your own double-checked locking for initialization
Reason
Since C++11, static local variables are initialized in a thread-safe way. When combined with the RAII pattern, static local variables can replace the need for writing your own double-checked locking for initialization. std::call_once can also achieve the same purpose. Use either static local variables of C++11 or std::call_once instead of writing your own double-checked locking for initialization.
Example
Example with std::call_once.
```
void f()
{
    static std::once_flag my_once_flag;
    std::call_once(my_once_flag, []()
    {
        // do this only once
    });
    // ...
}
```
Example with thread-safe static local variables of C++11.
```
void f()
{
    // Assuming the compiler is compliant with C++11
    static My_class my_object; // Constructor called only once
    // ...
}

class My_class
{
public:
    My_class()
    {
        // do this only once
    }
};
```
Enforcement
??? Is it possible to detect the idiom?
CP.111: Use a conventional pattern if you really need double-checked locking
Reason
Double-checked locking is easy to mess up. If you really need to write your own double-checked locking, in spite of the rules ((CP.110: Do not write your own double-checked locking for initialization) and CP.100: Don't use lock-free programming unless you absolutely have to), then do it in a conventional pattern.
The uses of the double-checked locking pattern that are not in violation of (CP.110: Do not write your own double-checked locking for initialization) arise when a non-thread-safe action is both hard and rare, and there exists a fast thread-safe test that can be used to guarantee that the action is not needed, but cannot be used to guarantee the converse.
Example, bad
The use of volatile does not make the first check thread-safe; see also (CP.200: Use volatile only to talk to non-C++ memory):
```
mutex action_mutex;
volatile bool action_needed;

if (action_needed) {
    std::lock_guard<std::mutex> lock(action_mutex);
    if (action_needed) {
        take_action();
        action_needed = false;
    }
}
```
Example, good
```
mutex action_mutex;
atomic<bool> action_needed;

if (action_needed) {
    std::lock_guard<std::mutex> lock(action_mutex);
    if (action_needed) {
        take_action();
        action_needed = false;
    }
}
```
Fine-tuned memory ordering can be beneficial where an acquire load is more efficient than a sequentially-consistent load:
```
mutex action_mutex;
atomic<bool> action_needed;

if (action_needed.load(memory_order_acquire)) {
    lock_guard<std::mutex> lock(action_mutex);
    if (action_needed.load(memory_order_relaxed)) {
        take_action();
        action_needed.store(false, memory_order_release);
    }
}
```
Enforcement
??? Is it possible to detect the idiom?
CP.etc: Etc. concurrency rules
These rules defy simple categorization:
CP.200: Use volatile only to talk to non-C++ memory
Reason
volatile
is used to refer to objects that are shared with "non-C++" code or hardware that does not follow the C++ memory model.
Example
```
const volatile long clock;
```
This describes a register constantly updated by a clock circuit.
clock
is volatile
because its value will change without any action from the C++ program that uses it.
For example, reading clock
twice will often yield two different values, so the optimizer had better not optimize away the second read in this code:
```
long t1 = clock;
// ... no use of clock here ...
long t2 = clock;
```
clock is const because the program should not try to write to clock.
Note
Unless you are writing the lowest level code manipulating hardware directly, consider volatile
an esoteric feature that is best avoided.
Example
Usually C++ code receives volatile
memory that is owned elsewhere (hardware or another language):
```
int volatile* vi = get_hardware_memory_location();
    // note: we get a pointer to someone else's memory here
    // volatile says "treat this with extra respect"
```
Sometimes C++ code allocates the volatile
memory and shares it with "elsewhere" (hardware or another language) by deliberately escaping a pointer:
```
static volatile long vl;
please_use_this(&vl);  // escape a reference to this to "elsewhere" (not C++)
```
Example, bad
volatile
local variables are nearly always wrong -- how can they be shared with other languages or hardware if they're ephemeral?
The same applies almost as strongly to member variables, for the same reason.
```
void f()
{
    volatile int i = 0; // bad, volatile local variable
    // etc.
}

class My_type {
    volatile int i = 0; // suspicious, volatile member variable
    // etc.
};
```
Note
In C++, unlike in some other languages, volatile
has (nothing to do with synchronization).
Enforcement
- Flag volatile T local and member variables; almost certainly you intended to use atomic<T> instead.
- ???
CP.201: ??? Signals
???UNIX signal handling???. Might be worth reminding how little is async-signal-safe, and how to communicate with a signal handler (best is probably "not at all")