Modern C++ Concurrency - How to share data and resources between threads

In this lesson, we will cover the topic of data sharing and resources between threads. Imagine a scenario where an integer o needs to be modified by two threads t1 and t2. If we are not careful in handling this scenario a data race might occur. But what is a data race exactly?

Data Race

A data race occurs when two or more threads access some shared data and at least one of them is modifying such data. Because the threads are scheduled by OS, and scheduling is not under our control, you do not know upfront which thread is going to access the data first. The final result might depend on the order in which threads are scheduled by the OS.

Race conditions occur typically when an operation, in order to be completed, requires multiple steps or sub-operations, or the modification of multiple data. Since this sub-operations end up being executed by the CPU in different instructions, other threads can potentially mess up with the state of the data while the other's thread operation is still ongoing.

Most of the complexity of concurrent programming comes from avoiding race conditions.

As en example consider the following scenario:

#include <iostream>
#include <thread>

using namespace std;

int shared_data = 0;

void function() { shared_data++; }

int main() {
  thread t1(function);
  thread t2(function);
  t1.join();
  t2.join();
  cout << shared_data << endl;
}

thread t1 and t2 simply increment shared_data by 1. You would expect the final result of shared_data to always be 2.

If you run this code you will most likely get a result 2 most of the time. But if you run this code multiple times you might find it’s value to be 1 at the end! These are the all different values that I obtain out of 10000 executions:

[knotman@archazzo] →
for i in `seq 1 10000`
  do ./tutorial4
done | sort | uniq
1
2

As you can see the same code return three different values i.e. 1 and 2.

How is that possible? Well, you have to understand that the increment operations is not atomic. When shared_data = shared_data +1 gets executed, the OS first retrieve the value of shared_dataand puts it into a CPU register. Then finally, itincremented by 1. Those are three different operations, as you can see from the assembly generated by g++ -S for function()

_Z8functionv:
....
  movl  shared_data(%rip), %eax
  addl  $1, %eax
  movl  %eax, shared_data(%rip)
....
  ret

The OS can decide to stop the execution at any point! What happens if the OS decided to interrupt the execution of t1 right after the value is retrieved and put into the register and before the increment is performed and start running the other thread? Both t1 and t2 will read the same value of shared_data = 0 and both will end up writing back into it1! This type of data race is, for obvious reasons, called lost update.

How to avoid race conditions - Locking

The simplest solution to this problem is to somehow force the execution of multiple instructions operations to be atomic. In other words, when such atomic operations are performed, no other threads can mess up with the data until the operation is completed in its entirety. Another term used to describe this data protection mechanism is synchronization. Threads synchronize their operations s.t. they do not step into each other’s toes.

In programming terms, what the guarding mechanism that I just described translates to the usage of mutex.

Mutex

A mutex is a synchronization mechanism and stands for mutual exclusion. Its usage is simple and yet elegant. Before you enter a critical section (the set of sub-operations that need to be atomically executed), a thread locks the mutex if the mutex is not already locked otherwise it waits, until the mutex becomes lockable again. This way, we are sure that only the thread that successfully acquired the mutex can operate on the critical section of the code and modify the critical data.

Mutex in Modern C++

In C++ we can construct a std::mutex object and acquire or release it using its member functions std::mutex::lock() and std::mutex::unlock(),

In a nutshell, what is needed to be done in order to protect our critical section is to wrap our code in a pair of lock() and unlock() as in the following example:

std::mutex mtx;
void fun(){
  do_stuff();

  //critical section begins
  mtx.lock()
  op1();
  op2();
  mtx.unlock()
  //critical section ends

  do_other_stuff();
}

Having in mind what we have just learned, let’s rewrite our incremement by 1 example above with a mutex so we always get the correct result i.e. 2. We need to modify function() and protect the increment operation by using lock() and release the mutex when the increment is finally performed and the result is safely written back into memory. This way other threads will always read the correct value and no update will be lost! Great!

#include <iostream>
#include <thread>
#include <mutex>

using namespace std;

int shared_data = 0;

std::mutex mtx;

void function() { 
  mtx.lock();
  shared_data++;
  mtx.unlock();
}

int main() {
  thread t1(function);
  thread t2(function);
  t1.join();
  t2.join();
  cout << shared_data << endl;
}

If we try to execute the code above now, we always get 2 as result, no surprises anymore.

[knotman@archazzo]
→ g++ -O0 -std=c++17  -Wall  -Wextra -pthread  -o tutorial4_mutex tutorial4_mutex.cpp

[knotman@archazzo] →
for i in `seq 1 10000`
  do ./tutorial4_mutex
done | sort | uniq
2

Conclusions

Mutexes are a useful tool on which we can always rely on protecting our data from race conditions but do not get too excited though. This way of synchronizing comes with its problem. Two of the most common are:

  • Performance: only one thread at the time will execute the critical section. We are serializing a portion of our code! This can have a huge impact on performance. See Ahmdal’s law which describes what is the maximum theoretical speedup obtainable when we serialize a portion B of the code (In short, even when a small percentage of our code is serialized things do not scale well), see the following image.
Ahmdal's Law
  • Deadlock: when two threads are waiting on two mutexs that can only be released if their critical section is executed (which cannot because they are waiting for the mutexs to be unlocked).

References

cppreference.com, std::move

Wikipedia, Ahmdal’s law

wlu.edu, Ahmdal’s law

Be the first to leave a comment. Don’t be shy.

Join the Discussion

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>