C++11 added support for bidirectional fences in multi-threaded applications. In modern C++ development, fences are synchronization primitives that act as memory barriers between threads; a fence can have acquire semantics, release semantics, or both. In this post, we explain what fences are and how we can use them.
What are bidirectional fences in C++?
A fence is a primitive that enforces ordering between preceding loads or stores and subsequent loads or stores. A fence can have acquire semantics, release semantics, or both. A fence with acquire semantics is called an acquire fence, and a fence with release semantics is called a release fence. A fence with both is called a full fence.
In modern C++, fences are provided by std::atomic_thread_fence. It acts as a memory barrier in multi-threaded code and establishes synchronization and ordering constraints between threads. In other words, std::atomic_thread_fence establishes memory synchronization ordering of non-atomic and relaxed atomic accesses without an associated atomic operation. The threads still need a shared atomic variable, such as a flag accessed with relaxed ordering, to synchronize through.
How to use bidirectional fences in C++?
Fences constrain how load and store operations may be reordered. There are four kinds of ordering, each named after the pair of operations involved:
- LoadLoad : A load followed by a load
- LoadStore : A load followed by a store
- StoreLoad : A store followed by a load
- StoreStore : A store followed by a store
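Of the four orderings above, StoreLoad is commonly described as the one that acquire and release fences alone do not provide; a sequentially consistent fence is needed for it. The following is a minimal sketch under that assumption. The names `flag_a`, `flag_b`, and `store_load_demo` are hypothetical and chosen only for illustration. Each thread stores to its own flag, issues a `memory_order_seq_cst` fence, and then loads the other thread's flag. With both fences in place, at least one thread should observe the other's store.

```cpp
#include <atomic>
#include <thread>

// Hypothetical flags for illustration; each thread writes one and reads the other.
std::atomic<int> flag_a{0};
std::atomic<int> flag_b{0};

// Returns true if at least one thread observed the other thread's store.
bool store_load_demo() {
    flag_a.store(0, std::memory_order_relaxed);
    flag_b.store(0, std::memory_order_relaxed);
    int seen_by_1 = 0;
    int seen_by_2 = 0;

    std::thread t1([&seen_by_1] {
        flag_a.store(1, std::memory_order_relaxed);           // store
        std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad barrier
        seen_by_1 = flag_b.load(std::memory_order_relaxed);   // load
    });
    std::thread t2([&seen_by_2] {
        flag_b.store(1, std::memory_order_relaxed);           // store
        std::atomic_thread_fence(std::memory_order_seq_cst);  // StoreLoad barrier
        seen_by_2 = flag_a.load(std::memory_order_relaxed);   // load
    });

    t1.join();
    t2.join();
    return seen_by_1 == 1 || seen_by_2 == 1;
}
```

Calling `store_load_demo()` from `main` should return true on every run. If the two fences are weakened to acquire or release, the outcome where both threads read 0 is no longer ruled out.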
Here is the declaration of std::atomic_thread_fence, which is defined in the <atomic> header:
```cpp
extern "C" void atomic_thread_fence( std::memory_order order ) noexcept;
```
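Here are two simple examples of thread functions that issue a release fence and an acquire fence:
```cpp
#include <atomic>

void myf1()
{
    // Release fence: earlier writes are ordered before later atomic stores
    std::atomic_thread_fence(std::memory_order_release);
}
```
```cpp
#include <atomic>

void myf2()
{
    // Acquire fence: later reads are ordered after earlier atomic loads
    std::atomic_thread_fence(std::memory_order_acquire);
}
```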
According to the open-std.org document, this operation has different effects depending on the value of order:
- if order == memory_order_relaxed, it has no effects;
- if order == memory_order_acquire || order == memory_order_consume, it is an acquire fence;
- if order == memory_order_release, it is a release fence;
- if order == memory_order_acq_rel, it is both an acquire fence and a release fence (full fence);
- if order == memory_order_seq_cst, it is a sequentially consistent acquire and release fence.
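The mapping in the list above can be written down as a small lookup function. This is only an illustrative sketch: `describe_fence` is a hypothetical helper, not part of the standard library, and the strings it returns simply restate the list.

```cpp
#include <atomic>
#include <string>

// Hypothetical helper: names the kind of fence that
// std::atomic_thread_fence(order) acts as, following the list above.
std::string describe_fence(std::memory_order order) {
    switch (order) {
        case std::memory_order_relaxed:
            return "no effect";
        case std::memory_order_consume:
        case std::memory_order_acquire:
            return "acquire fence";
        case std::memory_order_release:
            return "release fence";
        case std::memory_order_acq_rel:
            return "acquire and release fence";
        case std::memory_order_seq_cst:
            return "sequentially consistent acquire and release fence";
    }
    return "unknown";
}
```

For example, `describe_fence(std::memory_order_consume)` returns the same text as `describe_fence(std::memory_order_acquire)`, because both orders make the fence an acquire fence.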
Is there a full C++ example of how to use bidirectional fences in C++?
Let’s assume we have two threads: the first writes to a shared array and prints "R" for each element, and the second prints "A".
```cpp
#include <cstdlib>
#include <iostream>
#include <thread>

int x[500];

void myf1()
{
    for (int i = 0; i < 500; i++)
    {
        x[i] = i;
        std::cout << "R";
    }
}

void myf2()
{
    for (int i = 0; i < 500; i++)
    {
        std::cout << "A";
    }
}

int main()
{
    std::thread t1(myf1);
    std::thread t2(myf2);

    t1.join();
    t2.join();

    system("pause"); // Windows-only: waits for a key press
}
```
The R and A characters interleave differently on each run, for example:
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAARRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA |
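In the example above, we want the first thread to finish all of its operations before the second thread proceeds. We can achieve this with std::atomic_thread_fence: the first thread issues a release fence and then sets a flag, and the second thread waits for the flag and then issues an acquire fence, as below: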
```cpp
#include <atomic>
#include <cstdlib>
#include <iostream>
#include <thread>

int x[500];
std::atomic<bool> done{false};

void myf1()
{
    for (int i = 0; i < 500; i++)
    {
        x[i] = i;
        std::cout << "R";
    }
    // Release fence: the writes above are ordered before the store to done
    std::atomic_thread_fence(std::memory_order_release);
    done.store(true, std::memory_order_relaxed);
}

void myf2()
{
    while (!done.load(std::memory_order_relaxed))
    {
        // do work unrelated to the shared data
    }
    // Acquire fence: pairs with the release fence in myf1
    std::atomic_thread_fence(std::memory_order_acquire);
    for (int i = 0; i < 500; i++)
    {
        std::cout << "A";
    }
}

int main()
{
    std::thread t1(myf1);
    std::thread t2(myf2);

    t1.join();
    t2.join();

    system("pause"); // Windows-only: waits for a key press
}
```
Here is the output:
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAPress any key to continue . . . |
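In the example above, the second thread waits for the flag but never reads the array. The sketch below is a variation in which the second thread does read the data after its acquire fence, so the effect of the fence pair on non-atomic memory is visible. The names `payload`, `ready`, and `produce_and_consume` are hypothetical and used only for illustration.

```cpp
#include <atomic>
#include <thread>

int payload[500];                // plain, non-atomic data
std::atomic<bool> ready{false};  // flag accessed only with relaxed ordering

// Writes 0..499 in one thread, sums the values in another, returns the sum.
long produce_and_consume() {
    ready.store(false, std::memory_order_relaxed);
    long sum = 0;

    std::thread producer([] {
        for (int i = 0; i < 500; i++) {
            payload[i] = i;
        }
        // Release fence: the writes to payload are ordered before the flag store
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&sum] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the reads of payload are ordered after the flag load
        std::atomic_thread_fence(std::memory_order_acquire);
        for (int i = 0; i < 500; i++) {
            sum += payload[i];
        }
    });

    producer.join();
    consumer.join();
    return sum;
}
```

Because the release fence and the acquire fence synchronize through `ready`, the consumer sees every element the producer wrote, and `produce_and_consume()` returns 124750, the sum of 0 through 499. Without the fences, the reads of `payload` would be a data race.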
C++ also provides another fence, std::atomic_signal_fence, which synchronizes between a signal handler and code running on the same thread. This will be explained in another post.
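As a brief preview, here is a minimal sketch of how std::atomic_signal_fence might be used, assuming that `std::atomic<int>` and `std::atomic<bool>` are lock-free on the target platform so that they are safe to touch inside a signal handler. The names `sig_payload`, `sig_ready`, `sig_seen`, `on_signal`, and `run_signal_demo` are hypothetical, and SIGINT is used only because it is available on every platform.

```cpp
#include <atomic>
#include <csignal>

// Assumed to be lock-free on the target platform.
std::atomic<int> sig_payload{0};
std::atomic<bool> sig_ready{false};
std::atomic<int> sig_seen{-1};

extern "C" void on_signal(int) {
    if (sig_ready.load(std::memory_order_relaxed)) {
        // Acquire signal fence: pairs with the release signal fence below
        std::atomic_signal_fence(std::memory_order_acquire);
        sig_seen.store(sig_payload.load(std::memory_order_relaxed),
                       std::memory_order_relaxed);
    }
}

// Installs the handler, publishes a value, raises the signal on this thread,
// and returns what the handler observed.
int run_signal_demo() {
    std::signal(SIGINT, on_signal);

    sig_payload.store(42, std::memory_order_relaxed);
    // Release signal fence: keeps the compiler from moving the payload
    // store after the flag store
    std::atomic_signal_fence(std::memory_order_release);
    sig_ready.store(true, std::memory_order_relaxed);

    std::raise(SIGINT);  // the handler runs on this same thread
    return sig_seen.load(std::memory_order_relaxed);
}
```

Unlike std::atomic_thread_fence, the signal fence only restricts compiler reordering; it is not meant for ordering between different threads.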
For more information about bidirectional fences, please see the Bidirectional Fences proposal document.
C++ Builder is the easiest and fastest C and C++ IDE for building simple or professional applications on the Windows, MacOS, iOS & Android operating systems. It is also easy for beginners to learn with its wide range of samples, tutorials, help files, and LSP support for code. RAD Studio’s C++ Builder version comes with the award-winning VCL framework for high-performance native Windows apps and the powerful FireMonkey (FMX) framework for cross-platform UIs.
There is a free C++ Builder Community Edition for students, beginners, and startups; it can be downloaded from here. For professional developers, there are Professional, Architect, and Enterprise versions.