My previous post was exclusively in response to this question:
| can the order of the data store operations (marked with memory_order_relaxed) thus be reordered by the compiler? |
In other words, my response was "yes: not only may the compiler reorder those writes, but the CPU may reorder them as well".
The write with std::memory_order_release is different, because it compiles to a code path that contains a memory barrier, which the compiler does understand and does honor.
| Can you give me an example of what you mean by this? I'm trying to wrap my head around what you said here. |
It's really nothing very complex. If the machine code in thread A contains
|
mem[0] = 42;
mem[1] = mem[0];
mem[2] = 77;
|
the code will behave as expected.
If you have two threads running in parallel such that their instructions are scheduled like this:
|
mem[0] = 42; //Thread A
mem[1] = mem[0]; //Thread B
mem[2] = 77; //Thread A
|
the value of mem[1] will be uncertain, because the propagation of effects across threads is non-deterministic unless the code asks for determinism by using a memory barrier. For example, thread B might mistakenly believe that the value of mem[0] in its core-specific cache is up-to-date, even though thread A has already committed a newer value to RAM, or thread A might have delayed sending the value for a few cycles.