Multi-threading performance influence in effect

> The loop can still be optimised away.
I meant, volatile is desirable likely when we need the value back, as in the example with the referenced sum. Otherwise why do you care about a local variable being taken into consideration or not, when the compiler does better without it?
> volatile is desirable likely when we need the value back

Volatile objects may be required when they are involved in communication with signal or interrupt handlers. They are inappropriate in almost every other situation.


> Otherwise why do you care about a local variable ...

Normally, we would want the optimiser to do its job; to perform code transformations that do not change the observable behaviour of the program. We are not concerned about how the computation of a value was performed, as long as we get (the program behaves as if we got) the right value.
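
As a small illustration of the as-if rule, here is a minimal sketch (the function name `sum_below` is made up for this example). The loop has no observable side effects, so an optimiser is free to replace it with a closed-form expression, or with a constant when the argument is known at compile time. All we can rely on is the returned value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sums the integers 0, 1, ..., n-1 with a plain loop.
// Only the returned value is observable, so under the as-if rule the
// compiler may compute it in any way it likes (or at compile time).
std::uint64_t sum_below(std::uint64_t n)
{
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < n; ++i) sum += i;
    return sum;
}
```

Whether the loop survives in the generated code depends on the compiler and the optimisation level; the program behaves the same either way.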
1) Supposedly the volatile specifier is only used for variables in particular types of programs, as you mentioned, when there are handlers (outside the compiler's scope), and I agree with that. But can you name the category of such programs, please? I don't guess that concurrency/multi-threading is the only one area for that. I wish to be able to recognize situations in my programs where a variable or set of variables needs to be volatile.

2) Your code is a good example of using multi-threading without bringing locks and mutexes into play.
Is it the right approach to use std::async and futures only where we're using multi-threading (concurrency) *WITHOUT* shared data, as in the current example, and if there's some shared data we still need to consider locks, mutex and condition variable to handle it?

3) > assert( result = N * (N+1) / 2 ) ; // sanity check
What does this assertion do? As far as I know, assert has been used to check an equivalent, but there you apply an *assignment*. So presumably you are checking whether the assignment was done correctly. Right?
> But can you name the category of such programs

volatile objects are typically used when they are shared between normal program code and:
1. Code in a signal handler
2. An interrupt service routine
3. Memory-mapped i/o devices
etc.


> I don't guess that concurrency/multi-threading is the only one area for that.

Normal multithreaded code is *not* an area for the use of the volatile qualifier.

volatile access does not establish inter-thread synchronization. ...
Standard volatile semantics are not applicable to multithreaded programming
https://en.cppreference.com/w/cpp/atomic/memory_order#Relationship_with_volatile
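
For a shared counter, std::atomic is the tool that volatile is not. This is a minimal sketch; the function name `count_with_atomics` and its parameters are invented for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical example: several threads increment one shared counter.
// std::atomic makes each increment an indivisible operation without a data
// race; a plain or volatile int here would still be a data race.
int count_with_atomics(int n_threads, int increments_per_thread)
{
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t)
        threads.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : threads) th.join();   // join before reading the total
    return counter.load();
}
```

Relaxed ordering is enough in this sketch because only the final total is read, and only after every thread has been joined.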



> if there's some shared data we still need to consider locks, mutex and condition variable to handle it?

Yes (or use atomic operations). This would be required no matter how the multithreading is realised: with std::thread, std::async, native OS threads, whatever...

If a data race occurs, the behaviour of the program is undefined.
https://en.cppreference.com/w/cpp/language/memory_model#Threads_and_data_races
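
A minimal sketch of protecting shared data when the tasks are launched with std::async (the function name `collect_task_ids` is made up for this example):

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <vector>

// Hypothetical example: tasks launched with std::async append to one shared
// vector. The std::lock_guard ensures only one task touches it at a time.
std::vector<int> collect_task_ids(int n_tasks)
{
    std::vector<int> ids;
    std::mutex ids_mutex;
    std::vector<std::future<void>> futures;
    for (int id = 0; id < n_tasks; ++id)
        futures.push_back(std::async(std::launch::async, [&ids, &ids_mutex, id] {
            std::lock_guard<std::mutex> lock(ids_mutex);
            ids.push_back(id);
        }));
    for (auto& f : futures) f.get();   // wait for every task to finish
    return ids;
}
```

The order of the ids in the result is unspecified, since it depends on how the tasks are scheduled. Without the lock, the concurrent push_back calls would be a data race.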



> assert has been used to check an equivalent, but there you apply an *assignment*.

There is a typo in the original code; it should have been an equality comparison instead of an assignment.
assert( result == N * (N+1) / 2 ) ; // sanity check
volatile is needed in this program:
#include <csignal>
#include <cstdio>

/* volatile */ std::sig_atomic_t x = 0; 
void sigint_handler(int) { ::x = 1; }

int main()
{
  std::signal(SIGINT, ::sigint_handler);
  std::printf("%d ", static_cast<int>(x));
  std::raise(SIGINT); 
  // Compiler does not understand that std::raise(SIGINT) will 
  // call sigint_handler and modify x
  // 
  // Without `volatile` the compiler may optimize out the second access to x
  // and print 0 0 instead of the "expected" 0 1
  std::printf("%d\n", static_cast<int>(x));
}


Demo here (try un-commenting volatile):
https://godbolt.org/z/zGsha48fv
frek wrote:
3) > assert( result = N * (N+1) / 2 ) ; // sanity check
What does this assertion do? As far as I know, assert has been used to check an equivalent, but there you apply an *assignment*. So presumably you are checking whether the assignment was done correctly. Right?


Apart from the "=" needing replacing by "==" (which has already been noted - twice) the analytical result does not correspond to the terms being summed.

If you sum the numbers 0,...,N-1 (which is currently being done) you will get (N-1)*N/2
If you sum the numbers 1,...,N (which would make more sense) you will get N*(N+1)/2

At the moment the sum formula doesn't correspond to the sequence being summed. So, choose whichever summation you wish to do and set the analytical expression for the result accordingly.
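
The two pairings can be checked side by side. This is only a sketch; the function names are invented for illustration.

```cpp
#include <cassert>

// Sum of 0, 1, ..., n-1 (half-open range [0, n)).
long long sum_zero_to_n_minus_1(long long n)
{
    long long sum = 0;
    for (long long i = 0; i < n; ++i) sum += i;
    return sum;
}

// Sum of 1, 2, ..., n (closed range [1, n]).
long long sum_one_to_n(long long n)
{
    long long sum = 0;
    for (long long i = 1; i <= n; ++i) sum += i;
    return sum;
}

// Each loop is checked against its own closed form.
void check_sums(long long n)
{
    assert(sum_zero_to_n_minus_1(n) == (n - 1) * n / 2);
    assert(sum_one_to_n(n) == n * (n + 1) / 2);
}
```

Swapping the two formulas makes both assertions fail for any n > 0, because the two sums differ by exactly n.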
> 1. Code in a signal handler, 2. An interrupt service routine, 3. Memory-mapped i/o devices
These are hard for me to work on for the time being, including the code mbozzi wrote. But by "typically used when they are shared between normal program code", do you mean when a variable is sent or returned from one function to another, similar to our code?

I just want to know where in the code I should suspect that the compiler may optimize out some variable when that is not desired, so that I can make that variable volatile.

@lastchance
Yeah, I assume it should have been (N-1)*N/2
> by "typically used when they are shared between normal program code" do you mean when a variable is sent or returned from one function to another, similar to our code?

No. You're not reading properly. What JLBorges actually said was:

JLBorges wrote:
typically used when they are shared between normal program code and: [a list of other things]


It's when the memory that stores the variable is accessed by something outside the C++ code, which means that the compiler can't know about it, so can't consider it when deciding what to optimize. As in all of the cases listed by JLBorges.
Let me ask the question once again. JL said:
volatile objects are typically used when they are shared between normal program code and:
1. Code in a signal handler
2. An interrupt service routine
3. Memory-mapped i/o devices
etc.

1) What does "normal program code" mean, please?
2) And which of the above cases does the code he wrote here fall under (the code in which volatile has to be added to the variable sum to prevent the compiler from optimizing it away)?

My last question is:
3) Can we rely on the number returned by std::thread::hardware_concurrency and launch at most that number of threads if we aim to use multi-threading?
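
A note on question 3: the value returned by std::thread::hardware_concurrency is only a hint, and it may be 0 when the implementation cannot determine it. A minimal sketch of guarding against that case follows; the helper name `choose_thread_count` and the fallback of 2 are assumptions made for this example, not a recommendation.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: picks a worker count from the hint, falling back to
// `fallback` when the implementation reports 0 (meaning "unknown").
unsigned choose_thread_count(unsigned fallback = 2)
{
    const unsigned hint = std::thread::hardware_concurrency();
    return hint != 0 ? hint : fallback;
}
```

Whether launching exactly that many threads is best depends on the workload; the hint only describes what the hardware can run concurrently.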
Also, I'm rather confused about:
first_val += N_PER_THREAD;
futures.push_back(std::async(std::launch::async, accumulate_range, first_val, first_val + N_PER_THREAD));

and guess it should have been:
futures.push_back(std::async(std::launch::async, accumulate_range, first_val, first_val + N_PER_THREAD));
first_val += N_PER_THREAD;
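
Which ordering is right depends on where first_val starts, and the full loop is not shown in this part of the thread. Here is a sketch under stated assumptions: accumulate_range sums the half-open range [first, last), first_val starts at 0, and the chunk size is passed in as `n_per_thread` instead of the constant N_PER_THREAD. The wrapper name `parallel_sum` is made up. Under those assumptions, launching the task first and advancing afterwards covers every chunk exactly once.

```cpp
#include <cassert>
#include <future>
#include <vector>

// Assumed signature: sums the half-open range [first, last).
long long accumulate_range(long long first, long long last)
{
    long long sum = 0;
    for (long long i = first; i < last; ++i) sum += i;
    return sum;
}

// Splits [0, n_threads * n_per_thread) into equal chunks, one task each.
long long parallel_sum(int n_threads, long long n_per_thread)
{
    std::vector<std::future<long long>> futures;
    long long first_val = 0;   // assumption: the first chunk starts at 0
    for (int t = 0; t < n_threads; ++t) {
        futures.push_back(std::async(std::launch::async, accumulate_range,
                                     first_val, first_val + n_per_thread));
        first_val += n_per_thread;   // advance only after launching the chunk
    }
    long long total = 0;
    for (auto& f : futures) total += f.get();
    return total;
}
```

If the increment came first with first_val starting at 0, the chunk [0, n_per_thread) would be skipped and the last task would run one chunk past the intended end.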