I have a program written in C++. The task it performs is "embarrassingly parallel", so I should be getting a pretty good performance gain from multithreading compared to a single-threaded run.
I am using thread_group to create the threads. All the threads call the same member function of the same class. At the end I use join_all to wait until all threads finish.
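For reference, the setup looks roughly like the sketch below. This is only an illustration, not my actual code: the names `Worker`, `process`, and `run_workers` are placeholders, the loop inside `process` is dummy work, and the sketch uses `std::thread` with an explicit join loop in place of thread_group and join_all so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real class; the real work is more involved.
class Worker {
public:
    // Every thread calls this same member function, each with its own index.
    void process(std::size_t index, std::vector<long>& results) const {
        long sum = 0;
        for (long i = 0; i < 1000; ++i) {
            sum += i;  // dummy workload
        }
        results[index] = sum;  // each thread writes only to its own slot
    }
};

// Start n_threads threads that all call Worker::process, then wait for them.
std::vector<long> run_workers(std::size_t n_threads) {
    assert(n_threads > 0);
    Worker worker;
    std::vector<long> results(n_threads, 0);
    std::vector<std::thread> threads;
    threads.reserve(n_threads);
    for (std::size_t i = 0; i < n_threads; ++i) {
        threads.emplace_back(&Worker::process, &worker, i, std::ref(results));
    }
    // Assumed equivalent of join_all: block until every thread has finished.
    for (auto& t : threads) {
        t.join();
    }
    return results;
}
```

Calling `run_workers(4)` starts four threads on the same `Worker` object and returns once all of them have been joined.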
The problem is that this version does not scale well. If I instead let each thread launch an executable that does the same work, the performance scales very well.
I don't know why calling the function from the threads performs worse than launching the executable, which runs the work in separate processes instead of threads.