Beginner Posix/pthread Question

Hi all,
I am trying to figure out how to use pthreads for some of my work. The usage seems fairly clear from the online tutorials I have had a look at. However, I have a more basic question.

As I understand it, the user simply creates the threads and the system attempts to execute them in parallel. There is nothing that stops me from creating many more threads than the number of processors available.

Say I have a 4-processor system and want to execute the same function func() on the 4 different processors. There are two ways I can do this. One is to create 4 threads using pthread_create() and use pthread_join() in main() to wait for them to complete. The other is to use pthread_create() to run func() in 3 threads, call func() a 4th time directly in main(), and then use pthread_join() to wait for the 3 threads.

Which option is better? My concern is that main() is itself a thread, which might be put to better use than just creating other threads and waiting for them to finish.
There is no guarantee that a given thread will be scheduled on a given processor. The OS decides what's best and deals with SMP/NUMA issues, resourcing all the other applications/drivers already running, memory issues and so on.

The first of your methods is preferred. Don't think about multithreading as running something on a processor. Instead, try to think of solving problems in a parallel way. If you focus on the application domain, while just keeping an eye on the physical, you'll be better off. Parallel programming is hard, although threads and so on are easy enough to understand on their own, so don't make it harder for yourself.
Keep in mind that the CPU count merely says how much actual work you'll be able to do at the same time. Threads in wait state do no work. The OS allocates no CPU time for them until they stop waiting. Thus, you can create as many threads as you want and have them waiting and it'll cost you nothing in terms of computational power (this is the basic principle behind thread pools).
The only extra cost of having the main thread waiting is that you'll have to create one extra thread, which means reserving one more stack; typically on the order of a few MiB of address space (the default is platform-dependent). It's not outrageous, so really just use whichever is easier.
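To make the two options from the original question concrete, here is a minimal sketch of both. The Task struct, the body of func() and the names run_option1() and run_option2() are made up for illustration; only pthread_create() and pthread_join() come from the discussion above.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the poster's func(): it only records that it ran.
struct Task {
    int id;
    bool done;
};

void *func(void *arg) {
    Task *task = static_cast<Task *>(arg);
    task->done = true;  // placeholder for the real work
    return nullptr;
}

static int count_done(const std::vector<Task> &tasks) {
    int finished = 0;
    for (const Task &t : tasks) finished += t.done ? 1 : 0;
    return finished;
}

// Option 1: n worker threads, the calling thread only waits.
int run_option1(int n) {
    assert(n > 0);
    std::vector<Task> tasks(n);  // value-initialized: id 0, done false
    std::vector<pthread_t> threads(n);
    int created = 0;
    for (int i = 0; i < n; ++i) {
        tasks[i].id = i;
        if (pthread_create(&threads[i], nullptr, func, &tasks[i]) != 0) break;
        ++created;
    }
    for (int i = 0; i < created; ++i) pthread_join(threads[i], nullptr);
    return count_done(tasks);  // number of tasks that actually ran
}

// Option 2: n - 1 worker threads, the calling thread runs the last task itself.
int run_option2(int n) {
    assert(n > 0);
    std::vector<Task> tasks(n);
    std::vector<pthread_t> threads(n - 1);
    int created = 0;
    for (int i = 0; i < n - 1; ++i) {
        tasks[i].id = i;
        if (pthread_create(&threads[i], nullptr, func, &tasks[i]) != 0) break;
        ++created;
    }
    tasks[n - 1].id = n - 1;
    func(&tasks[n - 1]);  // the caller does one share of the work
    for (int i = 0; i < created; ++i) pthread_join(threads[i], nullptr);
    return count_done(tasks);
}
```

Both run_option1(4) and run_option2(4) end up running func() four times; the only difference is whether the calling thread sleeps in pthread_join() or does one share of the work first.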
Thanks for the reply. For the moment, I have a two-processor system. I was trying to see if I can achieve any speedup for some very simple programs from a tutorial. With 2 threads I could get a 1.5 times speedup. This got me wondering if some of the system resources are being wasted. Perhaps not.
Sometimes, how long the function runs for can be a limiting factor. If just creating the thread takes a significant amount of work compared to the function, you'll see you don't gain as much from parallelizing. Add to this that the threads won't run exactly parallel. For example, thread 1 may start doing useful work only once thread 0 has already done half its workload:

Original workload |--------------------------------------------------------------|

Ideal (100% speedup):
Thread 0 |------------------------------|
Thread 1 |------------------------------|

Realistic (33% speedup):
Thread 0 |------------------------------|
Thread 1                 |------------------------------|

Worst case (no speedup or possible slowdown):
Thread 0 |------------------------------|
Thread 1                                 |------------------------------|
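The arithmetic behind those three diagrams can be written down as a toy model. This is only a sketch under simplifying assumptions (the work splits evenly and each thread starts a fixed delay after the previous one); modeled_speedup() and its parameters are names invented here.

```cpp
#include <cassert>

// Toy model of the diagrams above. `work` is the serial running time, split
// evenly over `threads`; thread k starts `stagger` time units after thread k-1.
// Returns serial time divided by wall-clock time (2.0 means "100% speedup").
double modeled_speedup(double work, int threads, double stagger) {
    assert(work > 0 && threads > 0 && stagger >= 0);
    double per_thread = work / threads;
    // The last thread starts latest, so it determines when everything is done.
    double wall = (threads - 1) * stagger + per_thread;
    return work / wall;
}
```

With 60 units of work on 2 threads, a stagger of 0 gives 2.0, a stagger of 15 gives about 1.33 (the "33% speedup" case), and a stagger of 30 gives 1.0 (no speedup).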
Yes, the function has to be sufficiently expensive. Otherwise threads seem to make it slower.
If they're just sat there issuing I/O requests, your app won't go any faster; slower perhaps.
in general, multithreading won't help you unless your application is CPU-bound

as kbw pointed out, if your program is I/O-bound, adding multithreading could actually slow things down

if you have not seen this yet, please read it carefully:

http://pl.atyp.us/content/tech/servers.html

he points out how, if you are not careful, context switches could often kill the benefits of multithreading
Just to follow up on this discussion, how many threads do you recommend I create for a given application? Assume that the size of the problem is sufficiently large and the threads do not need to communicate with each other.

I have a large array of data structures and I am using threads to perform identical operations on different portions of the array. I am interested in benchmarking the achievable speedups. I am hoping to understand what speedups are achievable before I go ahead and create a real application.

Now, I have a 4-processor system. With 4 threads I get an almost 3.5 times speedup, which is not bad. But the code with 8 threads is a little faster than with 4 threads on the same system. This is something I did not expect. Note that the threads just do number crunching and do not require any memory of their own or any communication with each other.

Is there a rule of thumb for the number of threads that one should create?
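For reference, the setup described in this post (identical operations on different portions of one array) might look roughly like the sketch below. The element type, the squaring operation and the names Chunk, square_chunk() and parallel_square() are assumptions for illustration, not the poster's actual code.

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>
#include <vector>

// One contiguous slice of the array, handed to exactly one thread.
struct Chunk {
    double *data;
    std::size_t begin;
    std::size_t end;  // half-open range [begin, end)
};

// Placeholder operation: square every element in the slice.
void *square_chunk(void *arg) {
    Chunk *c = static_cast<Chunk *>(arg);
    for (std::size_t i = c->begin; i < c->end; ++i) c->data[i] *= c->data[i];
    return nullptr;
}

void parallel_square(std::vector<double> &values, std::size_t num_threads) {
    assert(num_threads > 0);
    const std::size_t n = values.size();
    std::vector<Chunk> chunks(num_threads);
    std::vector<pthread_t> threads(num_threads);
    std::vector<bool> started(num_threads, false);
    for (std::size_t t = 0; t < num_threads; ++t) {
        // Slices do not overlap, so the threads need no locking between them.
        chunks[t].data = values.data();
        chunks[t].begin = n * t / num_threads;
        chunks[t].end = n * (t + 1) / num_threads;
        started[t] = pthread_create(&threads[t], nullptr, square_chunk, &chunks[t]) == 0;
        if (!started[t]) square_chunk(&chunks[t]);  // fall back to doing it inline
    }
    for (std::size_t t = 0; t < num_threads; ++t) {
        if (started[t]) pthread_join(threads[t], nullptr);
    }
}
```

Timing parallel_square() with different values of num_threads on a large vector is one way to reproduce the kind of benchmark described here.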
No more than one for every core in the system.
Note that some CPUs have something called "hyper-threading", which can make a single core run two threads concurrently. The physical core is then said to have two logical cores. It's up to you whether to count logical cores as physical for the purposes of creating threads.
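If you want to pick the thread count at run time rather than hard-coding it, one portable option is std::thread::hardware_concurrency() from C++11. This is a sketch that swaps in the C++ standard library for the query only (the threads themselves can still be pthreads); suggested_thread_count() and its fallback parameter are names invented here.

```cpp
#include <cassert>
#include <thread>

// Returns the number of hardware threads the implementation reports. This is
// only a hint, usually counts logical (hyper-threaded) cores, and may be 0
// when the value cannot be determined, hence the fallback.
unsigned suggested_thread_count(unsigned fallback = 1) {
    assert(fallback > 0);
    unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

Whether to use that number directly or halve it on a hyper-threaded machine is the judgment call described above; benchmarking both is probably the only reliable answer.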
helios wrote:
No more than one for every core in the system.

To elaborate: think 8 hands and 1 brain. You may be able to work on 8 things at one time (asynchronously), but it might be just as fast to work on them one at a time instead because your attention is divided. As for multi(quad)-cores, think 4 people each with 2 hands (or something like that...)
Also, any suggestions for a good debug tool for threaded applications ? I am on ubuntu and mostly use ddd for my debugging.
Another noob question.

For thread synchronization I will use a combination of pthread_cond_wait() and pthread_cond_signal() protected by mutex variables. Now, as I understand it, there is no guarantee that the waiting thread will reach pthread_cond_wait() before the other thread calls pthread_cond_signal(). So, if pthread_cond_signal() ends up being called before pthread_cond_wait(), that signal will be missed. Is this correct?

If yes, then clearly it is crucial for the thread calling pthread_cond_signal() to get feedback that the thread calling pthread_cond_wait() has received the signal before it proceeds. Is there any standard way to do this?
Valgrind has a tool to detect race conditions. I've never used it, so I can't say how good it is.

a combination of pthread_cond_wait() and pthread_cond_signal() protected by mutex variables
Why the mutex? Condition variables are already thread safe.

IIRC, if a signal is sent before a thread is waiting for it, the next thread that waits for it will get through immediately.
your first comment is correct: that signal will be missed

however, your second concern is unwarranted - the thread that is about to wait needs to lock the mutex and check the condition first, before waiting so you should be fine

see if this explanation makes sense to you:

http://stackoverflow.com/questions/5536759/condition-variable-why-calling-pthread-cond-signal-before-calling-pthread-con
Yes, I think I see what you mean. So even if the thread calling pthread_cond_signal() finishes before the other thread starts waiting, the second thread, having checked that the condition is already satisfied, can simply proceed without waiting. Also, I meant the following format when I said "protected by mutexes":

pthread_mutex_lock(&mutex); 
while (!condition)
    pthread_cond_wait(&cond, &mutex); 
pthread_mutex_unlock(&mutex);


pthread_mutex_lock(&mutex); 
changeCondition(); 
pthread_mutex_unlock(&mutex); 
pthread_cond_signal(&cond);
exactly: given blocks A and B above, only two things can happen:

1. A goes first and, upon hitting pthread_cond_wait(), unlocks the mutex and blocks waiting. B then locks the mutex, changes the condition, unlocks, and signals. A gets unblocked, reacquires the mutex, and re-tests the condition with the while (!condition).

2. B goes first, locks the mutex, and calls changeCondition() and pthread_cond_signal() before A comes up, thus losing the signal - but the mutex is unlocked. A then goes, but checks the condition first, so doesn't even need to wait in this case!

The most important idea is that the condition variable actually signals a possible change in the condition state. It's up to the other thread to do the checking.
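Here is a minimal sketch of blocks A and B as complete functions, so that both orderings can be tried. The global names (g_mutex, g_cond, g_condition) and the helpers waiter() and run_once() are made up for illustration; the locking pattern itself is the one posted above.

```cpp
#include <pthread.h>
#include <cassert>

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t g_cond = PTHREAD_COND_INITIALIZER;
static bool g_condition = false;  // the actual state; the condvar is only a hint

// Block A: wait until the condition holds.
void wait_for_condition() {
    pthread_mutex_lock(&g_mutex);
    while (!g_condition) {
        // Atomically unlocks g_mutex and sleeps; relocks before returning.
        pthread_cond_wait(&g_cond, &g_mutex);
    }
    assert(g_condition);  // guaranteed here because we hold the mutex
    pthread_mutex_unlock(&g_mutex);
}

// Block B: change the state, then signal that it may have changed.
void change_condition() {
    pthread_mutex_lock(&g_mutex);
    g_condition = true;
    pthread_mutex_unlock(&g_mutex);
    pthread_cond_signal(&g_cond);
}

void *waiter(void *) {
    wait_for_condition();
    return nullptr;
}

// Runs A in a new thread and B in the caller. Whichever order the scheduler
// picks, both finish; returns false only if the thread could not be created.
bool run_once() {
    pthread_t thread;
    if (pthread_create(&thread, nullptr, waiter, nullptr) != 0) return false;
    change_condition();
    pthread_join(thread, nullptr);
    return true;
}
```

Calling change_condition() and only afterwards wait_for_condition() from the same thread is case 2 in its purest form: the signal is long gone, yet the wait returns immediately because the flag is checked first.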
I think I am making some progress. Here is another question. I have made a small Condition class which looks like this.

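#include <pthread.h>

class Condition{
public:  // the first approach below accesses these members directly
     bool cond;               // the state being tested
     pthread_mutex_t m_cond;  // protects cond
     pthread_cond_t c_cond;   // signalled when cond may have changed

     // other useful functions.
};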

Condition my_cond;


Now, I want to check if the condition is true, I could take two different approaches.

pthread_mutex_lock(&my_cond.m_cond);
if(my_cond.cond){
     pthread_mutex_unlock(&my_cond.m_cond);
     // do stuff
}
else{
     pthread_mutex_unlock(&my_cond.m_cond);
     // do stuff
}


Or an alternative approach could be a function of the form

bool Condition::isTrue(){
     pthread_mutex_lock(&m_cond);
     if(cond){
          pthread_mutex_unlock(&m_cond);
          return true;      
     }
     else{
          pthread_mutex_unlock(&m_cond);
          return false;
     }
}

if(my_cond.isTrue()){
     // do stuff
}
else{
     // do stuff
}


I am certain the first approach will work. But the second approach seems a lot cleaner. Are there any obvious problems with the second approach?
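For what it's worth, the second approach can be condensed into a single unlock path by copying the flag while the lock is held. The sketch below assumes a few things not in the post above: the constructor, destructor and set() are hypothetical additions, and the members are made private.

```cpp
#include <pthread.h>
#include <cassert>

class Condition {
public:
    Condition() : cond(false) {
        int rc = pthread_mutex_init(&m_cond, nullptr);
        assert(rc == 0);
        rc = pthread_cond_init(&c_cond, nullptr);
        assert(rc == 0);
        (void)rc;  // silence unused-variable warnings when NDEBUG is defined
    }
    ~Condition() {
        pthread_cond_destroy(&c_cond);
        pthread_mutex_destroy(&m_cond);
    }
    Condition(const Condition &) = delete;  // pthread objects must not be copied
    Condition &operator=(const Condition &) = delete;

    // Returns a snapshot of the flag. The mutex is released before returning,
    // so the flag may already have changed by the time the caller acts on it.
    bool isTrue() {
        pthread_mutex_lock(&m_cond);
        bool snapshot = cond;
        pthread_mutex_unlock(&m_cond);
        return snapshot;
    }

    // Hypothetical setter: change the flag, then wake any waiters.
    void set(bool value) {
        pthread_mutex_lock(&m_cond);
        cond = value;
        pthread_mutex_unlock(&m_cond);
        pthread_cond_broadcast(&c_cond);
    }

private:
    bool cond;
    pthread_mutex_t m_cond;
    pthread_cond_t c_cond;
};
```

Note that this behaves the same as the first approach: in both, the mutex is already unlocked when "do stuff" runs, so either version only tells you what the flag was at the moment it was read. If "do stuff" must run while the condition is guaranteed to hold, the lock has to stay held across it.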
getting condition variables right can be very tricky - I recommend reading this:

https://computing.llnl.gov/tutorials/pthreads/#ConVarSignal

take what they have, and then refactor carefully until it reaches your code

it's very easy to mess them up
Here is another question. Do function calls also need to be protected by mutexes, the way shared variables do?

Say I have created N threads with the thread function void *func(void *). Now suppose func() calls another function, say commonfunc(). Should the call to commonfunc() in func() also be protected by a mutex?

Edit: I just did some tests and I don't think mutexes are needed for protecting function calls.
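A small sketch of the distinction, under the assumption that commonfunc() touches only its arguments and local variables: each thread gets its own copy of those on its own stack, so the call itself needs no lock. What does need a lock is any data shared between threads. The names total, add_to_total() and run_threads() are made up for illustration.

```cpp
#include <pthread.h>
#include <cassert>
#include <vector>

// Uses only its argument and a local variable, so concurrent calls from many
// threads cannot interfere with each other. No mutex needed.
long commonfunc(long x) {
    long tmp = x * 2;
    return tmp + 1;
}

static pthread_mutex_t total_mutex = PTHREAD_MUTEX_INITIALIZER;
static long total = 0;  // shared by all threads, so access must be locked

void add_to_total(long value) {
    pthread_mutex_lock(&total_mutex);
    total += value;
    pthread_mutex_unlock(&total_mutex);
}

void *func(void *arg) {
    long iterations = *static_cast<long *>(arg);
    for (long i = 0; i < iterations; ++i) {
        add_to_total(commonfunc(0));  // commonfunc(0) == 1
    }
    return nullptr;
}

// Starts n threads that each add 1 to the shared total `iterations` times.
long run_threads(int n, long iterations) {
    assert(n > 0 && iterations >= 0);
    total = 0;
    std::vector<pthread_t> threads(n);
    int created = 0;
    for (int i = 0; i < n; ++i) {
        if (pthread_create(&threads[i], nullptr, func, &iterations) != 0) break;
        ++created;
    }
    for (int i = 0; i < created; ++i) pthread_join(threads[i], nullptr);
    return total;  // safe to read unlocked: all workers have been joined
}
```

If commonfunc() itself read or wrote a global or static variable, it would be that data, not the function call, that needs the mutex.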