parallel OMP with many "new" operators

Jul 14, 2012 at 1:31pm
Hello All,

I have code that is multithreaded using OpenMP.
However, inside the code I allocate a lot of memory using "new".

I know that memory allocation using new must be done in serial. Therefore, I am not getting good performance out of the multithreaded program.

I thought about allocating one big block of memory only once, and then advancing the pointer every time I need to allocate a small chunk.

For example, if I need to allocate chunks of 7 doubles:
double* main_memory = new double[10000];
double* x1 = main_memory;
main_memory += 7;
double* x2 = main_memory;
main_memory += 7;


However, I do not know how to handle the memory deallocation. Besides, this method is not thread-safe, because main_memory is shared by all threads.

Does anyone have an idea how to solve this? Maybe there is a Boost library that does this which I am not aware of.

Thank you in advance.
Jul 14, 2012 at 1:54pm
However, I do not know how to handle the memory deallocation.

Answer: delete[] main_memory;
Jul 14, 2012 at 2:05pm
Do note that you've changed the pointer main_memory so that it points somewhere else. When you get to deleting the memory, you need to provide a pointer with the same value as the one you were given by new.

For example,

int main()
{
    double* main_memory = new double[10000];
    double* const deleteMe = main_memory; // Now safe to change main_memory because deleteMe keeps the original value
    double* x1 = main_memory;
    main_memory += 7;
    double* x2 = main_memory;
    main_memory += 7;
    delete[] deleteMe;
}
Jul 14, 2012 at 2:19pm
Thank you for your help.
What I meant is that I need to deallocate some of the small chunks, not the complete main_memory.

Also, it turns out that this solution is not thread-safe either.

Thanks
Jul 14, 2012 at 11:45pm
Besides, this method is not thread-safe, because main_memory is shared by all threads.


Wait until all threads are done with it, then deallocate it.
Jul 15, 2012 at 1:13am
Wait until all threads are done with it, then deallocate it.


No, I mean the allocation itself is not thread-safe, because main_memory is shared by all the threads, so a race condition can easily happen.
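
To make the race concrete: two threads can read the same value of the shared pointer before either one advances it, so both receive the same chunk. A minimal sketch of one way to close that race is to advance an offset with a single atomic operation. The names SharedBumpArena and all_allocated are hypothetical, and the chunk size of 7 doubles is taken from the first post. This is only an illustration, and it still cannot free individual chunks.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a bump allocator that several threads may share.
class SharedBumpArena {
public:
    explicit SharedBumpArena(std::size_t capacity)
        : storage_(capacity), next_(0) {}

    // Returns `count` doubles, or nullptr once the arena is exhausted.
    double* allocate(std::size_t count) {
        // fetch_add reads the old offset and advances it in one atomic step,
        // so two threads can never be handed the same offset.
        std::size_t offset = next_.fetch_add(count, std::memory_order_relaxed);
        if (offset > storage_.size() || count > storage_.size() - offset)
            return nullptr;  // note: next_ keeps growing after exhaustion
        return storage_.data() + offset;
    }

private:
    std::vector<double> storage_;
    std::atomic<std::size_t> next_;
};

// Requests `n` chunks of 7 doubles from all threads; true if none failed.
bool all_allocated(SharedBumpArena& arena, int n) {
    assert(n >= 0);
    int failures = 0;
    #pragma omp parallel for reduction(+ : failures)
    for (int i = 0; i < n; ++i) {
        double* chunk = arena.allocate(7);
        if (chunk == nullptr)
            ++failures;
        else
            chunk[0] = static_cast<double>(i);  // each chunk has one owner
    }
    return failures == 0;
}
```

With GCC the pragmas take effect when compiling with -fopenmp. Without that flag the loop simply runs serially.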
Jul 15, 2012 at 1:22am
I know that memory allocation using new must be done in serial. Therefore, I am not getting good performance out of the multithreaded program.
I've never heard of this. Are you certain memory allocation is what's consuming most of your time? Have you run the program through a profiler to find its hot spots?
Last edited on Jul 15, 2012 at 1:22am
Jul 15, 2012 at 10:26pm
I've never heard of this. Are you certain memory allocation is what's consuming most of your time?


The heap uses a set of locks to control memory allocation, so threads have to wait for a lock to be released before they can allocate memory. The following URL discusses this issue and gives a solution for Intel compilers on Windows.

http://software.intel.com/en-us/articles/avoiding-heap-contention-among-threads/

But I am using Linux with the GCC compilers, so I was searching for a similar solution on Linux.


Thank you
Jul 15, 2012 at 11:24pm
I'm aware of that, but the way you said it seemed to imply that management had to be done in serial at the application level. I.e. that no more than one thread could safely call malloc() at the same time.

What you're suggesting in the OP isn't necessarily any faster than what malloc() does, since you'll still need to synchronize to move your pointer. Rather than this, would it be possible to reorder your allocations so you only need to allocate memory at the very beginning of the thread?
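
As a rough sketch of the "allocate at the very beginning of the thread" idea, each thread could own its arena and carve chunks from it with no locking at all. BumpArena and sum_of_chunks are hypothetical names, the sizes 10000 and 7 come from the first post, and the sketch assumes that all chunks from one loop iteration can be discarded together.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a bump allocator owned by exactly one thread.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : storage_(capacity), used_(0) {}

    // Returns `count` doubles, or nullptr if the arena is full.
    double* allocate(std::size_t count) {
        if (count > storage_.size() - used_) return nullptr;
        double* p = storage_.data() + used_;
        used_ += count;
        return p;
    }

    // Releases every chunk at once; individual chunks cannot be freed.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::vector<double> storage_;
    std::size_t used_;
};

// Fills and sums two 7-element chunks per iteration, one arena per thread.
double sum_of_chunks(int iterations) {
    double total = 0.0;
    #pragma omp parallel reduction(+ : total)
    {
        BumpArena arena(10000);  // the only heap allocation in this thread
        #pragma omp for
        for (int i = 0; i < iterations; ++i) {
            arena.reset();  // chunks from the previous iteration are dead
            double* x1 = arena.allocate(7);
            double* x2 = arena.allocate(7);
            assert(x1 != nullptr && x2 != nullptr);
            for (int k = 0; k < 7; ++k) {
                x1[k] = 1.0;
                x2[k] = 2.0;
            }
            for (int k = 0; k < 7; ++k) total += x1[k] + x2[k];
        }
    }
    return total;
}
```

Because the arena is declared inside the parallel region, it is private to its thread, so moving the offset needs no synchronization. With GCC this would be built with -fopenmp.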

I ask again, though: are you certain this is what's consuming most of your time?
Jul 16, 2012 at 12:25pm
This is the only implementation that I am aware of which is available for free:
http://code.google.com/p/gperftools/?redir=1

There are also commercial libraries that do the same thing.
Jul 17, 2012 at 6:06pm

Thank you for your help guys,

I ask again, though: are you certain this is what's consuming most of your time?

Yes, I ran a profiler and this seems to be the issue.

This is the only implementation that I am aware of which is available for free:
http://code.google.com/p/gperftools/?redir=1

Do you mean the tcmalloc library there?

thank you
Mina
Jul 19, 2012 at 2:11pm
Any help please?

Thanks
Jul 19, 2012 at 2:18pm

What I meant is that I need to deallocate some of the small chunks, not the complete main_memory.


It is not possible. You can only deallocate it once and fully.



Any help please?


Why not allocate a private memory region for each thread? I mean - don't share memory between threads.
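
A minimal sketch of that idea, assuming all chunks have the same size: the big block is still deleted only once, but a per-thread free list lets individual chunks be handed back and reused. ChunkPool and its member names are hypothetical, not from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-thread pool of fixed-size chunks. Not thread-safe by
// design: each thread is meant to construct and use its own pool.
class ChunkPool {
public:
    ChunkPool(std::size_t chunk_doubles, std::size_t chunk_count)
        : chunk_doubles_(chunk_doubles),
          storage_(chunk_doubles * chunk_count) {
        free_list_.reserve(chunk_count);
        // Push in reverse so chunks are first handed out in address order.
        for (std::size_t i = chunk_count; i > 0; --i)
            free_list_.push_back(storage_.data() + (i - 1) * chunk_doubles_);
    }

    // Takes one chunk from the pool, or returns nullptr if none is free.
    double* acquire() {
        if (free_list_.empty()) return nullptr;
        double* chunk = free_list_.back();
        free_list_.pop_back();
        return chunk;
    }

    // Hands a chunk back for reuse. The caller must pass a pointer that
    // came from acquire() on this same pool; this is not checked.
    void release(double* chunk) {
        assert(chunk != nullptr);
        free_list_.push_back(chunk);
    }

    std::size_t available() const { return free_list_.size(); }

private:
    std::size_t chunk_doubles_;
    std::vector<double> storage_;     // the single big allocation
    std::vector<double*> free_list_;  // chunks currently not in use
};
```

Usage would look like ChunkPool pool(7, 1000); inside the parallel region, followed by double* x = pool.acquire(); and later pool.release(x); in the same thread. The storage itself is freed once, when the pool goes out of scope.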
Last edited on Jul 19, 2012 at 2:19pm
Jul 19, 2012 at 4:33pm
Do you mean the tcmalloc library there?


Yes.