the correct use pointers and references

I have the following function, which is tested and works fine:

 
 check_gjk_intersection(struct bd *bd1, struct bd *bd2)


The structure contains of bd.coords only for the moment.

Now I need to correctly point the 2D vector to it with this function:

1
2
3
4
5
6
7
8
9
10
11
void
Polymer::get_mcoord(int32_t index, struct bd *bd1) const
{
    for (int32_t i = 0; i < 8; i++)
    {
        for (int32_t j = 0; j < 3; j++)
        {
            bd1->coord[i][j] = _monomers[index].get_coords()[i][j];
        }
    }
}


and then I use it in the following function which kinda works but incorrectly (intersection function always returns true which is impossible):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
bool
Polymer::polymer_intersecting_parallel(const Polymer& other)
{   
    const auto nom = number_of_monomers();
    const auto onom = other.number_of_monomers();
    bool checker = false;
    auto bd1 = allocate_bd(); auto bd2 = allocate_bd(); //allocating memory

            for (int32_t i = 0; i < nom; i++)
            {
                if (!checker)
                { get_mcoord(i, &bd1);
                        for (int32_t j = 0; j < onom; j++)
                        {
                            get_mcoord(j, &bd2);
                            if (check_gjk_intersection(&bd1, &bd2))
                            {   checker = true;
                                break;
                            };
                        }
                    }
                }
    free_bd(&bd1); free_bd(&bd2);
    return checker;
}


I am pretty sure I messed up with pointers and references here?

simplify, debug.
first, checker serves no purpose.
for()
{
stuff()
if something, return true.
}
got here, above never happened, return false.
that becomes

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
{   
    const auto nom = number_of_monomers();
    const auto onom = other.number_of_monomers();  
    auto bd1 = allocate_bd(); auto bd2 = allocate_bd(); //allocating memory

            for (int32_t i = 0; i < nom; i++)
            {
                        get_mcoord(i, &bd1);
                        for (int32_t j = 0; j < onom; j++)
                        {
                            get_mcoord(j, &bd2);
                            if (check_gjk_intersection(&bd1, &bd2))
                              return true;
                        }                    
             }
    free_bd(&bd1); free_bd(&bd2); //it would be nice if this were automatic (smart pointer) ... 
//if you do not use a smart pointer to fix this you need to free it on the return true also
    return false;
}

then debug it.
- set up the smallest problem you can where the things do NOT intersect.
- run it, see if you get false.
- if you get true, step through it to see where that happens, what exactly it compared or how it was incorrect.

I don't see anything in your pointers and the only reference I saw was other -- it seems correct on the surface.
you don't have to simplify it of course, its just that much less clutter when debugging. Up to you on that.
Last edited on
It is hard to see due to the [over]use of auto...

But I guess that on line 7 bd1 and bd2 are already pointer? Then on line 12/15/16 you have pointer to pointer which would be wrong, but the compiler should have complained. So it's hard to tell.
Thank you both very much. I did not simplify it because the function was intended for pragma use and I thought it is related with memory management. However as soon as I started to simplify it, I suddenly saw the obvious mistake as always (other.get_mcoord in the second loop). Sorry for that. But now I really need the actual help - I heard that memory management when using pragma operatives are more complex, one must allocate memory for each thread. Could someone maybe show the example of how exactly it should be properly done? Actually it works now as it is, but probably when allocating the memory for each thread it will work even more efficient?Or I am mistaken?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
bool
Polymer::polymer_intersecting_parallel(const Polymer& other) const
{
    const auto nom = number_of_monomers();
    const auto onom = other.number_of_monomers();
    
    auto bd1 = allocate_bd();
    auto bd2 = allocate_bd();
    
    bool checker = false;

#pragma omp parallel shared (nom, onom, checker)
    {
#pragma omp single nowait
        {
            for (int32_t i = 0; i < nom; i++)
            {
                if (!checker)
                {
#pragma omp task firstprivate(i)
                    {
                        get_mcoord(i, &bd1);

                        for (int32_t j = 0; j < onom; j++)
                        {
                            if ((i==number_of_monomers() -1) && (j==0)) continue;

                            other.get_mcoord(j, &bd2);

                            if (check_gjk_intersection(&bd1, &bd2))
                            {
#pragma omp atomic write
                                checker = true;
                                break;
                            };
                        }
                    }
                }
            }
        }
    }
    free_bd(&bd1);
    free_bd(&bd2);
    
    return checker;
}
I have not done threading via pragmas -- not sure at all what these things do.
but ... memory should be allocated and freed as little as is possible. If you can re-use it (example: you know the max size and it is not so big that allocating it is bad, or if its the same size every time, etc) then do so.
you can lose a TON of time if you call a function in a tight loop and the function allocates memory, does a little work, then frees the memory, it can spend half or more of its time in the memory management!
Another way to handle this is to pass the memory around outside the inner logic. 'here, use this buffer' style. Then you can manage the memory at a higher level and make better decisions than you can do locally where you can't see what anything else is doing.
Maybe those will help you think through it, but I have nothing on how to use these tools optimally.
do you have a real use case for the code that you can see if it seems slow? Or are you trying to wring every last nanosecond out of it? If it isn't too slow for some requirement, and it works, move on.
Thanks. It is a real use case : I am generating tons of the molecules and I need to do them really fast, and the whole code will be used on the supercomputer so parallelism is a must.

So when in parallel, we need to allocate the memory for each thread separately:
1
2
3
4
5
6
7
8
9
10
11
12
    struct bd* ptr_bd1;
    struct bd* ptr_bd2;
    const auto nthreads =  omp_get_max_threads();

    ptr_bd1 = (bd*) malloc(nthreads * sizeof (bd));
    ptr_bd2 = (bd*) malloc(nthreads * sizeof (bd));

    for (int32_t i = 0; i < nthreads; i++)
    {
        ptr_bd1[i] = allocate_bd();
        ptr_bd2[i] = allocate_bd();
    }


and then in the inner pragma cycle we point to each thread's ptr_bd.
Last edited on
OK.
you could allocate a giant block of 2*nthreads and pull pointers out of that instead but I dunno that it helps in any way. If its too slow, profile what takes time, focus on that...
The very complex function I am using is written using the pointers.

That's the only reason why I am using them as well as I have no options.

And I need parallelism. Therefore we have such....situation.
Topic archived. No new replies allowed.