Hard spheres - check for overlap

Pages: 12
I always move a single particle at a time, there are no velocities.


That's magic!
multithreading the checks would be nice, its a flat speed up and has no thread complexity issues (you just break the data into say 4 chunks, eg 4 equal lists of sphere locations) and do the same code over the chunks, getting a flat 4x speedup (assuming typical machines have 4 cores, you can mix this up based on your hardware). Since you are not writing anything (the thread functions would presumably just be 4 boolean results on collisions or not) or changing anything during this time, all the race conditions just go away, you don't need mutex or any of that stuff.

it literally becomes just 4 lines that look very much like this (actual thread library used changes this very, very slightly, eg it may be create thread instead of begin thread).

threadvariable/init code line or two (boilerplate from examples on web);
beginthread(function, data1, priority/flags);
beginthread(function, data2, priority/flags);
beginthread(function, data3, priority/flags);
beginthread(function, data4, priority/flags);

if you want to take a crack at that idea.
Now if you tried to place spheres in threads, that would give you a headache.. you could try to place 2 at the same time that would hit each other but pass the checks because neither is there yet... that sort of thing may be best avoided if you don't have time to get it right. You can still place them 1 by one but check collisions multiples at a time.

as for how you do things in C++ ... c++ has a <list> built into the language,but personally I avoid those most of the time in favor of <vectors> unless you absolutely need the ability to insert things in the middle of an ordered group frequently. Whatever you are doing, odds are c++ has a built in container that can do it for you -- containers are done to death in the language/ tools. I thought you were low on time to modify it now, but if you want to do some sort of list or major modifications... let us know if you need help

Last edited on
PhysicsIsFun wrote:
I tried to insert your tipps into my routine, but the crude test you suggested leads to slightly more computing time:

The whole point of the quick test was to avoid lots of multiplies - you've just put a whole lot more in!
if(rx*L>2*R || ry*L>2*R || rz*L>2*R)


What is all this about ...
4*(R/L)*(R/L)
L is being used as a lengthscale throughout. Just work in non-dimensional variables, where L is 1.
Or put this in a single variable (diameter_squared?) - I bet the values of R and L don't change after they are first set.
Last edited on

Branch prediction is great but it can get overwhelmed with so many. it only supports a few conditions per tight loop.

Consider avoiding most if your conditionals as per this example, see if the approach gives better speed on your compiler. continue is a jump, if is a jump, reduce these where you can (eg rewrite if blah continue to if!blah{stuff} instead?) This code is counter intuitive; it does much more work but it probably still executes faster. Ive had solid results with it in several similar cases.

multiply in the FPU is relatively cheap but as lastchance is saying avoid those too, can you refactor the math to compare differently? Its probably not helpful anymore; but once long ago killing the sign bit yourself with & was >> faster than abs. You could try it ... not sure what modern compiler does with it.

//this function checks wether a chosen sphere u overlaps with any other sphere
bool overlap(vector<particle> &pos, size_t u, const double R, double L){

double rx, ry, rz;
double sx[2],sy[2],sz[2];
for(size_t j=0; j<pos.size(); j++){
if(u==j) continue;
//calculate shortest distance between spheres (periodic boundaries!)

sx[0]=abs(pos[u].x-pos[j].x);
sy[0]=abs(pos[u].y-pos[j].y);
sz[0]=abs(pos[u].z-pos[j].z);
sx[1]=1-rx; //periodic boundaries
sy[1]=1-ry;
sz[1]=1-rz;
rx = sx[(sx[0]>0.5)];
ry = sy[(sy[0]>0.5)];
rz = sz[(sz[0]>0.5)];

if(rx*L>2*R || ry*L>2*R || rz*L>2*R) continue;
if(rx*rx+ry*ry+rz*rz < 4*(R/L)*(R/L)){
return true;
}
}
return false;
}


rx*L>2*R
is the same as rx > (2*R)/L (for positive values of L)
and (2*R)/L is a constant that you can compute outside the whole loop, I think
Last edited on
jonnin wrote:
multithreading the checks would be nice, its a flat speed up and has no thread complexity issues (you just break the data into say 4 chunks, eg 4 equal lists of sphere locations) and do the same code over the chunks, getting a flat 4x speedup (assuming typical machines have 4 cores, you can mix this up based on your hardware). Since you are not writing anything (the thread functions would presumably just be 4 boolean results on collisions or not) or changing anything during this time, all the race conditions just go away, you don't need mutex or any of that stuff.

it literally becomes just 4 lines that look very much like this (actual thread library used changes this very, very slightly, eg it may be create thread instead of begin thread).

threadvariable/init code line or two (boilerplate from examples on web);
beginthread(function, data1, priority/flags);
beginthread(function, data2, priority/flags);
beginthread(function, data3, priority/flags);
beginthread(function, data4, priority/flags);

if you want to take a crack at that idea.
Now if you tried to place spheres in threads, that would give you a headache.. you could try to place 2 at the same time that would hit each other but pass the checks because neither is there yet... that sort of thing may be best avoided if you don't have time to get it right. You can still place them 1 by one but check collisions multiples at a time.

as for how you do things in C++ ... c++ has a <list> built into the language,but personally I avoid those most of the time in favor of <vectors> unless you absolutely need the ability to insert things in the middle of an ordered group frequently. Whatever you are doing, odds are c++ has a built in container that can do it for you -- containers are done to death in the language/ tools. I thought you were low on time to modify it now, but if you want to do some sort of list or major modifications... let us know if you need help


I did not understand much of the first part (sorry :D), but wouldn't a multithreading of my code to make use of the 4 kernels of my cpu cause problems on a machine that does not have 4 kernels?

To the last part: Yeah, I don't want to 'waste' time, but if it's necessary to get acceptable simulation times....
I would like to implement the neighbor list, because my prof. suggested it and it seems to be one of the standard methods for this problem (illustrated in my book on molecular dynamics simulation and in various papers).

I don't know how to do it best in c++.
One classical approach seems to be a symmetric NxN array that stores a 1 at index (i,j), if particle i is close to j. My particles have radius R, that means the interaction distance is 2R. Now I am defining a 'shell' of radius 2R+d around all of my particles. Then I check which pairs (i,j) of particles are inside of each others shells and create my array according to this.

So it goes like this:
-create array
-create displacement variable for each particle, that stores the total displacement from the last array update

-move a particle, say i
-add the displacement to displacement variable
-if (displacement > 0.5 d) {recalculate the array}
-check overlaps, using only the particles whose index are 1 in the i-th row of the array

lastchance wrote:
The whole point of the quick test was to avoid lots of multiplies - you've just put a whole lot more in!

Yeah, I did not really get your example, or how to incorporate it otherwise in my code.


What is all this about ...
4*(R/L)*(R/L)
L is being used as a lengthscale throughout. Just work in non-dimensional variables, where L is 1.
Or put this in a single variable (diameter_squared?) - I bet the values of R and L don't change after they are first set.

I am in scaled coordinates. My box has a length L, I am using the coordinates s=r/L.
If r goes from [-L/2, L/2]^3, s goes from [-1/2, 1/2]^3.
The overlap condition |r1-r2|<2R transforms to |s1-s2|<2R/L. Feel free to correct me^^
L is fluctuating in each simulation step. I am doing a Monte Carlo simulation at constant pressure, meaning the volume fluctuates.


jonnin wrote:
sx[0]=abs(pos[u].x-pos[j].x);
sy[0]=abs(pos[u].y-pos[j].y);
sz[0]=abs(pos[u].z-pos[j].z);
sx[1]=1-rx; //periodic boundaries
sy[1]=1-ry;
sz[1]=1-rz;
rx = sx[(sx[0]>0.5)];
ry = sy[(sy[0]>0.5)];
rz = sz[(sz[0]>0.5)];

Ahh, I guess the comparison in the brackets yields 1 or 0, right? And this is cheaper than using if-statements?
And it should be
1
2
3
sx[1]=1-sx[0]; //periodic boundaries
sy[1]=1-sy[0];
sz[1]=1-sz[0];

right?
Last edited on
Multithreading on a single core machine is slightly slower as it swaps the threads out of the single core instead of just going thru it. If you have 1 core, don't do this (but it works just fine). With more code you can adapt to the cpu cores available, but for single machine testing like this, just build for your box.

---
That sounds ok. As I said earlier, the idea is all the same... get rid of checking you don't need to do. Loads of ways to do that here.

------
Yes, Boolean expressions become false (zero) or true (one). You can exploit this. It isn't always better, but its something you can check/test easily.

yes I missed that substitution.
Last edited on
I have 4 cores, but I was worried about what would happen, if someone else was trying to run the program.
I will keep this idea of multi threading in my head, but first, I want to try out more "conventional" ways.
So the array idea seems ok? Updating such an array seems expensive, but if it only happens like every 10 to 20 steps, it might be fine.

Even though it takes far more lines of code to implement such a neighbor list than the naive brute computation of all the distances of all the pairs at any time. But I guess more lines of code does not necessarily mean more computational effort.
Last edited on
Try to find an efficient way to express your idea. If you are doing a bunch of computations to assign the array, you are working too hard. You are looking to maintain a cross reference between 2 spheres in a 2-d table. That is great... but now you need to fill in the table. Right back to where we were: comparing every possible combination is not cheap. Filling in the ones that are unpossible to 0 and then checking and assigning the few that could be collisions is cheap. Its the same problem, restated to a new way of storing that info...
But the array is supposed to be that "crude" way of storing which particles are close.
In order to update the array, I have to calculate the distances.
But I would rely on the fact that I don't have to update the array at each iteration.

-set up array with 1's at (i,j), if i is close to j.
for (j=0, j<M, j++){
-choose random particle, say i
-move particle i
-increase displacement
-if (displacement since update < 0.5d)
{calculate exact distances between i and all particles whose number is 1 in the i-th array}
else {upate array row i + calculate exact distances then}

and of course, decide what to do with i in case of an overlap with other particles.
Last edited on
right. Just saying if you find yourself checking everything against everything again, then its probably suboptimal.
ah ok, yeah, it might help that I just have to update one row of the array.

Thanks up until here, I will post my code or give results asap.
Topic archived. No new replies allowed.
Pages: 12