1. Did you even compile benchmarks/test_speed? The code in the repository doesn't compile. srandom() and random() are not standard functions.
2. The first time I ran it your code didn't even get to run. It crashed because numTrees == 1. in several places you do things like allData[numTrees - 2] without checking if that's a valid operation.
3. Your benchmark code needs some serious reorganization. Please put every test in its own function, don't just shove all the code into main(). It's a pain to follow the data flow like this.
4. Some benchmarks are invalid because you're not really doing anything with the results. The compiler can optimize away code whose results are discarded.
5. rebalance_trees() is incorrect. When you erase() from an std::set (for example, here:
https://github.com/AvniOsmani/ParallelSet/blob/master/parallelSetClass.h#L93 ), the iterator you erased becomes invalidated. You can't then increment that iterator like you do in the for header. I rewrote those sections into this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
|
{
auto &set = allData[0];
auto b = set.begin(), e = set.end();
for (auto i = b; i != e;){
if (*i <= rightEndPoints[0]){
++i;
continue;
}
insert_element(*i);
set.erase(i++);
}
}
for(int i = 1; i < numThreads - 1; ++i){
auto &set = allData[i];
auto b = set.begin(), e = set.end();
for (auto j = b; j != e;){
if (*j > rightEndPoints[i - 1] || *j <= rightEndPoints[i]){
++j;
continue;
}
insert_element(*j);
set.erase(j++);
}
}
{
assert(numThreads >= 2);
auto &set = allData[numThreads - 1];
auto b = set.begin(), e = set.end();
for (auto i = b; i != e;){
if (*i > rightEndPoints[numThreads - 2]){
++i;
continue;
}
insert_element(*i);
set.erase(i++);
}
}
|
6. Again, rebalance_trees() is incorrect. I added this line to testData initialization:
1 2 3 4 5
|
for(int i=0; i<numOfTestSubjects; ++i){
auto x = x1 + ( x2 - x1) * (rand() % max_rand) / max_rand;
testData.push_back(x);
testData.push_back(1); //this
}
|
and find_parallel() started returning incorrect results. I believe it's because rightEndPoints is incorrectly estimated between here
https://github.com/AvniOsmani/ParallelSet/blob/master/parallelSetClass.h#L69
and here
https://github.com/AvniOsmani/ParallelSet/blob/master/parallelSetClass.h#L86
7. This is less important, but your number generation code is incorrect. It's generating numbers between -1000 and 0, not between -1000 and 1000.
That's all I have for now, I'll continue looking at this tomorrow. I still don't quite understand what's happening with the timing; when I commented out the call to rebalance_trees() in insert_parallel(), the measured time for insert_parallel() went
up significantly, which obviously makes no sense. I think the compiler is reordering instructions.