Oct 26, 2017 at 6:21pm UTC
I have two versions of the same code (each run for 10 million iterations).
First version:
bool HArrayFixRAM::insert(uint* key, uint value)
{
    // allocate a new content page when the current one cannot hold another row
    if (pLastContentCell + RowLen > pLastContentCellOnLastPage)
    {
        ContentPage* pLastContentPage = new ContentPage();
        pContentPages[ContentPagesCount++] = pLastContentPage;
        if (ContentPagesCount == ContentPagesSize)
        {
            reallocateContentPages();
        }
        pLastContentCell = pLastContentPage->pContent;
        pLastContentCellOnLastPage = pLastContentCell + MAX_SHORT;
    }

    //insert value ============
    uint keyOffset = 0;
    uint headerOffset = key[0] >> HeaderBits;
    ContentCell* pContentCell = pHeader[headerOffset];
    if (!pContentCell) // only fill the header slot if it is still empty
    {
        pHeader[headerOffset] = pLastContentCell;
        pLastContentCell->Type = (ONLY_CONTENT_TYPE + KeyLen);

        //fill key
        for (; keyOffset < KeyLen; keyOffset++, pLastContentCell++)
        {
            pLastContentCell->Value = key[keyOffset];
        }
        pLastContentCell->Value = value;
        pLastContentCell++;
        return true;
    }
    return true;
}
Execution time: 850 ms
Second version:
bool HArrayFixRAM::insert(uint* key, uint value)
{
    // allocate a new content page when the current one cannot hold another row
    if (pLastContentCell + RowLen > pLastContentCellOnLastPage)
    {
        ContentPage* pLastContentPage = new ContentPage();
        pContentPages[ContentPagesCount++] = pLastContentPage;
        if (ContentPagesCount == ContentPagesSize)
        {
            reallocateContentPages();
        }
        pLastContentCell = pLastContentPage->pContent;
        pLastContentCellOnLastPage = pLastContentCell + MAX_SHORT;
    }

    //insert value ============
    uint keyOffset = 0;
    uint headerOffset = key[0] >> HeaderBits;
    ContentCell* pContentCell = pHeader[headerOffset]; // no longer used in this version

    //CONDITION COMMENTED OUT HERE, THE BLOCK IS ALWAYS EXECUTED
    //if (!pContentCell)
    {
        pHeader[headerOffset] = pLastContentCell;
        pLastContentCell->Type = (ONLY_CONTENT_TYPE + KeyLen);

        //fill key
        for (; keyOffset < KeyLen; keyOffset++, pLastContentCell++)
        {
            pLastContentCell->Value = key[keyOffset];
        }
        pLastContentCell->Value = value;
        pLastContentCell++;
        return true;
    }
    return true;
}
Execution time: 405 ms
It's really strange. Logically it should be the other way around: the second version always executes the block, so it does more work, yet it runs twice as fast!
Any ideas, fellows?
Last edited on Oct 26, 2017 at 6:25pm UTC
Oct 26, 2017 at 7:05pm UTC
Compare the generated assembly and the CPU clock cycles per call, not the C++ source or the wall-clock time.
I don't see anything obvious. It could be something on the compiler side, for example removing the condition may have tripped the heuristic that decides whether the function qualifies for inlining. It could also be something cool going on with the cache. Did you test it a bunch of times? It could be that the computer decided to check whether the internet was still there in the middle of the first run.
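To rule out a one-off disturbance like that, one option is to time each version several times and keep the fastest run. Below is a minimal sketch of such a helper. The name bestOfMicroseconds and its parameters are made up for illustration and are not part of the code in question. It still measures wall-clock time, so it complements looking at the assembly rather than replacing it.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the original code): calls `work` exactly
// `runs` times and returns the shortest duration in microseconds.
// Keeping the minimum filters out runs disturbed by other processes.
template <typename Work>
std::int64_t bestOfMicroseconds(Work work, int runs)
{
    assert(runs > 0);
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < runs; ++i)
    {
        // steady_clock is monotonic, so system clock adjustments cannot skew it
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        auto elapsed = static_cast<std::int64_t>(
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
        if (elapsed < best)
        {
            best = elapsed;
        }
    }
    return best;
}
```

It could be called with a lambda that performs the 10 million inserts into a freshly constructed array, once per version, so that both versions start from the same state.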
Oct 26, 2017 at 9:52pm UTC
I am not sure, but I think I have figured out what is happening.
This condition fundamentally changes the semantics of the algorithm.
If we have the condition
if (!pContentCell)
then each iteration has to wait for all the previous ones, because we never know whether an earlier iteration has already changed the state of the array cell.
Without the condition, many iterations can be overlapped, because the last iteration will overwrite the array cell anyway.
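The difference can be sketched with a toy model of the header array. Everything here is an assumption made for illustration: the names fillChecked, fillUnchecked and EMPTY are invented, and the header stores indices instead of ContentCell pointers. The checked variant has to load the slot and branch on it before it may store, while the unchecked variant only stores. Whether the CPU or the compiler really overlaps the iterations depends on the hardware and the optimizer, so this shows the semantic difference rather than proving the cause of the speedup.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Marker for an empty header slot (stands in for a null pHeader entry).
constexpr std::uint32_t EMPTY = 0xFFFFFFFFu;

// With the check: every iteration reads the slot first, and the result of
// that read depends on stores made by earlier iterations. First writer wins.
// Assumes header.size() is larger than the largest value of key >> shift.
void fillChecked(std::vector<std::uint32_t>& header,
                 const std::vector<std::uint32_t>& keys,
                 unsigned shift)
{
    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        std::uint32_t slot = keys[i] >> shift;
        if (header[slot] == EMPTY)
        {
            header[slot] = static_cast<std::uint32_t>(i);
        }
    }
}

// Without the check: every iteration just stores, no read of the slot is
// needed. Last writer wins. Same size assumption as above.
void fillUnchecked(std::vector<std::uint32_t>& header,
                   const std::vector<std::uint32_t>& keys,
                   unsigned shift)
{
    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        std::uint32_t slot = keys[i] >> shift;
        header[slot] = static_cast<std::uint32_t>(i);
    }
}
```

Note that the two variants do not compute the same result when two keys map to the same slot, so the second version of insert is faster but also not equivalent to the first.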