Array Performance issue

Forum

Forum
General C++ Programming
Array Performance issue

Array Performance issue

denver2020 (222)

Below code runs at different speed, when array is of huge size.
Which one of these is faster?


struct B {
    int i[32];
};

int Sample1(B* array, int Sizearray) {
    int sum = 0;
    for (int a = 0; a<Sizearray; ++a) {
        for(int b =0; b<32; ++b) {
             sum += array[a].i[b];
        }
    }
    return sum;
}


int Sample2(B* array, int Sizearray) {
    int sum = 0;
    for (int b=0; b<32; ++b) {
        for (int a =0; a<Sizearray; ++a) {
            sum += array[a].i[b];
         }
    }
    return sum;
}

lastchance (6980)

Probably the first one, due to cache performance with the array i[] being in contiguous memory.

Any of:
- using iterators;
- calling std:accumulate()
- using threads or OpenMP
would probably speed it up further.

Last edited on

againtry (2313)

@OP

Which one of these is faster?

Why not time them?

lastchance (6980)

#include <iostream>
#include <ctime>
using namespace std;

// denver2020 code  *****************

struct B {
    int i[32];
};

int Sample1(B* array, int Sizearray) {
    int sum = 0;
    for (int a = 0; a<Sizearray; ++a) {
        for(int b =0; b<32; ++b) {
             sum += array[a].i[b];
        }
    }
    return sum;
}


int Sample2(B* array, int Sizearray) {
    int sum = 0;
    for (int b=0; b<32; ++b) {
        for (int a =0; a<Sizearray; ++a) {
            sum += array[a].i[b];
         }
    }
    return sum;
}

// end of denver2020 code  *****************


void timeIt( int fn( B*, int ), B* A, int n, int &sum, double &t )
{
   clock_t tstart = clock();
   sum = fn( A, n );
   clock_t tfinish = clock();
   t = double( tfinish - tstart ) / CLOCKS_PER_SEC;
}


int main()
{
   const int SIZE = 100000;
   B *A = new B[SIZE];
   for ( int r = 0; r < SIZE; r++ )
   {
      for ( int c = 0; c < 32; c++ ) A[r].i[c] = c;
   }

   int sum1, sum2;
   double time1, time2;
   timeIt( Sample1, A, SIZE, sum1, time1 );
   timeIt( Sample2, A, SIZE, sum2, time2 );
   
   cout << "Sample1: "<< sum1 << " in time " << time1 << " s\n";
   cout << "Sample2: "<< sum2 << " in time " << time2 << " s\n";
   delete [] A;
}

Sample1: 49600000 in time 0.001931 s
Sample2: 49600000 in time 0.027526 s

Sample1 is more than 10 times faster.

Last edited on

denver2020 (222)

Thanks lastchance, for your quick clarification.

jonnin (11497)

you can probably unroll it to a single loop instead of a nested one and gain a microscopic bit of speed by exploiting the simple nature of the struct and c++ memory alignment. But if this is an example from a bigger problem, where the struct has more data, that would not work: it works because the struct is basically nonexistent, making it (in memory) just a 2-d array (this may also depend on struct alignment, but you can toggle that around the loops if it is an issue).

Topic archived. No new replies allowed.

C++

Forum

Array Performance issue