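class CMatrix {
public:
    explicit CMatrix( int rows, int cols ) {
        m_pArray = new float[ rows * cols ];
        m_nRows = rows;
        m_nCols = cols;
    }
    ~CMatrix() {
        delete [] m_pArray;
    }
    // The class owns a raw array, so copying is disabled to avoid a double delete.
    CMatrix( const CMatrix& ) = delete;
    CMatrix& operator=( const CMatrix& ) = delete;

    // Element access; storage is column-major (i = row, j = column).
    float& operator()( int i, int j ) {
        return m_pArray[ (i + j * m_nRows) ];
    }
    float operator()( int i, int j ) const {
        return m_pArray[ (i + j * m_nRows) ];
    }
    // Raw pointer to the underlying storage.
    float *GetPr() {
        return m_pArray;
    }
private:
    float *m_pArray;
    int m_nRows, m_nCols;
};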
I defined operator() to access the array in CMatrix.
operator() is defined inside the class, so it is an inline function.
So I expected access through operator() to take about the same time as access through the pointer.
But using operator() is 2 times slower than using the pointer (in Release mode).
This is part of the comparison of direct access and operator access.
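The comparison code itself is not shown in this post. As a rough sketch only, the two access paths being compared might look like the following. The helpers FillByOperator and FillByPointer and the accessors Rows() and Cols() are hypothetical names added for illustration; they are not part of the original class.

```cpp
#include <cstddef>

// Condensed copy of the CMatrix class above (column-major storage).
// Rows() and Cols() are hypothetical accessors added for this sketch.
class CMatrix {
public:
    CMatrix(int rows, int cols)
        : m_pArray(new float[static_cast<std::size_t>(rows) * cols]),
          m_nRows(rows), m_nCols(cols) {}
    ~CMatrix() { delete[] m_pArray; }
    CMatrix(const CMatrix&) = delete;
    CMatrix& operator=(const CMatrix&) = delete;

    float& operator()(int i, int j) { return m_pArray[i + j * m_nRows]; }
    float* GetPr() { return m_pArray; }
    int Rows() const { return m_nRows; }
    int Cols() const { return m_nCols; }

private:
    float* m_pArray;
    int m_nRows, m_nCols;
};

// Hypothetical helper: write every element through operator().
// The inner loop walks down a column, so consecutive writes hit
// consecutive addresses in the column-major array.
void FillByOperator(CMatrix& m, float value) {
    for (int j = 0; j < m.Cols(); ++j)
        for (int i = 0; i < m.Rows(); ++i)
            m(i, j) = value;
}

// Hypothetical helper: write every element through the raw pointer.
void FillByPointer(CMatrix& m, float value) {
    float* p = m.GetPr();
    const std::size_t n = static_cast<std::size_t>(m.Rows()) * m.Cols();
    for (std::size_t k = 0; k < n; ++k)
        p[k] = value;
}
```

Both helpers touch the same memory in the same order, so any timing difference between them would come from the access path rather than from the traversal pattern.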
Probably because operators are implemented as function calls. This would result in the CPU possibly having a few cache misses as it follows all the pointers around the place, which would cause a small difference in the time taken. Also keep in mind that pushing the argument values onto the stack and copying them to the function would add a bit of overhead. However, how much of a time difference are we talking about here? None of these things should have any noticeable impact in normal operations. Then again, you are repeating a LOT of times...
operator 1.65023
direct 0.398849
operator 0.399388
direct 0.402081
operator 0.399494
direct 0.401038
I simply executed the tests several times.
Note how only the first run takes a long time.
You could also invert the test, putting direct access before operator access. The result would be similar: only the first run would take a long time.
That means something happens the first time that does not occur later (the engine was cold).
So your test is not adequate and the measurements are not relevant.
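One way to keep the cold first run out of the numbers is to do an untimed warm-up pass and then time several repeats. The following is only a sketch under assumptions: SumBuffer is a hypothetical stand-in for either access test, TimeRepeated is a made-up helper name, and std::clock() is just one possible timer.

```cpp
#include <cstddef>
#include <ctime>
#include <vector>

// Hypothetical workload standing in for either access test: sums a buffer.
float SumBuffer(const std::vector<float>& buf) {
    float total = 0.0f;
    for (std::size_t k = 0; k < buf.size(); ++k)
        total += buf[k];
    return total;
}

// Runs the workload once untimed (warm-up), then `repeats` timed passes.
// Returns the approximate processor time of each timed pass, in seconds.
// `sink` accumulates the results so the compiler cannot discard the work.
std::vector<double> TimeRepeated(const std::vector<float>& buf,
                                 int repeats, float& sink) {
    sink = SumBuffer(buf);  // warm-up pass, not timed

    std::vector<double> seconds;
    for (int r = 0; r < repeats; ++r) {
        const std::clock_t start = std::clock();
        sink += SumBuffer(buf);
        const std::clock_t stop = std::clock();
        seconds.push_back(static_cast<double>(stop - start) / CLOCKS_PER_SEC);
    }
    return seconds;
}
```

Calling TimeRepeated once per access method and comparing the per-pass times (rather than a single run) should make it visible whether a difference persists after the first touch of the memory.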
1. Do not use the wall clock (gettimeofday(), clocks in <chrono>) to measure performance. These measure elapsed time and not processor time.
2. Accessing a large chunk of memory for the first time involves a penalty (cache misses); this can distort the results.
With this added, just before the tests
// to avoid distortion in results due to cache misses
// when accessing the chunk of memory for the first time
std::uninitialized_copy( psrc, psrc+(nRows*nCols), pdst ) ;
and using std::clock() (approximate processor time):