I have some old code that I use for ad hoc profiling of algorithms. It is Microsoft/Windows specific and uses the CPU Time Stamp Counter (maybe Intel CPU specific, I haven't checked). I would like to have a more portable version of this, but before I spend time researching it, I was wondering if anyone already has something that they are willing to share.
/*****************************************************************************\
| This program is distributed in the hope that it will be useful, |
| but WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. |
\*****************************************************************************/
#define STRICT
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <iostream>
/*****************************************************************************\
The Time Stamp Counter is a 64-bit register present on all x86 processors
since the Pentium. It counts the number of ticks since reset.
\*****************************************************************************/
inline unsigned __int64 GetCycleCount()
{
    // Emit the RDTSC opcode (0F 31); the counter is returned in EDX:EAX.
    _asm _emit 0x0f
    _asm _emit 0x31
}
class Timer
{
    unsigned __int64 start_cycle;
public:
    unsigned __int64 overhead;
    Timer()
    {
        // Measure the cost of an empty Start()/Stop() pair once,
        // so that Stop() can subtract it from later measurements.
        overhead = 0;
        Start();
        overhead = Stop();
    }
    void Start()
    {
        start_cycle = GetCycleCount();
    }
    unsigned __int64 Stop()
    {
        return GetCycleCount() - start_cycle - overhead;
    }
};
/*****************************************************************************/
int main()
{
    Timer timer;
    timer.Start();
    Sleep(1000);
    // Cycles counted in one second, scaled to tenths of a MHz.
    unsigned cpu_speed = (unsigned) (timer.Stop()/100000);
    std::cout << "Timer overhead: " << timer.overhead <<
        " clock cycles." << std::endl;
    std::cout << "CPU Speed: " << cpu_speed/10 << "." <<
        cpu_speed%10 << " MHz." << std::endl;
    return 0;
}
Timer overhead: 285 clock cycles.
CPU Speed: 2415.4 MHz.
I don't have an answer for you, but I can say two things:
1) The TSC is specific to x86 processors (Intel and compatibles), so it is not portable to other architectures.
2) According to Intel, the TSC cannot/should not be used for high-precision timing.
The reason for #2 is that on mobile processors and most newer processors that do dynamic CPU frequency scaling, the TSC does not increment at a constant rate.
You might look into using the HPET instead. On Linux, it is accessed via clock_gettime(), but I don't know about other OSes.
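For what it's worth, here is a minimal sketch of a portable variant, assuming a C++11 compiler is available. It keeps the Start()/Stop() interface of the Timer class from the question but reads std::chrono::steady_clock instead of the TSC, so it reports elapsed nanoseconds rather than clock cycles. The names PortableTimer, overhead_ns and time_sleep_ns are made up for this example, and the actual resolution of steady_clock depends on the platform and standard library.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical portable counterpart of the Timer class in the question.
// It measures elapsed wall-clock time in nanoseconds, not CPU cycles.
class PortableTimer
{
    using clock = std::chrono::steady_clock;  // monotonic, never adjusted backwards
    clock::time_point start_;
public:
    std::int64_t overhead_ns = 0;

    PortableTimer()
    {
        // Same calibration idea as the original: time an empty
        // Start()/Stop() pair and subtract it from later results.
        Start();
        overhead_ns = Stop();
    }

    void Start()
    {
        start_ = clock::now();
    }

    // Elapsed nanoseconds since Start(), minus the calibrated overhead.
    // Signed, because a very short interval can come out below the overhead.
    std::int64_t Stop() const
    {
        const auto elapsed = clock::now() - start_;
        return std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count()
               - overhead_ns;
    }
};

// Illustrative helper: time a sleep of the given length and return nanoseconds.
inline std::int64_t time_sleep_ns(int milliseconds)
{
    PortableTimer timer;
    timer.Start();
    std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));
    return timer.Stop();
}
```

Calling time_sleep_ns(1000) plays the role of the Sleep(1000) measurement in the original main(), except that the result is a duration, so it cannot be turned into a CPU speed estimate. If you really need cycle counts, you still have to fall back on something platform-specific.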