I use a profiling tool. There are some open source ones, but I use a commercial tool called "AQTime". It lets me profile the performance, memory usage, etc. of my code.
I let the compiler generate the assembly code and insert a loop into it. At the start of the program I call clock() to get the number of clock ticks used since the program started, and at the end of the program I return (double)(clock() - startvalue) / CLOCKS_PER_SEC, which is the CPU time used in seconds.
I usually use gprof (with the kprof frontend) and valgrind (callgrind actually) with the kcachegrind frontend. Those are for Linux and BSD, though (all Free & Open Source Software).