f is the only one where you can do better than what the language provides. The built-in version does too much work for integers, so an optimized integer version is worth writing; there are several examples on here. All the others are implemented in CPU hardware and are too fast to beat with code.
Here is an example I played with a few years back. You can remove the higher powers for even more speed if you don't need them; all they are good for is values like 2^60. You can probably stop at 16 for most practical uses (16 gets you x^31 or lower powers).
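long long ipow(long long p, unsigned long long e)
{
    const long long one = 1;
    const long long *lut[2] = {&p, &one};  // lut[0] -> current power of p, lut[1] -> 1
    long long result = 1;
    // For each bit of e: multiply by the current power of p if the bit is set,
    // otherwise multiply by 1. No branches, just a table lookup.
    result *= lut[!(e&1)][0]; p *= p;   // p is now p^2
    result *= lut[!(e&2)][0]; p *= p;   // p^4
    result *= lut[!(e&4)][0]; p *= p;   // p^8
    result *= lut[!(e&8)][0]; p *= p;   // p^16
    result *= lut[!(e&16)][0]; p *= p;  // p^32
    result *= lut[!(e&32)][0]; p *= p;  // p^64 (overflows long long for most p)
    result *= lut[!(e&64)][0];
    return result;  // only the low 7 bits of e are used, so e must be <= 127
}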
The same code can be used for raising doubles to integer powers.
If you want non-integer (double) exponents, use the built-in one; in that case you do need the extra steps that it does.
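A minimal sketch of the double-base variant mentioned above, with the ladder cut off at 16 as suggested earlier. The name dpow and the choice of unsigned for the exponent are mine, not from the original code, and I have not benchmarked it against the built-in.

```cpp
// Hypothetical helper: same branch-free ladder as ipow, but with a double base.
// The ladder stops at bit 16, so only exponents 0..31 are handled;
// higher bits of e are silently ignored.
double dpow(double p, unsigned e)
{
    const double one = 1.0;
    const double *lut[2] = {&p, &one};   // lut[0] -> current power of p, lut[1] -> 1
    double result = 1.0;
    result *= lut[!(e & 1)][0];  p *= p; // p is now p^2
    result *= lut[!(e & 2)][0];  p *= p; // p^4
    result *= lut[!(e & 4)][0];  p *= p; // p^8
    result *= lut[!(e & 8)][0];  p *= p; // p^16
    result *= lut[!(e & 16)][0];
    return result;
}
```

For example, dpow(2.0, 10) gives 1024.0. Anything above an exponent of 31 needs the extra lines put back, the same way ipow goes up to 64.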