From the example above, it seems the fabsf() function is redundant because it performs very slowly, and without it the basic C code still works very well (lightning speed IMO). Do you know why?
fabsf is not in C89 as far as I know (it was only added in C99), and it only works on float. You use double, so it will have to convert between double and float a lot if you use it in your program.
fabs is not the same as negation so it's not strange that fabs is slower.
Check your compiled code. It's possible that the compiler removed the negation completely, so in the first case you've only measured the time it takes to perform 100000000 reads and writes to i, while in the second case it didn't know how to remove the call to fabs().
When I make both fVal and fResult volatile doubles, my results are exactly identical across three compilers and two platforms.