As someone who wants to understand how things actually work underneath, I am putting this question to those with deeper knowledge who can explain it to me.
A while back I was searching on Stack Overflow for how to calculate the trigonometric functions sine and cosine faster, and one person brought up the idea of using a lookup table of precomputed values. The following response received 9 upvotes:
"A precomputed table will almost certainly be slower than just calling sin because the precomputed table will trash the cache"
He responded as if everyone knew what he was talking about.
If that is true, why is it so, and what does he mean by "trash the cache"? Does it apply to all functions? I read about caches on Wikipedia, but I still don't understand why a lookup table would be slower (something about cache hits and misses?). As programmers we tend to assume that a lookup table is the faster option; at least, I do.
The FPU typically has instructions to compute sine, cosine, and tangent in hardware. This will obviously be faster than anything you can do in software. It has nothing to do with any cache.
When the C library must do it in software, there are a number of options available; the most common (IIRC) are an algorithm called CORDIC, possibly a Taylor series, and, yes, lookup tables.
Several hardware caches exist, and any of them can be clobbered by code that branches unpredictably or touches a lot of memory. They are designed to optimize program flow through (a) branches and (b) memory access. For the most part, you can safely ignore them.
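For readers wondering what cache hits and misses look like from a program's point of view, the following is a rough sketch rather than a benchmark. The function names `sum_sequential` and `sum_strided` are made up for this example. Both compute the same sum; they differ only in the order in which they touch memory.

```c
#include <assert.h>
#include <stddef.h>

/* Sum every element, walking memory in address order.
   Neighbouring elements share a cache line, so most reads are hits. */
long sum_sequential(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Sum the same elements, but jump `stride` elements between reads.
   With a large stride and a large array, each read tends to land on a
   different cache line, so many more reads are misses. */
long sum_strided(const int *data, size_t n, size_t stride)
{
    assert(stride > 0);
    long total = 0;
    for (size_t start = 0; start < stride; start++)
        for (size_t i = start; i < n; i += stride)
            total += data[i];
    return total;
}
```

On an array much larger than the cache, timing these two functions will often show the strided version running noticeably slower even though it does the same arithmetic. The size of the gap depends on the machine, the array size, and the stride, so measure on your own hardware before drawing conclusions.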