I'm trying to find out how the pthread mutex implementation handles CPU cache coherence on SMP kernels with multiple CPUs, specifically on Core 2 chips. For example, when an unlock happens, the implementation has to make sure that ALL writes made before the unlock are visible to other CPUs, and similarly for reads.
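To make this concrete, here is a minimal sketch of the pattern I have in mind. All the names (`payload`, `ready`, `writer`, `reader`, `run_demo`) are made up for illustration. As far as I understand, POSIX guarantees that once the reader sees `ready` set under the mutex, it also sees the write to `payload`, even if the two threads run on different CPUs.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names, for illustration only. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int payload = 0;  /* plain, non-atomic data */
static int ready = 0;    /* flag, only accessed while holding the mutex */

static void *writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    payload = 42;
    ready = 1;
    /* The unlock must make both stores above visible to other CPUs. */
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        if (ready) {
            /* If ready is seen, the earlier store to payload must be seen too. */
            *out = payload;
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        pthread_mutex_unlock(&lock);
    }
}

/* Runs one writer and one reader, returns the value the reader observed. */
int run_demo(void)
{
    pthread_t w, r;
    int seen = -1;
    int rc;

    /* Reset shared state; no other thread exists yet, so no lock is needed. */
    payload = 0;
    ready = 0;

    rc = pthread_create(&r, NULL, reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    rc = pthread_join(w, NULL);
    assert(rc == 0);
    rc = pthread_join(r, NULL);
    assert(rc == 0);
    return seen;
}
```

Calling `run_demo()` from `main` (built with `-pthread`) should return 42. My question is about what happens inside `pthread_mutex_lock` and `pthread_mutex_unlock` to make that hold.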
What exactly does the implementation do in terms of CPU instructions? Does it use MFENCE, LOCK-prefixed instructions, CPUID, etc.?
Thanks in advance!