I don't know much about low-level instructions or locking mechanisms, but today I thought of a locking scheme that works if you have an atomic swap instruction.
Say the lock variable is initially 0 (unlocked). Threads that want to lock it create a local variable initialized to 1, then perform the atomic swap with the lock variable. If their local variable is now 0, they have successfully acquired the lock. If it's 1, they failed and must try again later. When the lock owner is done with the lock they just set it back to 0.
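To make the idea concrete, here is a minimal sketch in C++, assuming std::atomic<int>::exchange stands in for the atomic swap instruction. The names lock_var, try_acquire, release and demo are made up for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical lock variable: 0 = unlocked, 1 = locked.
std::atomic<int> lock_var{0};

// One attempt: swap our local 1 into the lock and look at what came back.
bool try_acquire() {
    int local = 1;
    local = lock_var.exchange(local);  // the atomic swap
    return local == 0;                 // 0 means the lock was free and is now ours
}

// The owner simply writes 0 back.
void release() {
    lock_var.store(0);
}

// Single-threaded walk-through of the cases described above.
void demo() {
    assert(try_acquire());   // lock was free, so we get it
    assert(!try_acquire());  // a second attempt fails while it is held
    release();
    assert(try_acquire());   // free again after release
    release();
}
```

A failed attempt writes 1 over a lock that already holds 1, so it does not disturb the current owner.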
I'm assuming this works. Since it is so simple, I expect it already exists and is well known, so what is it called and how often is it used?
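That sounds like a mutex, implemented as a spinlock. Spinlocking means acquiring a lock by repeatedly retrying until it is found clear, busy-waiting in between instead of sleeping. Mutual exclusion of this kind is needed in any code which (1) shares mutable non-thread-local data between threads and (2) needs to be thread-safe. Spinlocks in particular are mostly used where the lock is only held very briefly, for example inside OS kernels; locks that may be held longer usually put the waiting thread to sleep instead.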
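As a rough sketch of the spinning part, assuming C++ with std::atomic and std::thread (the names spin, spin_lock, spin_unlock, shared_counter and run_workers are made up for this example):

```cpp
#include <atomic>
#include <thread>
#include <vector>

std::atomic<int> spin{0};  // hypothetical lock variable: 0 = free, 1 = held
int shared_counter = 0;    // plain int, protected only by the lock

// Keep swapping 1 in until the value that comes back is 0.
void spin_lock() {
    while (spin.exchange(1, std::memory_order_acquire) != 0) {
        std::this_thread::yield();  // be polite while busy-waiting
    }
}

void spin_unlock() {
    spin.store(0, std::memory_order_release);
}

// Start `threads` workers that each increment the counter `iterations` times.
int run_workers(int threads, int iterations) {
    shared_counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) {
                spin_lock();
                ++shared_counter;  // critical section
                spin_unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return shared_counter;
}
```

With the lock in place, run_workers(4, 10000) should return exactly 40000. Without it, the unsynchronized increments would be a data race and updates could be lost.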
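On x86 the xchg (exchange) instruction is atomic whenever one of its operands is in memory; the processor locks the access implicitly, with or without a lock prefix. Naturally aligned loads and stores are each atomic too, but other read-modify-write instructions (add, inc, cmpxchg and so on) are only atomic with respect to other processors when they carry the lock prefix.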
Many processors provide an atomic test-and-set instruction; others (e.g. x86) offer equivalent functionality through an atomic (bus-locked or cache-locked) exchange instruction.
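To illustrate the equivalence, here is a small C++ sketch. std::atomic_flag::test_and_set and std::atomic<bool>::exchange are standard library calls; the wrapper names tas_via_flag and tas_via_exchange are made up. Which machine instruction each one compiles to depends on the target and the compiler.

```cpp
#include <atomic>

std::atomic_flag flag = ATOMIC_FLAG_INIT;  // starts clear
std::atomic<bool> word{false};             // same state, as an ordinary atomic

// Test-and-set: atomically set the flag and return its PREVIOUS value.
bool tas_via_flag() {
    return flag.test_and_set();
}

// Same effect built from an atomic exchange with the constant true.
bool tas_via_exchange() {
    return word.exchange(true);
}
```

In both versions the first call returns false (the flag was clear, so the caller wins) and every later call returns true until the flag is cleared again. The "test" is of the old value, read in the same atomic step as the write.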
Wow, ok. I was aware of test-and-set but I was not aware that it should actually be called set-and-test. I was under the impression that test-and-set was an atomic operation that first tested the value and then set the value after the test.