So, lately I've been getting into hooking since my job is directly related to it. This question is for the x86 gurus.
Suppose there's a 3-byte instruction somewhere; say, an opcode plus an offset. If I were to do a LOCK XCHG at that position from a different thread with a 32-bit value, would the following sequence be possible?
1. Thread A fetches opcode.
2. Thread B overwrites memory atomically.
3. Thread A fetches overwritten offset.
4. Summoning of nasal demons complete.
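To make the write side of the scenario above concrete, here is a minimal sketch in C11. It works on a plain data word rather than live code, and it assumes the patch site is 4-byte aligned; `make_patch_word` and `patch_3_bytes` are hypothetical names used only for illustration. On x86, `atomic_exchange` on a 32-bit object is normally compiled to an `xchg` with a memory operand, which is implicitly locked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Build the 32-bit value to swap in: three new instruction bytes followed by
   the untouched fourth byte, which belongs to the next instruction. */
static uint32_t make_patch_word(uint32_t current, const uint8_t new_insn[3])
{
    uint8_t bytes[4];
    uint32_t out;

    memcpy(bytes, &current, sizeof bytes); /* keep the in-memory byte order */
    memcpy(bytes, new_insn, 3);            /* overwrite only the first 3 bytes */
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* Atomically replace the word at `site` and return its previous contents.
   Assumption: nothing else writes to `site` between the load and the swap. */
static uint32_t patch_3_bytes(_Atomic uint32_t *site, const uint8_t new_insn[3])
{
    assert(site != NULL);
    uint32_t current = atomic_load(site);
    return atomic_exchange(site, make_patch_word(current, new_insn));
}
```

This only shows that the store itself is a single atomic operation. It says nothing about whether another core's instruction fetch can observe a mix of old and new bytes, which is the actual question.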
What do you mean? You want to atomically patch the memory address? Yes, you may need to patch it with 0xF4EB (the bytes EB F4, a short jump back to Address - 0xA), then patch Address - 0xA, then do the jump from there. This way you not only hook it successfully with your own callback, it is also easy.
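The pairing of 0xF4EB with Address - 0xA follows from how a short jump is encoded: stored little-endian, the 16-bit value 0xF4EB is the byte sequence EB F4, i.e. `jmp rel8` with a displacement of -12 measured from the end of the 2-byte instruction. A small sketch of that arithmetic (the function name `short_jmp_target` is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Compute where a 2-byte short jump (EB rel8) located at `addr` lands.
   The displacement is signed and relative to the end of the instruction. */
static uintptr_t short_jmp_target(uintptr_t addr, const uint8_t insn[2])
{
    assert(insn[0] == 0xEB);             /* opcode for jmp rel8 */
    int8_t rel8 = (int8_t)insn[1];       /* 0xF4 becomes -12 */
    return addr + 2 + (uintptr_t)(intptr_t)rel8;
}
```

With `insn = {0xEB, 0xF4}` and `addr = 0x1000`, the result is `0x1000 + 2 - 12`, which is `0x1000 - 0xA`.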
^
You are changing the behavior of the code by putting all those NOPs...
@helios
Are you sure that situation is even possible? You could say the same thing about a mov instruction: if the value is not aligned, it would take two fetches to get it. What if an atomic write happens between those fetches and changes the value? You would end up with two halves, one old and one new.
I'd try creating a test, maybe: have 3 threads running the code that gets modified, and have another thread atomically modifying the value, switching between two instructions and offsets chosen so that if the new offset is used with the old opcode, it'll bring the code into a bad state.
For example, when I hook X86SwitchTo64BitMode, I need to do an atomic patch using InterlockedExchange. So yes, it is possible.
Anyway, we are not leaving it unfinished lol. If you are that worried about it, simply use a 0xE9 jump, but make sure to build a stub before jumping back to the "working" instructions, and make sure to perform the calculation in the callback. It would work.
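For reference, a 0xE9 jump is 5 bytes: the opcode followed by a signed 32-bit displacement, stored little-endian and measured from the end of the instruction. A rough sketch of the encoding follows; `encode_jmp_rel32` is a hypothetical helper, and it only fills a buffer, it does not write to live code or build the stub.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Encode `jmp rel32` (E9 xx xx xx xx) placed at `from` and targeting `to`.
   Returns false if the target is out of range for a 32-bit displacement. */
static bool encode_jmp_rel32(uint8_t out[5], uint64_t from, uint64_t to)
{
    assert(out != NULL);
    int64_t rel = (int64_t)(to - (from + 5)); /* relative to next instruction */
    if (rel < INT32_MIN || rel > INT32_MAX)
        return false;

    uint32_t u = (uint32_t)(int32_t)rel;
    out[0] = 0xE9;
    out[1] = (uint8_t)(u & 0xFF);             /* little-endian displacement */
    out[2] = (uint8_t)((u >> 8) & 0xFF);
    out[3] = (uint8_t)((u >> 16) & 0xFF);
    out[4] = (uint8_t)((u >> 24) & 0xFF);
    return true;
}
```

Note that 5 bytes do not fit in a single 32-bit exchange, which is presumably why the short-jump approach came up in the first place.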
Can someone explain to us laymen why the race condition is necessary? I've dabbled in modifying assembly code, and WHEN it gets modified never seems to be an issue.
I guess my question could be better asked as: "Why can't this offset address be modified before Thread A fetches the Op Code?". Or "Why does it have to be concurrent?".
I can tell you right away that my question stems from my comparative lack of experience. Even when I've had to work around run-time integrity checks I can get away with putting a break\pause point at the beginning of the function in question, I've never had to modify anything right when it is called. Thanks for the response.
"Why can't this offset address be modified before Thread A fetches the Op Code?"
Because user processes don't have such fine-grained control of the scheduler or the CPU.
"Why does it have to be concurrent?"
The simplest form of remote hooking (i.e. hooking from another process) involves injecting a DLL by starting a remote thread. At this point, there are several ways of guaranteeing atomicity. One of them involves successively suspending and resuming all threads until none of their instruction pointers are within any of the address ranges that interest you.
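The check at the heart of that suspend/resume loop can be sketched as below. This is only the portable part: on Windows the instruction pointers would come from something like SuspendThread followed by GetThreadContext, and a failed check would mean calling ResumeThread on everything and trying again. The names `addr_range`, `ip_in_ranges` and `all_threads_clear` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) that is about to be patched. */
struct addr_range {
    uintptr_t start;
    uintptr_t end;
};

/* True if `ip` falls inside any of the ranges. An IP exactly at `start`
   is conservatively treated as inside. */
static bool ip_in_ranges(uintptr_t ip, const struct addr_range *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(r[i].start <= r[i].end);
        if (ip >= r[i].start && ip < r[i].end)
            return true;
    }
    return false;
}

/* True if no suspended thread's instruction pointer is inside a range,
   i.e. it is safe to patch before resuming the threads. */
static bool all_threads_clear(const uintptr_t *ips, size_t n_threads,
                              const struct addr_range *r, size_t n_ranges)
{
    for (size_t t = 0; t < n_threads; t++) {
        if (ip_in_ranges(ips[t], r, n_ranges))
            return false;
    }
    return true;
}
```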
It doesn't have to be concurrent; it's just the less dumb solution.
"Even when I've had to work around run-time integrity checks I can get away with putting a break\pause point at the beginning of the function in question, I've never had to modify anything right when it is called."
Well, that's the thing. With few exceptions, you can only assume that the function you're hooking can be synchronized if it's yours. If that's the case, there's no point in dabbling in run-time hooking. Just modify the source code and be done with it.