In the CPU, each instruction is carried out in one go, and the CPU then just stays idle for the rest of the instruction's cycles instead of performing the appropriate reads/writes at the individual cycles. Is this what you meant?
I'm not sure what that means either. Is cycle-accurate where instructions are executed at about the same rate they'd be executed on the original machine?
Imagine the system has a second chip that runs on the same clock as the CPU but can access memory and operate independently of the CPU. Suppose:
1. At clock 1000 an instruction will execute that takes 4 clocks.
2. The instruction will write to address 0xCD04 at clock 1003.
3. The second chip will read address 0xCD04 at clock 1002.
An instruction-accurate emulator would execute all side-effects of the instruction instantly at clock 1000, so the second chip would see the updated value of 0xCD04 at clock 1002, whereas in a real system it would have seen the old value. The speed of the CPU would still be emulated, and the next instruction would execute at clock 1004.
A clock-accurate emulator would execute instruction side-effects at the correct clocks, like a real system would.
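Here is a rough Python sketch of the example above, in case it helps to see the two approaches side by side. It is only a toy model, not how any particular emulator is written: the function name `run`, the byte values `OLD` and `NEW`, and the `pending_writes` list are all made up for illustration. Only the address and the clock numbers come from the example.

```python
ADDR = 0xCD04
OLD, NEW = 0x11, 0x99  # hypothetical memory contents before/after the write


def run(clock_accurate: bool) -> tuple[int, int]:
    """Simulate clocks 1000..1003.

    Returns (value the second chip reads at clock 1002,
             value in memory once the instruction has finished).
    """
    memory = {ADDR: OLD}
    pending_writes = []  # (clock, address, value) writes not yet visible
    seen_by_second_chip = None

    for clock in range(1000, 1004):
        if clock == 1000:
            # The 4-clock instruction starts here.
            if clock_accurate:
                # Defer the write to the clock where real hardware does it.
                pending_writes.append((1003, ADDR, NEW))
            else:
                # Instruction-accurate: all side-effects happen instantly.
                memory[ADDR] = NEW

        # Apply any CPU writes that are due on this clock.
        for due, address, value in pending_writes:
            if due == clock:
                memory[address] = value

        if clock == 1002:
            # The second chip reads the address independently of the CPU.
            seen_by_second_chip = memory[ADDR]

    return seen_by_second_chip, memory[ADDR]


if __name__ == "__main__":
    print("instruction-accurate:", [hex(v) for v in run(False)])
    print("clock-accurate:      ", [hex(v) for v in run(True)])
```

In both modes memory ends up holding the new value and the next instruction would start at clock 1004. The only difference is what the second chip observes at clock 1002.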