For example, a C program is converted to assembly, then there is microcode, which does each instruction in assembly, which then becomes machine code.
No. Microcode, if it exists, is an implementation detail of the CPU. The CPU executes instructions from its "instruction set", and the instruction set has a 1-to-1 correspondence with assembly language. In other words, an assembler translates assembly directly into instructions that the CPU can execute, and every (published) instruction that the CPU executes can be written in assembly.
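To make the assembly-to-instruction mapping concrete, here is a minimal sketch of a toy assembler in Python. The four opcodes are real single-byte x86 encodings, but the table and the function names (`OPCODES`, `assemble`, `disassemble`) are made up for illustration. Real instruction sets also have operands, and sometimes more than one encoding for the same instruction, so treat this as a simplified picture rather than how a real assembler is written.

```python
# Toy assembler: each mnemonic maps to exactly one machine-code byte.
# The opcodes are real single-byte x86 encodings; the table and the
# function names are hypothetical, for illustration only.
OPCODES = {
    "nop": 0x90,   # do nothing
    "ret": 0xC3,   # near return from a procedure
    "hlt": 0xF4,   # halt the processor
    "int3": 0xCC,  # breakpoint trap
}

# Reverse table: because the mapping is 1-to-1, it can be inverted.
MNEMONICS = {byte: name for name, byte in OPCODES.items()}


def assemble(lines):
    """Translate a list of mnemonics into machine-code bytes."""
    return bytes(OPCODES[line.strip().lower()] for line in lines)


def disassemble(code):
    """Translate machine-code bytes back into mnemonics."""
    return [MNEMONICS[byte] for byte in code]


program = ["nop", "nop", "ret"]
machine_code = assemble(program)
print(machine_code.hex())         # 9090c3
print(disassemble(machine_code))  # ['nop', 'nop', 'ret']
```

Notice that microcode appears nowhere in this picture. Whatever the CPU does internally to carry out `ret` is invisible to both the assembler and the program.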
I believe the steps for compiling a program are as follows. Note that interpreted languages work differently.
1. The compiler reads the program and, through a lot of work, converts it to an intermediate representation.
2. It then applies optimizations to the intermediate representation.
3. The intermediate representation is converted to CPU instructions. CPU-specific optimizations get applied here. The result is written out as an object file.
4. After all the source code is compiled, the linker combines the object files, along with any libraries that are included. Mostly what the linker does is resolve external references. In other words, if file1.o refers to int a that was defined in file2.o, then the linker figures out where int a is located and fixes the references in file1.o so they refer to the right location. The result is the executable program.
5. When you ask to run the program, there may be additional fixing up that's needed. For example, initialized data may be decompressed, memory references in the program may need to be adjusted to reflect the actual location in memory where the program runs, and dynamic link libraries may need to be loaded and calls to them bound.
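Steps 4 and 5 can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the dictionary layout of an "object file", the `link` and `load` functions, and the word-sized memory cells are all hypothetical and far simpler than a real object-file format. The point is only to show what resolving an external reference and adjusting addresses at load time amount to.

```python
# Toy model of linking (step 4) and load-time fixing (step 5).
# The "object file" layout below is a hypothetical simplification.

# Each object has a block of memory words, the symbols it defines
# (name -> offset within its own block), and relocations: offsets of
# words that must be filled in with the address of a named symbol.
file1 = {
    "words": [10, 0, 20],   # word 1 is a placeholder for the address of a
    "defines": {},
    "relocs": [(1, "a")],   # "put the address of a into word 1"
}
file2 = {
    "words": [42],          # storage for int a, initialized to 42
    "defines": {"a": 0},
    "relocs": [],
}


def link(objects):
    """Concatenate the objects and resolve external references."""
    image, symbols, pending = [], {}, []
    for obj in objects:
        base = len(image)  # where this object's block starts in the image
        for name, offset in obj["defines"].items():
            symbols[name] = base + offset
        for offset, name in obj["relocs"]:
            pending.append((base + offset, name))
        image.extend(obj["words"])
    for position, name in pending:
        if name not in symbols:
            raise KeyError(f"undefined reference to {name}")
        image[position] = symbols[name]
    # Remember which words hold addresses so the loader can adjust them.
    return image, [position for position, _ in pending]


def load(image, address_slots, load_base):
    """Adjust address words for the location the program is loaded at."""
    loaded = list(image)
    for position in address_slots:
        loaded[position] += load_base
    return loaded


executable, slots = link([file1, file2])
print(executable)                       # [10, 3, 20, 42]
print(load(executable, slots, 0x1000))  # [10, 4099, 20, 42]
```

After linking, the placeholder in file1 holds 3, which is where int a ended up in the combined image. When the image is loaded at address 0x1000, that reference is adjusted to 4099 (0x1003) while ordinary data such as 42 is left alone.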