Assembly language is just one step above machine code.
With the exception of high-level assemblers, the two are interchangeable: assembler is machine code in human-readable form.
C++ still has C inside it, and C was originally written to be a kind of assembler: most of the statements in C translate into single assembler/machine-code instructions.
The "high level" features you are thinking of are things like the C runtime library functions, C++ classes, etc., but that does not mean C's core constructs are higher level than assembler. Don't forget, the whole reason C was developed was to write UNIX: the first version of UNIX was written in assembler and then translated, by hand, into C, resulting in nearly the same machine code on the original platform.
If a programmer chooses to use higher-level techniques, then you'd be correct. If a programmer chooses to write at the lowest level, there is hardly any difference from assembler, with the rare exception of features on a particular CPU that aren't easily accessible from C. So you are partially correct, but you gave the impression that assembler is "much closer to the metal", and it isn't. C is very close to the metal.
I know this from 4 decades of using it.
"The syntax is technically high level"
Some of the syntax is.
For example:
register int a;
for( a = 0; a < 100; ++a )
{
    //....
}
This forms a loop which, when translated into the final machine code, is likely the same as what you'd write in hand assembler. 'a' will be held in a register (it likely would be even without the keyword 'register'), and the test against 100, the increment of 'a', and the conditional jump back to the beginning of the loop will all be as efficient as hand-written assembler for the same code.
It is rare that an optimization can be implemented in assembler/machine code which isn't available to the C author and the optimizer.
The "theoretical" notion that one can still write assembler code that beats "any modern compiler" has become so rare in practice that it effectively doesn't exist. It did back when RAM was $2000 per megabyte (not gigabyte) and optimizers had no room to breathe, but in the modern era you can try as much as you like and still never manage to do better than merely matching what the compiler generates from C. This has been demonstrated so many times in the last 10 years that the assertion you believe in is, at this point, a myth. There is one caveat: the author must write knowing how the machine works, just as the assembler programmer must. In other words, when a programmer learns to view C (and the C within C++) as an assembler, the results are almost always a match.
In the 80's I could easily match, and subsequently exceed, the compiler's output in about 30% of the high-performance code I wrote. By the late 90's that was down to maybe 5%. At this point, if I could match the compiler's voodoo optimizations 0.1% of the time I'd be doing well. Exceeding the compiler's output is rare, and usually related to something like vectorization of floating-point calculations on bulk data (and even then, only occasionally).
Put another way, the only way a modern assembler programmer can hope to beat the compiler's output is to make assumptions about the CPU and the process at hand that the compiler can't make. This boils down to a theoretical point about languages themselves: the amount of information made available to the compiler in a particular expression.
An example of that, outside the scope of assembler, is the std::sort algorithm compared to the C qsort function. Both implement a form of introsort (quick sort with alternative tails). However, in the C version the only way to implement a universal comparison is a pointer to a function: the C library's qsort calls the comparison function through that pointer, and no further optimization is available to the compiler because the information about the comparison is blocked (made opaque to the optimizer) behind it.
In the C++ version, however, what is passed is evaluated as a function object. That can resolve to a pointer to a function, as it does in C, but the language construct has an opportunity to provide more information than merely a pointer: it can describe the function itself. This means the optimizer is not "blinded" by an opaque pointer; it can see the code itself. The optimizer can then choose to emit the comparison inline, avoiding a function call, making std::sort faster than C's qsort.
This is an example of the kind of analysis used to design languages, where the information we encode in our statements is "understood" by the compiler for what it means, not just what it says.
What I'm saying is that the primary potential "block" preventing a language like C from matching or exceeding hand-written assembler is not the "level" of the language, but the information available within the language's constructs. For C, those constructs were initially designed to be a match for a generic CPU's feature set.
For high-level languages, the "level" of the language does not necessarily mean the result is slower. It can sometimes mean the result is faster, because the extra information can feed what amounts to an "AI" kind of machine which can out-think a human attempting the same thing in assembler, relative to the behavior of a particular CPU.
This means that, occasionally, what appears to be "far from the metal" due to the multi-level nature of a language like C++ can actually produce "at the metal" results, and commonly a match for the best assembler a human has sufficient time to write.
Viewed from a different perspective, what I'm saying is that an assembler programmer must assume they have more knowledge and skill regarding a particular algorithm's implementation than the many PhDs who contributed to the modern C/C++ compilers and to the development of the C and C++ languages themselves.
It isn't likely.