Why Assembly Sucks

Assembly, my old friend. The kind of friend you shoot in the back when he turns around, and you know it's for the best.

The logic of coding in assembly isn't that much different from C++, with some exceptions. When coding, it feels like I'm taking forever to build the simplest logic, since everything takes five times as many steps as any line in C++.

But I could live with that. Coding in Assembly wasn't too hate-filled. But then we reach the biggest issue with Assembly.... IT'S INCONSISTENT TO THE POINT WHERE IT FEELS LIKE YOU'RE CONSTANTLY LEARNING DIFFERENT VERSIONS OF ASSEMBLY.

All assignments so far have used global _start. The latest assignment wants output, and the professor used global main instead. So I thought, "whatever, same logic and coding!" Well, apparently not. Nothing I'm coding is giving even a hint of working. Variables don't even change values when I directly mov a new value into them from a register. Nothing is working, I have barely any idea why, and my professor hasn't responded to my email after almost a full day now. The assignment itself isn't difficult logically, but it's impossible to code when your code doesn't work.


Assembly needs to be taught more as a concept, with more attention to the details of how it works. This class is just a bunch of assembly functions thrown at us, and then assignments telling us to use the functions.

The professor is a nice guy, but he couldn't teach if his life depended on it. Just a barrage of information with barely any context.


Side note, we're going over big-o notation and stuff this semester with my other CS class.


Well, I'll end my rant now. Also, the assembly set was for Ubuntu. The Windows Subsystem for Linux and the Ubuntu app were very handy for setting up quickly without a virtual machine.
yes, assembly was never meant for software development at the application level. It's used when you translate a language (if you write compilers, for example) and for the very occasional high-performance tiny routine (as often as not, to use a CPU instruction that can't be expressed in your language, like "how many bits are set in this 64-bit int"). The last time I used assembly frequently was when it was the only way to get a high-res timer (CPU clocks to secs), the only way to do endian flipping in one clock cycle, and a couple of other things of that nature.

yes, it's inconsistent. You will see that the code to inline it in C++ also varies a fair amount across compilers.

It sounds like the class could be better. It may be the kind of class that is pushed onto the lowest ranking prof or draw straws or something. The generation where everyone knew how to code in it directly is aging off, and finding a prof that knows that stuff may be hit or miss now at many schools.
Eh. Ball peen hammers make poor screwdrivers. There are things you can do in Assembly that are otherwise completely impossible in pure C or C++. It's unfortunate your course seems to instead have taken the "just implement in Assembly the things you would normally only do in a high level language", which is what every hack with zero experience in real world Assembly comes up with in five minutes.

A few examples of things I've done:
* Generate x86 or x86-64 code from BPF opcodes[1] and load into kernel.
* Coroutines with context switching.
* Use specific instruction sets. AES-NI in particular.


[1] A Berkeley Packet Filter program is a series of instructions that describe the logic used to separate network packets into two sets (usually "allow" or "block"). BPF programs have limited memory and can only jump forward, so they're FSMs, not Turing machines.
I suspect if you wanted some jobs at Intel et al or even something exotic at a quantum computing startup of some kind a smattering of Assembly experience might be useful to get past the first interview.

Perhaps even a sweatshop in Shenzhen chugging out microprocessor knockoffs to be flogged on Ebay or similar might take you on.
It may be the kind of class that is pushed onto the lowest ranking prof or draw straws or something.

There used to be only one professor who taught it; the professor teaching it now claims he learned assembly from the previous professor when he was a student! So, a total of two professors who teach assembly.


Eh. Ball peen hammers make poor screwdrivers. There are things you can do in Assembly that are otherwise completely impossible in pure C or C++.

I could see that. But yeah, these are just programs you could easily write in C++ being made into assembly. In fact, some of these programs I wrote in C++ first and then just wrote the assembly code to follow the same logic.

It's made me a bit confused though. I was looking at my C++ assignment for the other class for a whole 10 minutes with no idea what to do. My brain was still thinking in assembly.


I suspect if you wanted some jobs at Intel et al or even something exotic at a quantum computing startup of some kind a smattering of Assembly experience might be useful to get past the first interview.

It's useful in a lot of places - like when breaking a program down to its assembly to see what it does. It just sucks the way we're coding with it and learning.


I don't even really understand assembly. I have no idea what the difference between global _start and global main is, no idea how to get things to output to the execution screen, no idea how many of the functions operate (like cvtsi2ss??? I use it but have no idea what it REALLY does), and I don't even understand how these macros work. Trying to figure out macros right now since they're the last part of this assignment, but I'm doing it early and hoping he'll go over it in class. He already did, but in the usual "these exist, let's move on" way. Can't understand how to use a macro like that.
I have no idea what the difference between global _start and global main is
I don't know what assembler you're using or what you're targeting, but I would guess there's no difference, and "global <identifier>" just declares a symbol that's accessible from outside the object file, like extern in C/C++.
How the entry point for the program is decided, though, I don't know. Presumably it's either the first function in the input or the entry point is stated when you run the assembler from the command line.

no idea how many of the functions operate (like cvtsi2ss??? I use it but have no idea what it REALLY does)
cvtsi2ss in particular is an instruction.
https://www.felixcloutier.com/x86/cvtsi2ss
Instructions are implemented in the CPU. Nowadays they're implemented by executing tiny microcode programs consisting of a series of micro-operations, which are the most fundamental actions that the CPU can perform. Microcode is dependent on the microarchitecture of the CPU (how the CPU is designed internally) and is invisible and inaccessible to the programmer.
In CPUs from 20 or so years ago, and possibly in some rudimentary modern CPUs (I'm not really sure if there are any new CPUs that don't use microcode), instructions would instead be implemented directly on the CPU's circuitry.
In a way, machine code is still kind of a high level language, and modern CPUs act as machine code interpreters that translate in real time machine code instructions into microcode instructions.

I don't even understand how these macros work
Assembly macros are pretty much the same as C macros. They're just text manipulation functions. Some assemblers have slightly more powerful macros, but they're still just code generation tools.
I don't know what assembler you're using or what you're targeting, but I would guess there's no difference

I thought the same at first, but changing from one to the other would break my programs. Can't be certain why, but I assume they're probably different.

cvtsi2ss in particular is an instruction

Well, I knew that much, just have no idea what it does! Too lazy to read documentation at the moment.

In a way, machine code is still kind of a high level language

I was thinking that as well when we started using some of these instructions. I figured nothing would be done for you, but there are plenty of instructions that are there to make life easier.

Assembly macros

I just wasn't sure how to code them. I kept getting errors, which I realized were because the assembler creates a separate instance of the macro each time it's called, so there can be naming conflicts; I needed to use "." to create a local label.

Assembly is a bit of a pain, but when you get it working it makes sense and is cool. But the amount of details you have to deal with is just too much to ever consider coding with it.
no idea how to get things to output to the execution screen

I don't remember exactly, but I think you put a pointer into a register and call an interrupt, and it dumps that memory location as an ASCII C-string to the console. interrupt 21 maybe?

of all the things... you should have coded hello world first, maybe google hello world in intel assembly... it will say how to do it. if you can't output, you can't do anything useful!


But the amount of details you have to deal with is just too much to ever consider coding with it.
this is why people started making other languages, on like day 3 after building the first computer...
Yeah! For the latest assignment he gave us a template that outputs the variables he wants to the screen. I'm not fully sure how it's done, but with the template I can output to the screen by simply copying what he did.

Before that, I'd use debugger commands like x/dw &Answer to look at important variables. You could write the commands in a .txt file and run them all at once with a command if there were many variables to look at.

this is why people started making other languages, on like day 3 after building the first computer...

I really wanted to just code in C++ and then break it down to assembly and turn those in. I tried to see if it would work, but obviously not. Different assembly set and different who knows what else. Every time I look up something in assembly, I get assembly code that is WAY different than the assembly I'm coding. Makes finding help for specific issues almost impossible without posting a question.
the compiler should make assembly that will assemble. It has a bunch of comments and weirdness that your program would never have, and it has access to libraries and macros and more, and it will do things differently than a human, because it can't make assumptions that a human can know to be true for their code (e.g. you can know that your 64-bit int holds byte-sized values and fits in AL, but the C++ compiler can't always know that). It knows instructions that you haven't covered yet. It's not going to be a way to cheat, but it should actually assemble.

we did 2 or 3 programs directly into the computer's ram, via the old 'debug' program for DOS, before we started writing text file programs. :)
The more usual problem is that the compiler's assembler's syntax is completely different to the syntax of the assembler one is using, so the output is basically useless without a full rewrite.
This is particularly a problem with GCC, since it spits out AT&T syntax for GAS, and no other assembler (IIRC) uses AT&T.
I'm pretty sure GCC can output Intel syntax (the flag is -masm=intel)
The more usual problem is that the compiler's assembler's syntax is completely different to the syntax of the assembler one is using

The exact issue. Along with jonnin's point that the assembly code is completely ridiculous. Even if it would compile with the assembly set we use, it would be so obvious I cheated, since I could/would never have written that code.

Assignments are too easy to risk cheating anyway, but they tend to be time consuming. I wrote the C++ program for the sort we needed to use, did it in 14 lines with my regular programming style (so it could have been much shorter). Took well over 30 lines to code in assembly.

However, I did like that in assembly, swapping the values of two array elements felt more intuitive than in C++. In assembly, you just throw both into registers, then mov the values into the elements. In C++ (without the swap function, so that I could follow the logic while recoding it in assembly), it takes 3-4 lines and the order of swapping matters so you don't overwrite a value you still need.

It just felt more convenient to throw the values into registers; it was minimalistic in terms of logic. At worst, it's an extra line of assembly, since in C++ you can initialize the temp variable with the needed value. Of course, C++ wins out altogether if you use the simple one-line swap function.
I mean, you can write a swap exactly like that if you want:
template <typename T>
void swap(T &a, T &b){
    auto old_a = a;
    auto old_b = b;
    a = old_b;
    b = old_a;
}
helios wrote:
This is particularly a problem with GCC, since it spits out AT&T syntax for GAS, and no other assembler (IIRC) uses AT&T.

every assembler on every Unix flavor uses the AT&T (Unix) syntax. On x86, the surviving Unixes are Linux, *BSD, MacOS and Solaris, but more importantly, anything non-x86 (arm, power, sparc, what-have-you) has the same look and feel. It's Intel/MS that is the odd one out, besides being unintuitive and verbose.
c++ can swap bigger things though. not everything fits in a register :)
when you get to the FPU you will have some fun, if you do get there. The FPU has large doubles (usually 80 bits instead of C++'s 64-bit format) and its own play area.
I mean, you can write a swap exactly like that if you want

I meant more of the logistics of simply dumping the values in a register and then assigning them as needed. In C++, you're making variables while in assembly it's just thrown into a register that you were going to have access to anyway. Just feels right. I'd rather never touch assembly again though.


when you get to the FPU you will have some fun, if you do get there. The FPU has large doubles (usually 80 bits instead of C++'s 64-bit format) and its own play area.

We'll see :((
_start vs main: _start is the actual entry point to the program. The linker will do whatever the OS requires to go there. main is the C or C++ entry point. The C/C++ runtime library contains a _start function that calls the constructors of global variables and sets up the argc and argv parameters for main, then calls it.

Too lazy to read documentation at the moment.
There's the root of all your assembly problems. RTFM, my friend. RTFM.

In CPUs from 20 or so years ago, and possibly in some rudimentary modern CPUs (I'm not really sure if there are any new CPUs that don't use microcode), instructions would instead be implemented directly on the CPU's circuitry.

You've got it backwards. Original processors were hard-coded. As they became more complex, they started using microcode. I think this was around the mid '80s. Then in the early '90s Hennessy and Patterson showed that computers could be made faster by employing less complex instruction sets, not more, and RISC was born. This meant going back to direct execution (no microcode) and resulted in a big leap in processor speeds at that time. Today I think every processor is RISC except the x86, which brute-forces speed at the cost of enormous numbers of transistors. In particular, ARM, which powers most small devices like phones and tablets, is a RISC processor.

The big insight that led to RISC was the realization that, for the most part, people no longer wrote assembly code, compilers did. Compilers can easily deal with whacky time saving measures in the instruction set that people would find hard to keep straight.

I haven't written any assembly code in years, but I do find it helpful in debugging. Sometimes the symptom shows up in a library routine and you have to slog through the assembly to see what the problem is. In those cases, you have to know assembly obviously.

Also, I think it's very helpful to understand what the processor is doing under the hood. There are times when this helps you write better code.
You've got it backwards. Original processors were hard coded. As they became more complex, they started using microcode.
Isn't that what I said? Sure, it was 35 rather than 20 years ago, but the point is that older processors were hard-wired, and then they became microcoded.

Today I think every processor is RISC except the x86, which brute forces speed at the cost of enormous numbers of transistors.
Well, x86 is RISC as well, but presents a CISC API (machine language).
I'm saying that processors went from hard coded, to micro-coded, and back to hard-coded via RISC.