What does the LEA instruction do here?

Dec 28, 2010 at 10:06am
I disassembled a simple C program:
	j = i = 2;
	s = ++i + j++;

the assembly code's here:
 80483c4:       55                      push   %ebp
 80483c5:       89 e5                   mov    %esp,%ebp
 80483c7:       83 e4 f0                and    $0xfffffff0,%esp
 80483ca:       83 ec 20                sub    $0x20,%esp
 80483cd:       c7 44 24 1c 02 00 00    movl   $0x2,0x1c(%esp)
 80483d4:       00 
 80483d5:       8b 44 24 1c             mov    0x1c(%esp),%eax
 80483d9:       89 44 24 18             mov    %eax,0x18(%esp)
 80483dd:       8b 44 24 18             mov    0x18(%esp),%eax
 80483e1:       8b 54 24 1c             mov    0x1c(%esp),%edx
 80483e5:       8d 04 02                lea    (%edx,%eax,1),%eax
 80483e8:       89 44 24 14             mov    %eax,0x14(%esp)
 80483ec:       83 44 24 1c 01          addl   $0x1,0x1c(%esp)
 80483f1:       83 44 24 18 01          addl   $0x1,0x18(%esp)
 80483f6:       b8 e4 84 04 08          mov    $0x80484e4,%eax
 80483fb:       8b 54 24 1c             mov    0x1c(%esp),%edx
 80483ff:       89 54 24 08             mov    %edx,0x8(%esp)
 8048403:       8b 54 24 14             mov    0x14(%esp),%edx
 8048407:       89 54 24 04             mov    %edx,0x4(%esp)
 804840b:       89 04 24                mov    %eax,(%esp)
 804840e:       e8 e1 fe ff ff          call   80482f4 <printf@plt>
...

I have some basic knowledge of assembly language, but the LEA instruction here confuses me. Can anyone give me a hand and explain it?
Thanks for your attention and for trying to help me!!
Merry Christmas!
Dec 28, 2010 at 10:22am
I remember it meaning Load Effective Address.
But I'll have to get my assembly book out for a detailed explanation.
Dec 28, 2010 at 12:00pm
According to Google, LEA computes the address of its memory operand and stores that address in its register operand, without reading the memory itself.

But this is hardly the point of your question, and I can't answer it directly. Just to mention, there is no such thing as "the assembly language". Depending on the hardware of your platform and the compiler tool-chain, the syntax of the mnemonics and the semantics of the operations may differ significantly. As a suggestion, find out which assembler comes bundled with your C++ compiler. Both tools should have a reference manual (pdf, chm, etc.) or some other documentation. Consult the reference manual of the assembler and the reference manual of your target device (such as the CPU or MCU, which you may have to download from the internet), as they are the best sources of relevant information.
Dec 29, 2010 at 6:29am
Maybe I should give some information about the environment I did this in:
My OS:
2.6.35.6-45.fc14.i686
GCC version info:
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i686-redhat-linux/4.5.1/lto-wrapper
Target: i686-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,lto --enable-plugin --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch=i686 --build=i686-redhat-linux
Thread model: posix
gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) 

I compiled the source code with gcc's default options.
Thanks, guestgulkan & simeonz, for trying to help me!
There's nothing serious about this problem; I'm just interested in it!
Because my English is poor, it's hard for me to search the manual pages, and I lack knowledge in other related fields.
I'll keep an eye on this thread, waiting for more useful information...
Dec 29, 2010 at 7:09am
Just out of curiosity - is it me, or is the assembly in the original post back to front?
I normally expect destination, source, not source, destination.

Is that the way GCC/GDB displays it?
Dec 31, 2010 at 1:45pm
It's AT&T-format assembly: the source operand comes first and the destination last.
This format is more popular in the Linux environment!
Dec 31, 2010 at 1:46pm
It's displayed by 'objdump -d'.
Dec 31, 2010 at 2:16pm
lea edx, [ecx+4]
we have not only a new addressing mode, but also a new opcode, namely lea.
The mnemonic lea stands for load effective address.
For both the mov and lea instructions,
the right operand can use one of the three addressing modes to address memory.
In the case of the mov instruction, the result is to access this location in memory and reference the
value stored there. In the case of an lea instruction, there is no memory reference; instead,
the address itself is loaded into the register.


The LEA Instruction Revisited
Earlier on, we looked at the lea (load effective address) instruction which had the form:
lea edx, {memory address}
Here {memory address} can be any of the possibilities in the chart we just gave. The result
is to compute the address and place the address into the target register (edx in the above
example). The lea instruction does not actually reference memory, since only the address is
stored, not the contents of the address in memory. This means that lea can be used for all
kinds of interesting operations:
lea edx, [eax+ebx] # edx = eax + ebx
lea edx, [eax+4] # edx = eax + 4
lea edx, [eax+4*ebx] # edx = eax + 4*ebx
lea edx, [eax+4*ebx+2] # edx = eax + 4*ebx + 2
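
As a rough illustration of the arithmetic described above, the value that lea produces can be modeled in C. The function name lea32 and its parameters are made up for this sketch; it only mimics the address calculation (base + index * scale + displacement, wrapped to 32 bits) and, like lea itself, reads no memory.

```c
#include <stdint.h>

/* Models the value a 32-bit lea writes to its destination register:
 * base + index * scale + disp, with wrap-around at 32 bits.
 * The hardware only allows a scale of 1, 2, 4 or 8; this helper
 * does not enforce that. Nothing is dereferenced. */
uint32_t lea32(uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

For example, with eax = 10 and ebx = 3, lea32(10, 3, 4, 2) models lea edx, [eax+4*ebx+2] and yields 24.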
Dec 31, 2010 at 6:32pm
Just like guestgulkan, I have not met this order of operands before.

I think the compiler utilizes the address arithmetic of the CPU in order to perform integral calculations in fewer steps. Actually the method is used/abused quite frequently. I'm going to indulge in some reverse engineering of the code, as it is impressive what the compilers sometimes do under the hood.

 80483c4:       55                      push   %ebp
 80483c5:       89 e5                   mov    %esp,%ebp
According to google, the base pointer is anchored so that it can be used for referring to the local variables with constant offsets. Here it seems to be used only to guarantee that the stack pointer is properly restored on return from the function.
 
 80483c7:       83 e4 f0                and    $0xfffffff0,%esp
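Aligns the new stack frame on a 16-byte boundary by clearing the low four bits of %esp. As far as I know, this is GCC's own convention on x86 (the default for -mpreferred-stack-boundary is 16 bytes), mainly so that SSE data on the stack can be aligned; it is not an ISO C++ requirement.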
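
The masking step can be sketched in C as follows. The function name align_down_16 is invented for this illustration; it just applies the same mask to an ordinary integer.

```c
#include <stdint.h>

/* Mirrors `and $0xfffffff0,%esp`: clearing the low four bits gives
 * the nearest multiple of 16 at or below addr. */
uint32_t align_down_16(uint32_t addr)
{
    return addr & 0xfffffff0u;
}
```

Because the stack grows downward on x86, rounding the pointer down can only add padding to the frame; it never overlaps the caller's data.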
 
 80483ca:       83 ec 20                sub    $0x20,%esp
Allocates space for the local variables and for the arguments of called functions, e.g. printf, in one go. This way no time is wasted allocating stack space on a per-need basis.
 80483cd:       c7 44 24 1c 02 00 00    movl   $0x2,0x1c(%esp)
 80483d4:       00
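Moves the value 2 into one of your variables. If the compilation is code for code, as I believe it is, then this is the initialization for i. The lone 00 on the second line is not padding: the instruction is eight bytes long, its last four bytes are the 32-bit immediate stored least significant byte first (02 00 00 00), and objdump simply wrapped the eighth byte onto a new row.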
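
To see where the trailing 00 belongs, here is a small C sketch that reassembles the immediate from the bytes objdump printed. The names movl_bytes, read_le32 and movl_immediate are made up for this example; the byte breakdown in the comments follows the standard encoding of mov with a 32-bit immediate and an %esp-relative destination.

```c
#include <stdint.h>

/* The eight bytes objdump showed for the movl, across both rows. */
const uint8_t movl_bytes[8] = {
    0xc7, 0x44, 0x24, 0x1c,  /* opcode, ModRM, SIB, 8-bit displacement 0x1c */
    0x02, 0x00, 0x00, 0x00   /* 32-bit immediate, least significant byte first */
};

/* Reassembles a little-endian 32-bit value from four bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* The immediate starts at byte 4 of the instruction. */
uint32_t movl_immediate(void)
{
    return read_le32(movl_bytes + 4);
}
```

Calling movl_immediate() returns 2, the constant from the source.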
 80483d5:       8b 44 24 1c             mov    0x1c(%esp),%eax
 80483d9:       89 44 24 18             mov    %eax,0x18(%esp)
Copies the value of i into j. (Actually copies the result of the assignment to i into j.) That is slower than initializing j with 2 directly, but you compiled with the default options, so optimization is off and the assembly maps to the source code statement by statement.
 80483dd:       8b 44 24 18             mov    0x18(%esp),%eax
 80483e1:       8b 54 24 1c             mov    0x1c(%esp),%edx
Loads i and j into registers.
 
 80483e5:       8d 04 02                lea    (%edx,%eax,1),%eax
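Performs address arithmetic. In AT&T syntax the operand (%edx,%eax,1) means base + index * scale: i (in %edx) assumes the role of the base address, j (in %eax) assumes the role of the index, and the trailing 1 is the scale factor, not a displacement (a displacement would be written in front of the parentheses). So the result is simply i + j, left in %eax. This is like computing &((char *)i)[j] in C notation, except that nothing is dereferenced. Note that both values are still 2 at this point, so the sum is 4; that matches s = i++ + j++ rather than the posted s = ++i + j++, so the listing may come from a slightly different version of the source.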
 
 80483e8:       89 44 24 14             mov    %eax,0x14(%esp)
Now, the computed 'address' is stored into s. Something like cheating.
 80483ec:       83 44 24 1c 01          addl   $0x1,0x1c(%esp)
 80483f1:       83 44 24 18 01          addl   $0x1,0x18(%esp)
The side effects of the increments, applied after the sum has been stored. It must be an unoptimized build indeed; an optimizer would have stored the constant 3 directly.
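
The listing up to this point can be mirrored line by line in C. This is only a sketch: the struct and function names are invented, and the mapping of 0x1c(%esp), 0x18(%esp) and 0x14(%esp) to i, j and s is the guess made in this post, not something the disassembly states.

```c
/* Final values of the three locals, named after the guessed variables. */
struct locals { int i; int j; int s; };

struct locals run_sequence(void)
{
    struct locals v;
    int eax, edx;

    v.i = 2;              /* movl $0x2,0x1c(%esp)     */
    eax = v.i;            /* mov  0x1c(%esp),%eax     */
    v.j = eax;            /* mov  %eax,0x18(%esp)     */
    eax = v.j;            /* mov  0x18(%esp),%eax     */
    edx = v.i;            /* mov  0x1c(%esp),%edx     */
    eax = edx + eax * 1;  /* lea  (%edx,%eax,1),%eax  */
    v.s = eax;            /* mov  %eax,0x14(%esp)     */
    v.i += 1;             /* addl $0x1,0x1c(%esp)     */
    v.j += 1;             /* addl $0x1,0x18(%esp)     */
    return v;
}
```

Running it leaves i = 3, j = 3 and s = 4: the lea only adds the two registers, and both increments happen after s has been stored.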
 
 80483f6:       b8 e4 84 04 08          mov    $0x80484e4,%eax
Loads the address of your printf format string.
 80483fb:       8b 54 24 1c             mov    0x1c(%esp),%edx
 80483ff:       89 54 24 08             mov    %edx,0x8(%esp)
 8048403:       8b 54 24 14             mov    0x14(%esp),%edx
 8048407:       89 54 24 04             mov    %edx,0x4(%esp)
Copies the values of the substitution arguments to printf (i and s). A lot of stack space is still hanging unused. Probably for variables in the function body after returning from the call, or for the arguments of the following calls.
 804840b:       89 04 24                mov    %eax,(%esp)
 804840e:       e8 e1 fe ff ff          call   80482f4 <printf@plt>
Shoves the format string on the stack and calls.
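
Putting the offsets from the whole listing together, the 0x20 bytes reserved by sub $0x20,%esp can be pictured as a C struct. This layout is a hypothetical reading of the disassembly: the struct and field names are invented, and the variable names are the guesses made above.

```c
#include <stddef.h>
#include <stdint.h>

/* The area reserved by `sub $0x20,%esp`, lowest address
 * (where %esp points) first. */
struct frame {
    uint32_t arg0;       /* 0x00(%esp): address of the format string */
    uint32_t arg1;       /* 0x04(%esp): copy of s                    */
    uint32_t arg2;       /* 0x08(%esp): copy of i                    */
    uint32_t unused[2];  /* 0x0c, 0x10: never touched in the listing */
    uint32_t s;          /* 0x14(%esp)                               */
    uint32_t j;          /* 0x18(%esp)                               */
    uint32_t i;          /* 0x1c(%esp)                               */
};
```

With this layout, offsetof(struct frame, i) is 0x1c and sizeof(struct frame) is 0x20, matching the offsets and the allocation size in the disassembly.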