Operator speeds

Jul 10, 2010 at 8:55pm
I don't care that much about the speed of various operators, but I'm making a game engine, so speed could quickly become an issue. My first question of speed: which is faster, addition or subtraction (particularly with the ++ and -- operators)?
Jul 10, 2010 at 8:57pm
Neither.
Jul 10, 2010 at 9:07pm
Ok, another question.

Is there a faster way to draw one surface to another than nested for loops?

Also, is it faster to use a local variable, or members of a class from a pointer? Or are they the same speed..?
Jul 10, 2010 at 9:22pm
PiMaster wrote:
Is there a faster way to draw one surface to another than nested for loops?

Doing it in one loop perhaps.

PiMaster wrote:
Also, is it faster to use a local variable, or members of a class from a pointer? Or are they the same speed..?

I'd say the first. The second is slower because you need to dereference the pointer.
Jul 10, 2010 at 10:12pm
which is faster, addition or subtraction (particularly with the ++ and -- operators)?


Use pre-increment or pre-decrement (--i, ++j, etc.), especially in loops, as the opposite (i++, j++) can involve the creation of temporary data and a copy.

Even if your compiler optimizes this away it's still a good habit to be in.
Jul 10, 2010 at 10:38pm
Use pre-increment or pre-decrement (--i, ++j, etc.), especially in loops, as the opposite (i++, j++) can involve the creation of temporary data and a copy.

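That only applies to more complex data types (class types with overloaded operators). In all other cases, such as a plain int whose result is unused, the two forms are mere formalities that result in no additional code.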
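To make the difference for class types concrete, here is a minimal sketch. `Counter` is a hypothetical type invented for illustration (it is not from the thread); it counts its own copy constructions so the cost of the postfix form becomes visible. Whether a compiler removes that copy after inlining is not shown here.

```cpp
#include <cstddef>

// Hypothetical iterator-like type that counts how often it is copied.
struct Counter {
    int value;
    static std::size_t copies;  // copy constructions performed so far

    explicit Counter(int v = 0) : value(v) {}
    Counter(const Counter& other) : value(other.value) { ++copies; }
    Counter& operator=(const Counter& other) { value = other.value; return *this; }

    // Pre-increment: modifies the object in place and returns a reference.
    Counter& operator++() { ++value; return *this; }

    // Post-increment: has to return the old value, so it copies first.
    Counter operator++(int) {
        Counter old(*this);  // at least one copy happens here
        ++value;
        return old;
    }
};

std::size_t Counter::copies = 0;
```

With this type, `++c` leaves `Counter::copies` unchanged, while `c++` raises it by at least one even when the returned value is thrown away.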

I'd say the first. The second is slower because you need to dereference the pointer.

Accessing local variables is the same as accessing a class member (dereferencing stack pointer+offset). The member pointer needs to be stored in a register, but often just once (i.e. before a loop, but not necessarily again inside).

There's just one reliable way to tell which of several different methods is faster:
create a suitable testcase and measure the time.
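A rough sketch of such a test case, assuming a C++11 compiler with `<chrono>`. The names `sum_to` and `best_time_ns` are made up for this example, and the workload is only a stand-in for whatever code you actually want to compare.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workload: sums 0..n-1 through a volatile variable so the
// compiler cannot optimize the loop away.
inline std::uint64_t sum_to(std::uint64_t n) {
    volatile std::uint64_t sink = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        sink = sink + i;
    }
    return sink;
}

// Calls fn() `repeats` times and returns the fastest run in nanoseconds.
// Returns -1 if `repeats` is not positive.
template <typename Fn>
std::int64_t best_time_ns(Fn fn, int repeats) {
    typedef std::chrono::steady_clock clock;
    std::int64_t best = -1;
    for (int r = 0; r < repeats; ++r) {
        const clock::time_point start = clock::now();
        fn();
        const clock::time_point stop = clock::now();
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
        if (best < 0 || ns < best) {
            best = ns;
        }
    }
    return best;
}
```

A call might look like `best_time_ns([] { sum_to(1000000); }, 20)`. Taking the fastest of several runs is one common way to reduce noise from the scheduler and caches; build with the same optimization flags as your release build, or the numbers will not mean much.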
Jul 10, 2010 at 10:49pm
@Athar:

Are you sure?

struct MyClass { int member; };  // assumed definition; not shown in the original post

void f(MyClass * p)
{
    int x;
    int & y=p->member;

    //do these two have
    //the same access time?
    x=5;
    p->member=10;

    //I'm not talking about
    //this. This is different
    x=5;
    y=10;
}

EDIT: Or did the OP mean something like this?

void f()
{
    int x;
    MyClass c;
    
    int * p1=&x;
    int * p2=&c.member;
    
    *p1=5;
    *p2=10;    
}
Last edited on Jul 10, 2010 at 11:01pm
Jul 10, 2010 at 11:09pm
I think he meant the first one.
x=5 is just one instruction (probably mov [ebp-4],5), while p->member=10 takes two (e.g. mov eax,[ebp+something], mov [eax+something],10). However what I meant is that the first instruction has to be executed just once. All following member accesses for p just take one instruction for the time the compiler decides to keep the address in eax.
Jul 10, 2010 at 11:22pm
Ah, I see! Though, I can imagine cases where both instructions have to be executed because eax has to change between the class-members access operations...

Mmmm... Since you are familiar with assembly, would you also happen to know how references are implemented? Are they implemented as pointers, or does the compiler hardcode the address of the referenced object every time it encounters a reference? (I think the latter would explain why they are so much more limited than pointers.) Because, now that I look at it again, maybe y=10; is not different from p->member=10; above, hahaha :D
Last edited on Jul 10, 2010 at 11:25pm
Jul 10, 2010 at 11:39pm
Ah, I see! Though, I can imagine cases where both instructions have to be executed because eax has to change between the class-members access operations...

Yeah, there's that. But if the functions that are called are short and can be inlined, the compiler can still take special care that eax (or another register) is not used for other operations if it considers that to be profitable.

References are implemented as pointers, so it's basically just the syntax that is different. In fact, if you change your example so that p is passed by reference, the compiler would still produce exactly the same code.
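A small sketch of that comparison. `MyClass` here is an assumed stand-in with a single `int member`, and both function names are invented for the example. On mainstream compilers a reference parameter is typically passed as an address, so the two functions usually compile to the same instructions; the language standard does not promise this, so compare the assembly output (for example with `g++ -S`) if it matters.

```cpp
// Hypothetical class standing in for the thread's MyClass.
struct MyClass {
    int member;
};

// Member access through a pointer parameter.
void set_via_pointer(MyClass* p, int v) {
    p->member = v;
}

// The same access through a reference parameter. Only the syntax differs;
// the generated code is typically identical to the pointer version.
void set_via_reference(MyClass& r, int v) {
    r.member = v;
}
```

Calling `set_via_pointer(&obj, 10)` and `set_via_reference(obj, 10)` has the same observable effect on `obj.member`.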
Jul 11, 2010 at 12:12am
Is there a faster way to draw one surface to another than nested for loops?
If you can make such assumptions as "both surfaces have the same pitch", "both surfaces have their channels in the same order", or "I don't need alpha blending", you can usually optimize pixel copy operations by calling memcpy(), or by copying whole pixels into 32-bit integers.
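As a sketch of what that can look like, assuming 32-bit pixels, identical channel order, equal dimensions, and no blending. `Surface`, `blit_loops`, and `blit_memcpy` are hypothetical names made up for this example, not part of any real graphics library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 32-bit surface; `pitch` is the number of bytes per row.
struct Surface {
    int width;
    int height;
    std::size_t pitch;
    std::vector<std::uint8_t> bytes;

    Surface(int w, int h)
        : width(w), height(h), pitch(static_cast<std::size_t>(w) * 4),
          bytes(pitch * static_cast<std::size_t>(h), 0) {}
};

// Baseline: nested loops, one 32-bit pixel at a time.
// Assumes dst is at least as large as src.
void blit_loops(const Surface& src, Surface& dst) {
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const std::size_t s = static_cast<std::size_t>(y) * src.pitch + static_cast<std::size_t>(x) * 4;
            const std::size_t d = static_cast<std::size_t>(y) * dst.pitch + static_cast<std::size_t>(x) * 4;
            std::uint32_t pixel;
            std::memcpy(&pixel, &src.bytes[s], 4);  // load a whole pixel
            std::memcpy(&dst.bytes[d], &pixel, 4);  // store a whole pixel
        }
    }
}

// Faster path under the stated assumptions.
// Returns false if the dimensions differ, leaving dst untouched.
bool blit_memcpy(const Surface& src, Surface& dst) {
    if (src.width != dst.width || src.height != dst.height) return false;
    if (src.bytes.empty()) return true;
    if (src.pitch == dst.pitch && src.bytes.size() == dst.bytes.size()) {
        // Same layout: the whole surface is one contiguous copy.
        std::memcpy(&dst.bytes[0], &src.bytes[0], src.bytes.size());
        return true;
    }
    // Different pitch: copy one row at a time.
    const std::size_t row_bytes = static_cast<std::size_t>(src.width) * 4;
    for (int y = 0; y < src.height; ++y) {
        std::memcpy(&dst.bytes[static_cast<std::size_t>(y) * dst.pitch],
                    &src.bytes[static_cast<std::size_t>(y) * src.pitch], row_bytes);
    }
    return true;
}
```

Both functions produce the same destination bytes; which one is actually faster on a given machine is something to measure rather than assume.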

But even code that makes none of those assumptions can be fast enough. By combining very careful coding with thread pools, I once wrote an alpha blending routine that had a theoretical top speed on my machine (Core 2 Duo 1.86 GHz) of ~500 640x480x32 blits per second, or 6.5 nanoseconds per pixel. That was four times as fast as my original version. It's all just a matter of being careful with what you write and doing a lot of testing to see what performs better.