How compiling and running a program differs from the programming

I wrote a program yesterday that eventually runs quadrillions of calculations. In an attempt to make it as fast as possible (even by thousandths of a second), I rewrote it to repeat a statement such as

countl++

500,000 times before it hits several "if" statements that do other things and check whether it needs to go back and do the countl's again. I know this only cuts out a few if statements, so it shouldn't improve performance by much, but it turns out the program takes about 4 times longer to go through the calculations than it did with a simple if-and-goto loop. Why is it faster to use extra commands to loop than to simply write the commands out repeatedly in the source? I thought loops were just for ease of programming, not for actual speed.
I'm not following you.

Can you show an example of the before and after code?
initial coding example

loopme:

countl++;

if (countl % 1000 == 0)
{
    cout << countl << endl;
}

if (countl != 40000000)
    goto loopme;


new code that takes longer

loopme:
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;

if (countl % 1000 == 0)
{
    cout << countl << endl;
}

if (countl != 40000000)
    goto loopme;


Now that's not the actual code (it'd be way too much to show here) but I believe this shows the basic issue. I'd assume that since the second code doesn't loop as many times it would finish the same work quicker, but I'm finding it takes four or so times longer for the second code to run than the first code. My actual code goes from one "countl++" to the new code with 500,000 "countl++"'s before the loop.
Um... side note... ever heard of the += operator?

countl += 500000; //Increases countl's value by 500000.

Although I'm currently ridiculously sleepy, I would guess that the first code running faster is due to a compiler optimization which for whatever reason didn't happen for the second code.

-Albatross
Well, "countl" is only part of my program; it's actually preceded by two other equations each time. I'm just showing the "countl++" here to show the basics of what I'm doing.
How are you benchmarking this? I can't imagine those would take any different time. The compiler should be optimizing this:

countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;
countl++;


into this:

 
countl += 10;


And in that event the two snippets would be nearly identical.

This makes me think you are doing speed tests in a debug build with optimizations turned off -- in which case analyzing the performance is a complete waste of time because it's not built for performance.
Okay, the full equations I'm using are

k = 30903 * (k & 65535) + (k >> 16);
j = 18000 * (j & 65535) + (j >> 16);
	
countl++;


Marsaglia "mother of all" random number generator with a period of 2^59

My original code was the code I just put up. My later version had those three lines written out 500,000 times. Each time countl == 1 billion it would cout the value of countl. That's how I was benchmarking it, and the difference was so substantial that it was abundantly clear that writing those three lines 500,000 times is at a strong disadvantage compared to using a loop.

My main question is why would there be a big disadvantageous difference? I would think that each extra command (such as an if statement or loop statement) would be extra processing time and by eliminating those extra statements for 500,000 sets of calculations that it would (albeit, minutely) speed things up.
In the second case, where you repeat the lines a gazillion times, you are blowing the processor's instruction cache (I-cache). Your loop is small enough that all of its instructions fit in the processor's internal cache. The unrolled version does not fit, so the processor has to keep fetching instructions from RAM. Accessing RAM is much, much slower than accessing the cache.

NB - for speed you should consider using the pre-increment operator instead of the post-increment operator. It can't hurt, but it may help the compiler generate more efficient code.
Thank you for that information, that explains everything. I did notice something else that seemed odd, though: the second program, with the 500k repeats of code, is about 50 MB on my hard drive, but while running it takes up less than 1 MB of memory. So it would seem it may be running that code in real time off the hard drive, which, as I understand it, would make it far slower still.
No. You are probably building with debug symbols in the executable, in which case most of that file size is debug information.
What's a "debug symbol"?
Names of functions, variables, and such so that when you debug the program you can see names associated with the values instead of just arbitrary memory addresses.