Compiler optimization question

Nov 10, 2021 at 5:37pm
Can anyone help me explain why one a compiler might perform the following optimization. Assuming all variables are declared and initialized appropriately.

1
2
3
4
//Before
for (j=0; j< 8192; j++)
   for (i = 0; i < 200000; i++)
      x[i][j] = x[i][j] + 1;


1
2
3
4
//After
for (i=0; i < 2000000; i++)
   for (j=0; j< 8192; j++)
      x[i][j] = x[i][j] + 1;


Please be specific if you can, I'm not sure why it does this.
Nov 10, 2021 at 5:57pm
IF it did it, then it is smart enough to realize that the [j]s are sequential in memory (fewer page faults, etc). doing it the other way, you skip 2million *sizeofthing bytes every iteration, causing you to need a different page of memory each iteration... ouch!
if the dimensions of i were much smaller, like 100, ... would it do it? I dunno ... at some point its small enough hops not to matter.

these loops can be condensed to 1d if you just increment everything. depends on the full dimensions of x, not just the touched dimensions?
Last edited on Nov 10, 2021 at 5:59pm
Nov 10, 2021 at 6:21pm
> these loops can be condensed to 1d if you just increment

If x is a two-dimensional array, and the second dimension is known,
the optimiser knows how to condense it (and use SIMD instructions).

https://gcc.godbolt.org/z/rK84ca7sb
Nov 10, 2021 at 6:59pm
@GroovyJack

You seem to really be struggling with some basic stuff here. I realize it is new and non-obvious, but I have to wonder: did you really not get any good notes for your course? Surely the curriculum has listed reading materials for you.

If you would just read through them once or twice you could skip a lot of these really basic questions.


I’m not trying to be unkind here. If you want to develop software, one of your most important skills will be the ability to be given a hefty manual and access to the code and familiarize yourself with them in just a few days.


[edit]
And by all means, ask questions. Just show us that you’ve put some effort into understanding them before asking. Right now you are just using us like a software version of Cliff Notes.
Last edited on Nov 10, 2021 at 7:00pm
Nov 10, 2021 at 7:37pm
Try it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
#include <iostream>
#include <ctime>
using namespace std;

int main()
{
   static int x[20000][8192]{};
   int i, j;

   clock_t t1 = clock();

   for (j=0; j< 8192; j++)
      for (i = 0; i < 20000; i++)
         x[i][j] = x[i][j] + 1;

   clock_t t2 = clock();

   for (i=0; i < 20000; i++)
      for (j=0; j< 8192; j++)
         x[i][j] = x[i][j] + 1;

   clock_t t3 = clock();
   
   cout << "First : " << double( t2 - t1 ) / CLOCKS_PER_SEC << " s\n";
   cout << "Second: " << double( t3 - t2 ) / CLOCKS_PER_SEC << " s\n";
}


First : 3.713 s
Second: 0.271 s



But the compiler must be brighter still, because if you add -O3 optimisation it gives
First : 0.202 s
Second: 0.056 s

Last edited on Nov 10, 2021 at 7:41pm
Nov 10, 2021 at 9:23pm
and the 1d runs about 5% faster on my computer in O3

1
2
3
4
5
	int * ip = (int*)x;
   do
   {
	    (*ip)++; ip++;   
   } while(ip < (int*)x+20000*8192);  


and interestingly as a benchmark, the above is only a small amount (not even %s) slower than a benchmark
static char x2[20000][8192]{};
--
memset(x2,1,20000*8192);
--
which does the same thing if they are zero initialized as in the example.
Last edited on Nov 10, 2021 at 9:31pm
Nov 11, 2021 at 10:03am
one of your most important skills will be the ability to be given a hefty manual and access to the code and familiarize yourself with them in just a few days


I wish! IMO the days of having manual(s) that document the system/languages etc are long gone. These days it's all on-line. I well remember back in the 1980's having shelves full of manuals for Data General, DEC, Systime etc. Everything you ever wanted to know about these systems was in these manuals - somewhere. The original DOS came with manuals that even contained bios code listings. IMO sadly these are no longer supplied. If you're lucky you get a 'leaflet' explaining how to plug it in (or whatever) and links to on-line info - which often is neither detailed or complete.
Nov 11, 2021 at 2:24pm
it varies. My complaint is most docs throw the house on you.
me: I want to use your library to compute A*B
their example: get the prime factorization of A and B, load it into ourweirdcontainer, call prep for multiply ... and 2 pages of code later ... they just cannot seem to give simple get started level examples in most systems.
Topic archived. No new replies allowed.