Heap vs Stack iteration on large array.

Forgive me in advance if this question sounds completely asinine to you.

I am trying to understand memory management, and I am using C++ as a tool.

I've read some information on HEAP vs STACK allocation. And a lot of it points to observation/fact that STACK is much faster than heap for allocating data. No dispute there.

My question is why is it faster to access data stored on heap than it is to access stack data?

Here is a bit of test code that will only compile and run in debug via vc++ (USE /STACK:41943040 linker tag ):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
#include "stdafx.h"
#include <string>
#include <iostream>
#include <malloc.h>
#include <windows.h>
#include <ctime>

using namespace std;

// determines free stack space
size_t stackavail()
{
	static unsigned StackPtr;	// top of stack ptr
	__asm mov [StackPtr],esp	// mov pointer to top of stack
	static MEMORY_BASIC_INFORMATION mbi;			// page range
	VirtualQuery((PVOID)StackPtr,&mbi,sizeof(mbi));	// get range
	return StackPtr-(unsigned)mbi.AllocationBase;	// subtract from top (stack grows downward on win)
}

int _tmain(int argc, _TCHAR* argv[])
{
	string input;

	cout << "Allocating 22mb on stack." << std::endl;
	unsigned int start = clock();
	char eathalfastack[23068672]; // approx 22mb
	auto length = sizeof(eathalfastack)/sizeof(byte);
	std::cout << "Time taken in ms: " << clock()-start << std::endl;

	cout << "Setting through array." << std::endl;
	start = clock();
	for( int i = 0; i < length; i++ ){
		eathalfastack[i] = i;
	}
	std::cout << "Time taken in ms: " << clock()-start << std::endl;
	cout << "Free stack space: " << stackavail() << std::endl;


	cout << "Allocating 22mb on heap." << std::endl;
	start = clock();
	int* heaparr = new int[23068672];
	std::cout << "Time taken in ms: " << clock()-start << std::endl;

	start = clock();
	cout << "Setting through array." << std::endl;
	for( int i = 0; i < length; i++ ){
		heaparr[i] = i;
	}
	std::cout << "Time taken in ms: " << clock()-start << std::endl;
	
	delete[] heaparr;
	getline(std::cin, input);
}



The output is:
Allocating 22mb on stack.
Time taken in ms: 0
Setting through array.
Time taken in ms: 42
Free stack space: 18872028
Allocating 22mb on heap.
Time taken in ms: 17
Setting through array.
Time taken in ms: 34

Why is heap ITERATION faster than stack?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <iostream>
#include <ctime>
using namespace std;

int main()
{
    clock_t t1, t2;
    t1 = t2 = clock();

    // loop until t2 gets a different value
    while(t1 == t2)
        t2 = clock();

    // print resolution of clock()
    cout << t2-t1 << " ms.\n";

    return 0;
}
Measuring performance is a delicate business.

Here's what I get when I compile your program (after fixing the non-standard parts, and after converting the output of clock() to actual milliseconds (you forgot to divide by CLOCKS_PER_SEC)

Allocating 22mb on stack.
Time taken in ms: 0
Setting through array.
Time taken in ms: 0
Allocating 22mb on heap.
Time taken in ms: 0
Setting through array.
Time taken in ms: 0


Zeroes pop up for at least two reasons: first, a single 22Mb allocation/iteration is far too small to measure by clock(), and second, if the compiler manages to properly analyze your code (and it is almost certain in case of stack-based arrays), it will eliminate your code completely, because your allocation of eathalfastack and your assignments to the elements of eathalfastack have absolutely no effect.

To get a decent benchmark, you have to
0) Make your arrays the same size: right now your stack array is an array of char, while your heap array is an array of int!
1) run each test for at least a second
2) make sure every operation you are measuring either directly contributes to a value you're printing or accesses a volatile variable.

Once you do that, you will see, on most modern systems, that heap allocation takes no time, while heap access takes a whole lot longer than stack access: this is because operator new[] does not actually allocate memory, it is allocated as-needed, when accesses.

so,
3) fill the memory you allocate, e.g. with new type[]() rather than new type[]

Here's my modification of your test, and my output:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
#include <iostream>
#include <utility>
#include <chrono>

using namespace std::chrono;
const size_t length = 23068672;

typedef high_resolution_clock::duration dur_t;
std::pair<dur_t, dur_t> stack()
{
        auto start = high_resolution_clock::now();
        volatile int eathalfastack[length] = {};
        auto mid = high_resolution_clock::now();
        for( size_t i = 0; i < length; ++i ){
                eathalfastack[i] = i;
        }
        auto end = high_resolution_clock::now();
    return {mid-start, end-mid};
}

std::pair<dur_t, dur_t> heap()
{
        auto start = high_resolution_clock::now();
        volatile int* heaparr = new volatile int[length]();
        auto mid = high_resolution_clock::now();
        for( size_t i = 0; i < length ; i++ ){
                heaparr[i] = i;
        }
        auto end = high_resolution_clock::now();
        delete[] heaparr;
        return make_pair(mid - start, end - mid);
}

int main()
{   
    dur_t stack_alloc, stack_write, heap_alloc, heap_write;
    for(int cnt = 0; cnt < 100; ++cnt)
    {   
        std::pair<dur_t, dur_t> timing = stack();
        stack_alloc += timing.first;
        stack_write += timing.second;
        timing = heap();
        heap_alloc += timing.first;
        heap_write += timing.second;
    }   

        std::cout << "Time taken in ms:\n"
              << "stack alloc: " << duration_cast<milliseconds>(stack_alloc).count() << "ms\n"
              << "stack write: " << duration_cast<milliseconds>(stack_write).count() << "ms\n"
              << "Heap  alloc: " << duration_cast<milliseconds>(heap_alloc).count() << "ms\n"
              << "Heap  write: " << duration_cast<milliseconds>(heap_write).count() << "ms\n";
}


Resuts: gcc on linux:
Time taken in ms:
stack alloc: 4233ms
stack write: 2761ms
Heap  alloc: 48826ms
Heap  write: 2857ms


intel on linux
Time taken in ms:
stack alloc: 4245ms
stack write: 2753ms
Heap  alloc: 53167ms
Heap  write: 2812ms


xlc on aix
Time taken in ms:
stack alloc: 968ms
stack write: 1888ms
Heap  alloc: 2030ms
Heap  write: 1880ms


as you can see, iteration takes the same time.
Last edited on
Topic archived. No new replies allowed.