Non-in-place Merge Sort (maybe)

OK, let me try submitting this again and hope I don't get a server error.

This is a portion of a question from my first HW. It looks ok to me and compiles and runs without errors. However, it does change the array at all and doesn't work even if it did.

From manual testing, I find that the value of "size" in the "mergesort" function is 10 the first time and 1 for the subsidiary arrays "left" and "right". This doesn't make sense as sizeof(integer_array) is the number of bits in the array and sizeof(integer_array)/sizeof(int) should be the actual number of elements in the array, correct? Or is there some special exception for dynamically generated arrays?

I'm really at a loss here and would appreciate any help, be they clues, examples, etc. Thank you.

Here's my code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
// tester.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;

void merge(int *left, int *right) {
	int temp[sizeof(left)/sizeof(int)+sizeof(right)/sizeof(int)];
	int left_pos = 0;
	int right_pos = 0;
	int tmp_pos = 0;
	int left_size = sizeof(left)/sizeof(int);
	int right_size = sizeof(right)/sizeof(int);
	
	while ((left_pos < left_size) && (right_pos < right_size)) {
		if (left[left_pos] <= right[right_pos]) {
			temp[tmp_pos] = left[left_pos];
			left_pos += 1;
		}
		else {
			temp[tmp_pos] = right[right_pos];
			right_pos += 1;
		}
		tmp_pos += 1;

	}
	while (left_pos < left_size) {
		temp[tmp_pos] = left[left_pos];
		left_pos += 1;
		tmp_pos += 1;
	}
	while (right_pos < right_size) {
		temp[tmp_pos] = right[right_pos];
		right_pos += 1;
		tmp_pos += 1;
	}
}

void mergesort(int *a, int size) {
	int* left = NULL;
	int* right = NULL;
	
	if(size > 1) {
		int mid = size/2;
		left = new int[mid];
		
		if(size%2 != 0) {
			right = new int[mid+1];
			right[mid+1] = a[size];
		}
		else {
			right = new int[mid];
		}
		
		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, sizeof(left)/sizeof(int));
		mergesort(right, sizeof(right)/sizeof(int));
		merge(left, right);

		delete [] left;
		delete [] right;
		left = NULL;
		right = NULL;
	}
}

int _tmain(int argc, _TCHAR* argv[]) {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	int size = sizeof(arr)/sizeof(int);
	
	mergesort(arr, size);

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << endl;
	}
	
	return 0;
}
Also, I realize that the reason the original isn't affected is that I'm not doing an in-place partitioning (that is, using positions instead of whole new arrays for each sub-array). The reason is that the assignment explicitly defines the mergesort function as "mergesort(int *a, int size)". Why? I don't know. I also don't know if it's possible to do an in-place merge sort with the recursive part looking as it did on the assignment (I don't think it is as I'd have to pass the mid+1 as the starting point for the "right side" array, I think).

Anyway, again, thanks for your help! This is more me not really being familiar with C++ than actually understanding a merge sort, which is simple and, to me, trivial to implement in Java. I'm new to C++ and pointers confuse me :/. Thanks again!
That's the annoying part about raw arrays, they decay into pointers. Thus, when you use sizeof(), you just get the sizeof() the pointer, rather than the array. Hence why you need a size parameter.

However, since this is C++, I would suggest just using a vector or something...assuming your HW lets you.
I've solved my sizeof() issues by realizing that "left" will always be of size "mid" (the array to be sorted's length divided by 2) and that right will always be "mid" or "mid+1" which I already have conditions for, so I just added a "rightsize" variable that stores the length of "right".

@firedraco: that makes sense now, thanks for explaining it to me! Well, since I've fixed that issue, albeit crudely, I'm going to stick with arrays.

My main issue remains returning a the sorted array, as the "mergesort" function now appropriately cuts the passed arrays in half. As it turns out, instead of "void mergesort" I'm supposed to have "int mergesort". Poor vision and all. Anyway, I don't see how it is possible to have "int mergesort(int *a, int size)" as a function that does a merge sort. The return type makes it impossible as I'd end up returning an integer, not an array. But this could work with "int* mergesort(int *a, int size)" right? I think either my prof is trolling the class (but they seem to have done it just fine) and I'm supposed to do it however I want or I'm really dumb.

Please, any help would be appreciated, thank you!

Here's my revised code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
// tester.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;

void merge(int *left, int *right) {
	int temp[sizeof(left)/sizeof(int)+sizeof(right)/sizeof(int)];
	int left_pos = 0;
	int right_pos = 0;
	int tmp_pos = 0;
	int left_size = sizeof(left)/sizeof(int);
	int right_size = sizeof(right)/sizeof(int);
	
	while ((left_pos < left_size) && (right_pos < right_size)) {
		if (left[left_pos] <= right[right_pos]) {
			temp[tmp_pos] = left[left_pos];
			left_pos += 1;
		}
		else {
			temp[tmp_pos] = right[right_pos];
			right_pos += 1;
		}
		tmp_pos += 1;

	}
	while (left_pos < left_size) {
		temp[tmp_pos] = left[left_pos];
		left_pos += 1;
		tmp_pos += 1;
	}
	while (right_pos < right_size) {
		temp[tmp_pos] = right[right_pos];
		right_pos += 1;
		tmp_pos += 1;
	}
}

void mergesort(int *a, int size) {
	int* left = NULL;
	int* right = NULL;
	int rightsize;

	if(size > 1) {
		int mid = size/2;
		left = new int[mid];
		
		if(size%2 != 0) {
			right = new int[mid+1];
			right[mid+1] = a[size];
			rightsize = mid+1;
		}
		else {
			right = new int[mid];
			rightsize = mid;
		}
		
		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, mid);
		mergesort(right, rightsize);
		merge(left, right);

		delete [] left;
		delete [] right;
		left = NULL;
		right = NULL;
	}
}

int _tmain(int argc, _TCHAR* argv[]) {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	int size = sizeof(arr)/sizeof(int);
	
	mergesort(arr, size);

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << endl;
	}
	
	return 0;
}
Since you are passing the array as a pointer, the modifications you perform on it inside the function carry over to outside. (Similar to passing an object to a function in Java)

As for int mergesort...I would think he meant int* but...I guess you could also just modify the passed array and maybe return the int in a boolean context (not sure how you could fail to mergesort but...). I'd probably just ask if he meant int* if you can.
Thanks for the advice, firedraco. I assume I would have to modify the passed array, but how? "*a = merge(left, right)"? Or maybe return "merge(left, right)" and store it's value in a pointer in main, which would be the new sorted array? Then what should I do with the "merge" function? It must return something, right? Oh! Perhaps I can pass the *a array to "merge" and store contents in that array with something like:
1
2
3
4
5
  for (i=0; i < num_elements; i++)
  {
    a[right_size] = temp[right_size];
    right_size -= 1;
  }
?

I can't really visualize what's happening with my data. What I mean is, what's the first thing to get merged? The second? Third? etc. Recursion seemed simpler when it didn't involve arrays...
Last edited on
Nevermind, that code snippet wouldn't work because it assumes that right_size is the index of the rightmost element in a subarray of the original array, meaning right_size would never be the same value. But in my code, it can since it is simply the size of the right-side sub-array.

Maybe I can send the actual positions in the original array to merge. Something like:

merge(*a, temp, left_pos, (mid+1), right_size)

Where *a is the pointer to the original array, left_pos is the starting position in the original array of the left sub-array, (mid+1) is the starting position in the original array of the right sub-array, and right_last is the last index in the original array for the right sub-array. My only problem would be calculating left_pos and right_last, since I don't pass their values during the mergesort.
I've made some changes to my code, namely cleaning it up to look prettier. The prof said I can use whatever type I want for the functions, but he didn't say I could mess with the function parameters, which means I have to make something gross and space-inefficient. Anyway, here's my cleaned-up code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
// tester.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;

void merge(int *left, int l, int *right, int r, int *arr, int a) {
	int left_pos, right_pos, arr_pos;
	left_pos = right_pos = arr_pos =0;
	
	while((left_pos < l) && (right_pos < r)) {
		if (left[left_pos] <= right[right_pos]) {
			arr[arr_pos] = left[left_pos];
			left_pos++;
		}
		else {
			arr[arr_pos] = right[right_pos];
			right_pos++;
		}
		arr_pos++;
	}
	while(left_pos < l) {
		arr[arr_pos] = left[left_pos];
		left_pos++;
		arr_pos++;
	}
	while(right_pos < r) {
		arr[arr_pos] = right[right_pos];
		right_pos++;
		arr_pos++;
	}
}

void mergesort(int *a, int size) {
	int* left = NULL;
	int* right = NULL;
	
	if(size > 1) {
		int mid = size/2;
		left = new int[mid];
		
		right = new int[size-mid];
		right[size-mid] = a[size];
		
		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, mid);
		mergesort(right, size-mid);
		merge(left, mid, right, size-mid, a, size);

		delete [] left;
		delete [] right;
		left = NULL;
		right = NULL;
	}
}


int _tmain(int argc, _TCHAR* argv[]) {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	int size = sizeof(arr)/sizeof(int);
	
	mergesort(arr, size);

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << endl;
	}
	
	return 0;
}


What I ask now is how I can actually get this to work, since I get a runtime error now that says:

HEAP CORRUPTION DETECTED: after Normal block (#194) at 0x002D7EB8.
CRT detected that the application wrote to memory after end of heap buffer.


I don't see what's wrong, honestly. As far as I'm concerned, my code is perfectly fine. But then, that's why I'm asking for help :). Please help.
Got rid of my runtime error and managed to sort the array I passed! Sort of.... The first two elements of the displayed array are gigantic negative numbers, which I'm guessing means a null value. Next, the numbers that are missing are "3" and "245" and, as such, they aren't in any special place in the original array. Here's the semi-working code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
// tester.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;

int* merge(int *left, int l, int *right, int r, int *arr, int a) {
	int left_pos = 0, right_pos = 0, arr_pos = 0;
	left_pos = right_pos = arr_pos = 0;
	
	while((left_pos < l) && (right_pos < r)) {
		if (left[left_pos] < right[right_pos]) {
			arr[arr_pos++] = left[left_pos++];
		}
		else {
			arr[arr_pos++] = right[right_pos++];
		}
	}
	while(left_pos < l) {
		arr[arr_pos++] = left[left_pos++];
	}
	while(right_pos < r) {
		arr[arr_pos++] = right[right_pos++];
	}

	return arr;
}

int* mergesort(int *a, int size) {
	int* left = NULL;
	int* right = NULL;
	
	if(size > 1) {
		int mid = size/2;
		left = new int[mid];
		
		right = new int[size-mid];
		right[size-mid] = a[size];
		
		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, mid);
		mergesort(right, size-mid);
		return merge(left, mid, right, size-mid, a, size);

		delete [] left;
		delete [] right;
		left = NULL;
		right = NULL;
	}
}


int _tmain(int argc, _TCHAR* argv[]) {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	int size = sizeof(arr)/sizeof(int);
	
	mergesort(arr, size);

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << endl;
	}
	
	return 0;
}


Any ideas? Thanks!
Wait, these values ARE in special places in the array. They are at spots 5 and 10, the last spots in the first left and right halves (after the original array is split into left and right sub-arrays, I mean). Any ideas? I feel I'm getting close to the correct implementation....
Current version of my code, same issues, more compact/neat:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
// tester.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <iostream>
#include <string>
using namespace std;

int* merge(int *left, int l, int *right, int r, int *arr) {
	int left_pos = 0, right_pos = 0, arr_pos = 0;
	
	while((left_pos < l) && (right_pos < r)) {
		if (left[left_pos] < right[right_pos]) {
			arr[arr_pos++] = left[left_pos++];
		}
		else {
			arr[arr_pos++] = right[right_pos++];
		}
	}
	while(left_pos < l) {
		arr[arr_pos++] = left[left_pos++];
	}
	while(right_pos < r) {
		arr[arr_pos++] = right[right_pos++];
	}

	return arr;
}

int* mergesort(int *a, int size) {
	if(size > 1) {
		int mid = size/2;
		int *left = new int[mid];
		
		int *right = new int[size-mid];
		right[size-mid] = a[size];
		
		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, mid);
		mergesort(right, size-mid);
		return merge(left, mid, right, size-mid, a);

		delete [] left;
		delete [] right;
	}
}

int _tmain(int argc, _TCHAR* argv[]) {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	int size = sizeof(arr)/sizeof(int);
	
	mergesort(arr, size);

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << endl;
	}
	
	return 0;
}
Fixed it! I had:

right[size-mid] = a[size]


Which has me trying to access an index in right that doesn't exist so I can write a value in size that also doesn't exist. Thing is, this only happens with odd-number-sized arrays. Can't believe such a simple thing caused such headaches. Oh well, at least I learned from my experience and got my code nice and compact.
Here's my code in case someone else needs to do a merge sort this way:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
#include <iostream>
using namespace std;

void merge(int *left, int l, int *right, int r, int *arr) {
	int left_pos = 0, right_pos = 0, arr_pos = 0;
	
	while((left_pos < l) && (right_pos < r)) {
		if (left[left_pos] < right[right_pos]) {
			arr[arr_pos++] = left[left_pos++];
		}
		else {
			arr[arr_pos++] = right[right_pos++];
		}
	}
	while(left_pos < l) {
		arr[arr_pos++] = left[left_pos++];
	}
	while(right_pos < r) {
		arr[arr_pos++] = right[right_pos++];
	}
}

void mergesort(int *a, int size) {
	if(size > 1) {
		int mid = size/2;
		int *left = new int[mid];
		
		int *right = new int[size-mid];
		right[size-mid-1] = a[size-1];

		for(int x = 0; x < mid; x++) {
			left[x] = a[x];
			right[x] = a[mid+x];
		}

		mergesort(left, mid);
		mergesort(right, size-mid);
		merge(left, mid, right, size-mid, a);

		delete [] left;
		delete [] right;
	}
}

int main() {
	int arr[10] = {536, 34, 5, 6, 3, 42, 24, 42, 5, 245};
	
	mergesort(arr, sizeof(arr)/sizeof(int));

	for(int x = 0; x < sizeof(arr)/sizeof(int); x++) {
		cout << arr[x] << " ";
	}
	cout << endl;
	
	return 0;
}
Last edited on
Topic archived. No new replies allowed.