Java...

@rapidcoder

@andywestken: Concluding on performance from the program dependencies is a big wtf to me.

Huh? My last post did not state any conclusion. You are over-extrapolating from my comments.
moorecm wrote:
Memory leaks, just like other bugs, are an absolute certainty as code size increases. Testers should have the mindset that if they aren't finding bugs, they aren't doing their job, IMO.


Do you know that in small software outfits the testers are actually the developers themselves? Only large organizations that employ dedicated testers can afford the luxury of testing in enough detail to catch memory leaks and other bugs.

Your statement seems to imply that testers should bear the responsibility for not catching memory leaks, instead of attributing the problem to the C++ developer?

Now I wait for a software tester to comment on this. This is going to open up another long thread of debate: software tester vs software developer :P
Speaking of memory leaks... I gotta give it to Notch. Only a truly skilled programmer can introduce such a massive memory leak in a garbage collected environment (the first desktop alpha of Minecraft had it pretty bad xD)
xander337 wrote:
Speaking of memory leaks... I gotta give it to Notch. Only a truly skilled programmer can introduce such a massive memory leak in a garbage collected environment (the first desktop alpha of Minecraft had it pretty bad xD)


The above is also true. In a garbage-collected environment, developers tend to get carried away and allocate memory with new in abundance. This I have to admit, but the saving grace is that the program only slows down (as the GC frees memory that is no longer referenced); at least it never crashes, does it? Or maybe it does?

Hmmm.. I guess the debate on the merits of garbage collectors will go on for years to come.
Well I thought it might be interesting to see how GCC C++ compares time-wise against Sun's Java 1.6 when allocating a large number of varying sized chunks of memory. So I have written what I assume to be fairly equivalent pieces of code in C++ and Java to allocate (and deallocate) a bunch of char arrays:

GCC C++
#include <iostream>
#include <ctime>

const int PTR_NUM = 10000;
const int RUN_NUM = 1000;

int main()
{
	char* ptrv[PTR_NUM];

	std::clock_t b, e;

	b = std::clock();
	for(int r = 0; r < RUN_NUM; ++r)
	{
		for(int i = 0; i < PTR_NUM; ++i)
			ptrv[i] = new char[(i % 1023) + 1];

		for(int i = 0; i < PTR_NUM; ++i)
			delete[] ptrv[i];
	}
	e = std::clock();

	std::cout << "Run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';

	return 0;
}
Run time: 11.36

Sun Java 1.6
public class Allocate
{
	final static int PTR_NUM = 10000;
	final static int RUN_NUM = 1000;

	public static void main(String[] args)
	{
		char[][] ptrv = new char[PTR_NUM][];

		long b, e;

		System.gc();
		b = System.currentTimeMillis();
		for(int r = 0; r < RUN_NUM; ++r)
		{
			for(int i = 0; i < PTR_NUM; ++i)
				ptrv[i] = new char[(i % 1023) + 1];
		}
		System.gc();
		e = System.currentTimeMillis();

		System.out.println("Run time: " + String.valueOf((double)(e - b) / 1000));
	}
}
Run time: 69.12

On my system C++ is about 6 times faster than Java.

Perhaps there are optimisation flags for Java that I am unaware of as I have not used Java for quite a few years now. Or perhaps there are other reasons why this is not a very fair allocation test between C++ and Java. But it would be interesting to see if those advocating that Java allocates faster than C++ can provide a bench test to demonstrate that ability.

I am not saying that Java does not allocate faster than C++, only that I am unable to produce evidence of that myself. But then my Java is rather rusty and I may be missing something.

I have done a similar test to see how GCC C++ compares time-wise against Sun's Java 1.6 when calling virtual methods. So, once again, I have written what I assume to be fairly equivalent pieces of code in C++ and Java:

GCC C++
#include <iostream>
#include <ctime>

const int RUN_NUM = 100000;

class Base
{
public:
	static int n;

	virtual ~Base() {}

	static void scall() { ++Base::n; }

	virtual void vcall() = 0;
};

int Base::n = 0;

class Derived
: public Base
{
public:
	virtual ~Derived() {}

	virtual void vcall() { ++Base::n; }
};

int main()
{
	Base* base = new Derived;

	std::clock_t b, e;

	b = std::clock();
	for(int m = 0; m < RUN_NUM; ++m)
		for(int r = 0; r < RUN_NUM; ++r)
		{
			Base::scall();
		}
	e = std::clock();

	std::cout << "Static call run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';
	std::cout << "n: " << Base::n << '\n';

	b = std::clock();
	for(int m = 0; m < RUN_NUM; ++m)
		for(int r = 0; r < RUN_NUM; ++r)
		{
			base->vcall();
		}
	e = std::clock();

	std::cout << "Virtual call run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';
	std::cout << "n: " << Base::n << '\n';

	delete base;

	return 0;
}
Static call run time: 0
n: 1410065408
Virtual call run time: 34.01
n: -1474836480

Sun's Java 1.6
abstract class Base
{
	public static int n = 0;

	public static void scall() { ++Base.n; }

	public abstract void vcall();
};

class Derived
extends Base
{
	public void vcall() { ++Base.n; }
};

public class Allocate
{
	final static int RUN_NUM = 100000;

	public static void main(String[] args)
	{
		Base base = new Derived();

		long b, e;

		b = System.currentTimeMillis();
		for(int m = 0; m < RUN_NUM; ++m)
			for(int r = 0; r < RUN_NUM; ++r)
			{
				base.scall();
			}
		e = System.currentTimeMillis();

		System.out.println("Static call run time: " + String.valueOf((double)(e - b) / 1000));
		System.out.println("n: " + String.valueOf(Base.n));

		b = System.currentTimeMillis();
		for(int m = 0; m < RUN_NUM; ++m)
			for(int r = 0; r < RUN_NUM; ++r)
			{
				base.vcall();
			}
		e = System.currentTimeMillis();

		System.out.println("Virtual call run time: " + String.valueOf((double)(e - b) / 1000));
		System.out.println("n: " + String.valueOf(Base.n));
	}
}
Static call run time: 22.981
n: 1410065408
Virtual call run time: 74.487
n: -1474836480

On my system C++ is about two times faster than Java for a virtual call and many, many times faster for a static call.

The same disclaimers apply, perhaps I am missing something that makes this an unfair test.

EDIT: Replaced spuriously bad Java result with a better more common result.
Hi, can you post the C++ and Java timings? Also, you are testing classes that use inheritance.

Item 1: Can you test without it? That is, a class without extends, and similarly for C++.

Item 2: Can you test primitive types instead of classes? That is, say, int, long, char etc.

I believe the changes to your test programs would be very minimal. I am interested to see how Java handles inheritance and primitive types. Does it allocate them differently? Likewise for C++.

Edit: I just saw you are measuring virtual call timings, correct? I tried your Java with and without extends Base.

Extends Base
Static call run time: 0.711
n: 1410065408
Virtual call run time: 0.652
n: -1474836480

Derived only
Static call run time: 0.539
n: 1410065408
Virtual call run time: 0.564
n: -1474836480

There is a difference in timing.

C++ is terrible
Static call run time: 39.81
n: 1410065408
Virtual call run time: 51.55
n: -1474836480

Galik, I've run your tests without ANY modifications on a Core i5 2.6 GHz laptop, Oracle Java 6, default options (= all additional optimisation options turned off) and got timings orders of magnitude better than yours:

The memory test program output:
Run time: 1.761

The call test program output:
Static call run time: 0.627
n: 1410065408
Virtual call run time: 0.416
n: -1474836480

WTF have you done to your Java? Or are you running that on a VM installed on a VM installed on a 286 with 128 kB RAM? ;)


Whatever it is, such benchmarks are by definition flawed, because they are a very poor model of what useful applications really do. I mean, probably no real application allocates blocks of memory in such a predictable way (just to delete them immediately after) or calls the same empty method of the same object in a loop.
Please post the results of the tests for the other language too, rapidcoder, else those numbers have a greatly reduced meaning.

@sohguanh: What C++ compiler are you using and what flags have you set on it? Similarly, what flags did you set on your Java compiler and/or JVM?

-Albatross
For C++ I did the following:

g++ Allocate.cpp
./a.out

g++ -v
Reading specs from /usr/lib/gcc-lib/x86_64-redhat-linux/3.2.3/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=x86_64-redhat-linux
Thread model: posix
gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-53)

For Java I did the following:
javac Allocate.java
java -cp . Allocate

java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

Edit: I notice my desktop is using a different JVM. It is using Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode)

Oops, I just discovered that Galik posted twice: one post to test virtual calls and one to test primitive-type memory allocation. My results are below, and they slightly favor C++.

Java primitive
Run time: 7.616

C++ primitive
Run time: 7.02

What gets me thinking is why virtual calls are so slow in C++?

Java primitive
Run time: 7.616

C++ primitive
Run time: 7.02


Add zeroing of memory to the C++ code, e.g. using memset, and remove that System.gc from the Java test. It is useless there: you force the JVM to do a full GC when it is not really needed. And ofc, you are then not measuring allocation / deallocation speed, but GC performance.

Otherwise, the tests are not doing the same thing. BTW, you forgot to add optimisation options to your gcc command (at least add -O2) - it should improve the method call benchmark.

---------------
Added later:
Oh, man, one more thing: have you noticed that the Java version of the benchmark allocates twice as much memory as the C++ version? chars in Java are 2 times wider than in C++. Use wchar_t in C++ or byte in Java.
And removing System.gc does not affect performance, at least in my tests.

---------------
Added later:
After changing it to bytes, the Java version became about twice as fast.
So, conclusion: the Java program was able to allocate, zero and GC twice as much memory as the C++ program was able to only allocate and free, and it took Java less than 10% more time. Impressive.
Hi rapidcoder, what is your timing for the primitive type memory allocation?

Since I passed no optimization options to javac and java, I also passed none to g++.

System.gc() is needed because the C++ program uses delete[] to free memory, so it sort of makes them equivalent.

Since Java auto-initializes primitives to default values and C++ does not, I added the line below to the sample code to make it more equivalent.

for(int i = 0; i < PTR_NUM; ++i) {
ptrv[i] = new char[(i % 1023) + 1];
*(ptrv[i]) = '0'; //add this line to explicitly default some values
}

The timing goes up a bit to 7.06-7.07, but that is still slightly faster than Java primitives.
closed account (1vRz3TCk)
This type of benchmarking has always seemed flawed to me.

Some of Galik's code with a small modification: both tests do the same thing
#include <iostream>
#include <ctime>

const int RUN_NUM = 100000;

class Base
{
public:
    static int n;

    virtual ~Base() {}

    static void scall() { ++Base::n; }

    virtual void vcall() = 0;
};

int Base::n = 0;

class Derived
: public Base
{
public:
    virtual ~Derived() {}

    virtual void vcall() { ++Base::n; }
};

int main()
{
    Base* base = new Derived;

    std::clock_t b, e;

    //Base::n = 0;    // <----- Here
    b = std::clock();
    for(int m = 0; m < RUN_NUM; ++m)
        for(int r = 0; r < RUN_NUM; ++r)
        {
            base->vcall();
        }
    e = std::clock();

    std::cout << "Static call run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';
    std::cout << "n: " << Base::n << '\n';
    
    Base::n = 0;
    b = std::clock();
    for(int m = 0; m < RUN_NUM; ++m)
        for(int r = 0; r < RUN_NUM; ++r)
        {
            base->vcall();
        }
    e = std::clock();

    std::cout << "Virtual call run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';
    std::cout << "n: " << Base::n << '\n';

    delete base;

    return 0;
}


As the code is (with line 35, the Base::n = 0; line marked "Here", commented out):
Static call run time: 23.052
n: 1410065408
Virtual call run time: 26.907
n: 1410065408
The second test takes around 4 seconds longer to do the same thing.

Uncommenting line 35 (notice that this is outside any timing points) gives the following:
Static call run time: 23.246
n: 1410065408
Virtual call run time: 23.323
n: 1410065408
The second test no longer takes the extra four seconds to complete.

Edit:
NB: The code was compiled and run multiple times to obtain these results.

Since Java auto-initializes primitives to default values and C++ does not, I added the line below to the sample code to make it more equivalent.

for(int i = 0; i < PTR_NUM; ++i) {
ptrv[i] = new char[(i % 1023) + 1];
*(ptrv[i]) = '0'; //add this line to explicitly default some values
}


No, this is not equivalent: you only set the first character of the array to '0'. Java zeroes the whole array to prevent reading garbage, so in this case that can be up to 1023 characters. Nevertheless, the main problem of Galik's benchmark is still the C++ char vs Java char inequality - either use single-byte array elements in both (char in C++, byte in Java), or two-byte elements (wchar_t in C++, char in Java). It is simply not fair that Java has to allocate twice as much memory.


Hi rapidcoder what is your timing for the primitive type memory allocation ?


0.9 sec for Java using bytes. I haven't tested c++, because I haven't installed a decent compiler.



@sohguanh

My results were achieved with C++ optimisation turned on (-O3). The reason is that all the Java performance claims are based upon its ability to optimise the code.

@rapidcoder

I tested both Java and C++ to compare the relative results. An absolute test of Java alone is not very meaningful. Your hardware is much faster than mine.

rapidcoder wrote:
Whatever it is, such benchmarks are by definition flawed, because they are a very poor model of what useful applications really do. I mean, probably no real application allocates blocks of memory in such a predictable way (just to delete them immediately after) or calls the same empty method of the same object in a loop.


Such bench tests are no measure of application performance. However, they are targeting your claims about how Java operates in specific tasks.

You said that memory allocation in Java is achieved by simply incrementing a pointer. Well bench tests can certainly test how effective that really is. If I remove the deallocation code and change char to byte then I still get C++ allocating much faster than Java.

GCC C++: Run time: 11.25
Java: Run time: 42.849

I get the same results in GCC with or without optimisation turned on for the allocation test. I assume there is little to optimise about allocating memory.

When it comes to testing how fast a function can be called optimisation has a large effect on C++, presumably due to its inlining the static method call.

What optimisation flags are available to Java that can improve its performance here?

With regard to zeroing memory, I don't see why that should be done. The vast majority of memory requests do not want zeroed memory. Are you suggesting that Java zeroes the memory of an allocated object only to overwrite those zeroes with the class initialisation values? That seems a little redundant. And frankly if Java truly zeroes a byte array before the program then overwrites those zeros with the data that it actually wants to be there, then this is an inefficiency on Java's part.

@sohguanh

It is interesting that your relative allocation tests are virtually the same for C++ and Java. I am not sure how to explain the difference between my results and yours. I too am using Linux (though 32 bit) so my initial thought that it was an OS thing is invalidated.
Okay, based on the idea that Java only zeroes the memory for allocated primitive types, I have now produced a new allocation test using user defined types that is strongly favourable to Java:
GCC C++
#include <iostream>
#include <ctime>

const int PTR_NUM = 10000; // Also 100000
const int RUN_NUM = 10000;

class C
{
public:
	char a, b, c, d;
	C(): a(1), b(2), c(3), d(4) {}
};

int main()
{
	C* ptrv[PTR_NUM];

	std::clock_t b, e;

	b = std::clock();
	for(int r = 0; r < RUN_NUM; ++r)
	{
		for(int i = 0; i < PTR_NUM; ++i)
			ptrv[i] = new C;

		for(int i = 0; i < PTR_NUM; ++i)
			delete ptrv[i];
	}
	e = std::clock();

	std::cout << "Run time: " << (double(e - b) / CLOCKS_PER_SEC) << '\n';

	return 0;
}
Run time: 10.24

Sun's Java 1.6
class C
{
	public byte a = 1;
	public byte b = 2;
	public byte c = 3;
	public byte d = 4;
}

public class Allocate
{
	final static int PTR_NUM = 10000; // Also 100000
	final static int RUN_NUM = 10000;

	public static void main(String[] args)
	{
		C[] ptrv = new C[PTR_NUM];

		long b, e;

		System.gc();
		b = System.currentTimeMillis();
		for(int r = 0; r < RUN_NUM; ++r)
		{
			for(int i = 0; i < PTR_NUM; ++i)
				ptrv[i] = new C();
		}
		System.gc();
		e = System.currentTimeMillis();

		System.out.println("Run time: " + String.valueOf((double)(e - b) / 1000));
	}
}
Run time: 3.723

In this test Java is about 2.75 times as fast as C++ (3.723 s against 10.24 s). I presume that the zeroing of memory is causing Java to allocate much slower for byte arrays.

I then repeated the test increasing the number of elements in the array by a factor of 10. This produced the opposite result of C++ performing about twice as fast as Java:

GCC C++
Run time: 116.53

Sun's Java 1.6
Run time: 244.267

So why is Java much slower for larger numbers of allocated objects than C++?

We know that Java must copy every live object to a new location in memory when it runs out of free slots as a means of de-fragmentation. So I am wondering if this is happening when the number of objects reaches a certain size?

One thing is for sure, there are swings and roundabouts when comparing C++ to Java allocation. It seems that for primitive arrays one must pay the price of having them set to zero regardless of whether you need them to be zero or not. For user defined types Java appears to hand out objects rather more quickly than C++ up to a certain point. Then its performance degrades. This could be consistent with the fact that Java has to copy all of its live objects in order to defrag memory.

However C++ must also pay a price for maintaining its free-store as unfragmented as possible. This test is not going to tax C++ in that department because it allocates then immediately deallocates the same blocks.


Your hardware is much faster than mine.


40 times?! O_o
Are you sure you are running Oracle Java 6? Which update?

sohguanh got the Java allocation about 2x faster than C++ (assuming the change from char to byte).

As for zeroing the arrays, Java does this only if it cannot make sure you initialised your array in the code. So, if you create an array with new, and immediately after that you fill it with values, you pay no cost of additional zeroing. The same applies to object creation.
I am using an AMD Turion 64 Mobile technology ML-32 1.8GHz in 32 bit mode.

Java is definitely Oracle (Sun)

java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) Client VM (build 19.0-b09, mixed mode, sharing)

I am using an AMD Turion 64 Mobile technology ML-32 1.8GHz in 32 bit mode.

Java is definitely Oracle (Sun)

java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) Client VM (build 19.0-b09, mixed mode, sharing)


I just noticed something very alarming! Yesterday I ran the Java program on a Linux server, which uses Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode). Today I ran the same program on my desktop PC, which uses Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode).

The server executes it in about 7 secs while the desktop PC takes about 14 secs !!!! That is a 100% difference! So it seems the choice of JVM is a VERY IMPORTANT factor indeed!

So Galik, whatever timing you get from Java may not be representative, since using a different JVM can skew the timing by a lot. You have also not explained why virtual calls in C++ are excruciatingly slow in comparison to Java :P