Garbage Collection

nice one
I believe that on Android, apps run within the Dalvik runtime, and it is pretty aggressive about garbage collection. If you use the SDK-provided debugging tools, you will see its messages scrolling whenever your code does some image manipulation or other graphics-intensive operation. The final compiled binaries of an Android app are not in the Java .class format; they are in Dalvik format. The source code is Java-style, but the compiled output is Dalvik bytecode, which I presume is specially tuned to work well with Dalvik.

Now, is Dalvik, as a runtime with a garbage collector, a bad thing?
Luc Lieber, that was very interesting!
closed account (S6k9GNh0)
Really, Java has many problems other than just its GC. I started out in Java, and within about a month I was really tired of it. I cannot take the idea of "throw more hardware at it" very seriously. While memory is becoming more available, that doesn't mean you should use more of it unnecessarily. I watched the video, and it decides that 2~5 GB of memory for each application is all right. I can easily run a game server, a Mumble server, a mail server, a website, and a MySQL database server on a 3 GB server (with enough processing speed). If any one of those were made in Java, I would have trouble keeping my server stable. The video completely ignores anything below a $7k server, a range most small or medium businesses never go beyond, depending on their needs.

I don't like this video because of its philosophy. It doesn't solve my problem, and it doesn't sound like it's for client-side applications either. It just assumes that solving the ridiculously long, inevitable pauses that come with using Java and its GC in the first place somehow fixes a fundamental issue with Java.

Although I'll make it very clear that Java has a ton of issues on the PC beyond just its GC, and that's not the only reason I don't use it.
Isn't more RAM much easier (and less bounded) to achieve than more processing speed?
(If that wasn't the trade-off you were pointing out, forgive me.)
closed account (S6k9GNh0)
Yeah but I have "the right" philosophy.

Also, I don't give a damn about corporate applications. Reducing quality and efficiency just because you have the money to do so doesn't make the resulting application any more productive for society. I would even claim that encouraging it is counter-productive.
Depends. For internal applications, I don't see the point of spending hours/days/weeks/months on optimization if it can be avoided with a minor [for them] investment. I'd go so far as to say that even for medium (and all but the smallest small) businesses, a $7k investment in server hardware is much, much cheaper than the man-hours spent on optimization.

If it's an application they intend to sell, then it depends on the customer type, the number of customers, and the product. Keep in mind that those extra man-hours will translate into a higher product price.
It seems some of you haven't understood the purpose of the Azul engineers' work. They sell servers. I've just recently talked to one of them personally. They are trying to solve the problem of very large heaps, not because you have to use 64 GB for Java applications *now*. Java's standard GC is perfectly OK with heaps of 2-8 GB. But being able to use larger heaps, still without pauses, opens new possibilities. RAM is getting ridiculously cheap, and making use of it is a *good thing*.


I watched the video, and it decides that 2~5 GB of memory for each application is all right. I can easily run a game server, a Mumble server, a mail server, a website, and a MySQL database server on a 3 GB server


Good joke. Say that to Google, Facebook, Amazon or Twitter. They would pay you $10000000 if you managed to keep their services up and running on a single 3 GB server. BTW: the latter three use a full Java stack for their databases. :D And Google uses it for the main service it earns money from: AdWords and AdSense. Do you suggest they are doing it wrong?

You see, Azul is *not* targeting small companies that need a website to publish their blog. They are targeting companies for which buying a $20k server is like buying a new set of paper clips.


Now, is Dalvik, as a runtime with a garbage collector, a bad thing?


Dalvik is just a JVM. Slightly different and less advanced than Oracle's, but still a JVM.


Firefox, the web browser, can stay open for several hours


Yes, and it eats about 10x more memory than it actually uses, if it stays open long enough. Do a simple test: open Firefox. Measure its RAM consumption. Open 10 additional tabs and load some pages. Close those 10 tabs. Measure the RAM consumption again. I'm sure it will be 2x or 3x higher than it was at the beginning. If Firefox were GCed, it could return all the unused RAM to the OS after the next GC run. Firefox can eat more RAM than my whole Eclipse with a few projects eats after one week without a restart. And Eclipse is much larger and more complicated than Firefox. Firefox is a perfect example that manual memory management is not the preferred way to go in a large and complex application.


If any one of those were made in Java, I would have trouble keeping my server stable

This claim is unsupported and wrong. I've been running a Linux + Tomcat + PostgreSQL + Spring + Hibernate stack on a 3 GB server in a company for several months without a restart. And it was running a small, 100k-user MMORPG. We even switched from the Apache + Tomcat tandem to Tomcat alone, because Tomcat was faster. So you'd better change your admin.

<troll mode> BTW: "Stable and fast" and MySQL contradict themselves.</troll mode>


Of course now that C++ supports mark-sweep GC (not mark-compact, thankfully, which would be impossible anyway), things may get more interesting, once the compilers catch up.


Mark-sweep is pretty lame. It is like Java 15 years ago: slow, and expect long pauses.
And shared_ptr is a performance killer on multicore architectures. It is a nice addition, but if you use it too often, you'll get much worse performance than even some ancient mark-sweep GCs. It is funny that there are so many C++ programmers trying to add features from other modern languages to C++. But then, after some time, they usually realize it is better to just use some other modern language itself. It simply doesn't work. You can't program in C++ as if it were Java. It was made for different tasks.
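
To illustrate the concern (a minimal sketch, not a benchmark): every copy of a shared_ptr does an atomic increment/decrement of its reference count, and those atomics are what contend across cores. Passing by const reference, or through a non-owning pointer, avoids the traffic entirely.

#include <memory>

struct Widget { int value = 0; };

// Taking the shared_ptr by copy: the copy and the destructor each do an
// atomic ref-count operation on the shared control block.
int by_copy(std::shared_ptr<Widget> w) { return w->value; }

// Taking it by const reference (or a raw pointer/reference for purely
// non-owning access): no ref-count traffic at all.
int by_ref(const std::shared_ptr<Widget>& w) { return w->value; }

int main() {
    std::shared_ptr<Widget> w = std::make_shared<Widget>();
    by_copy(w);   // atomic traffic
    by_ref(w);    // none
}
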
rapidcoder wrote:
And shared_ptr is a performance killer on multicore architectures. It is a nice addition, but if you use it too often

Saying that shared_ptr is for GC is like saying weak_ptr is for breaking shared_ptr cycles. It can be used for that purpose, but it's not what it's for, and it's probably a design error to use it that way. C++ is all about determinism, including well-defined object ownership.

(I am guilty of abusing shared_ptrs a little myself; GC-style approaches *are* tempting, even if they poison the language.)
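
A minimal sketch of what I mean by well-defined ownership (the Resource type is made up): exactly one owner at any time, ownership transferred explicitly, destruction at a known point.

#include <memory>
#include <utility>

struct Resource { /* acquires something in its ctor, releases it in its dtor */ };

int main() {
    // One owner at a time; the object's lifetime is tied to that owner.
    std::unique_ptr<Resource> owner(new Resource);

    // Ownership is handed over explicitly, never shared by accident.
    std::unique_ptr<Resource> new_owner = std::move(owner);

    // shared_ptr is for the rare case where ownership genuinely must be
    // shared - not a drop-in garbage collector.
}   // new_owner destroys the Resource here, deterministically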

C++ is all about determinism, including well-defined object ownership.


Requiring object ownership to be defined for every object is a huge limitation IMHO, and it often leads to worse performance. E.g. storing the same object in multiple containers or implementing graph structures - you can do it with shared_ptr, with defensive copying, or by prayer. For example, std::strings are generally slower than strings in Java or C#, because they often have to be copied, whereas in Java, thanks to the absence of the ownership problem, you can safely pass references everywhere. Some paradigms, e.g. functional programming, by definition rely on the lack of object ownership and are not possible without GC.
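
For concreteness, a small sketch of the multiple-containers case (the Node type is made up): with shared_ptr the same object sits in both containers without copying, but every stored pointer pays for reference counting, which is exactly the cost I mentioned above.

#include <map>
#include <memory>
#include <string>
#include <vector>

struct Node { std::string name; };

int main() {
    std::shared_ptr<Node> n = std::make_shared<Node>();
    n->name = "a";

    // The same Node is reachable from both containers - no deep copy,
    // but each stored shared_ptr carries ref-counting overhead.
    std::vector<std::shared_ptr<Node>> by_insertion_order;
    by_insertion_order.push_back(n);

    std::map<std::string, std::shared_ptr<Node>> by_name;
    by_name[n->name] = n;

    // The alternative, defensive copying, would duplicate the Node and
    // let the two containers silently go out of sync.
}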

And what determinism are you talking about? If you want hard real-time determinism, you have to go down to the C level or even lower, to assembly, and forget about dynamic allocation entirely, whether manual or automatic. A simple std::vector is just as deterministic as Javolution vectors with a concurrent GC.
closed account (S6k9GNh0)
rapidcoder, I'm the admin of my server. Since you called me out, here's my response.

With multiple Java applications the maximum heap sizes can overlap, and one application may not know it's actually nearing the end of physical memory. This is rather a flaw (which I think was addressed in earlier years), but keeping low maximum heap sizes on multiple Java applications tends to cause large stability problems. This is backed up by every Java developer out there who even partially knows what he's doing. Having a high-user server would be even worse. I don't know of any real-time servers for Java outside of Minecraft, which has got to be the worst example of a server ever, considering high-user hosts buy servers with more than 16 GB of RAM and still sometimes have issues with MEMORY USAGE.

3 GB for the server applications I'm running is actually overkill and ensures basic stability. Google can afford more, and they have much more traffic, so they probably need more - no other reason.

Also, Firefox has a garbage collector.

With multiple Java applications the maximum heap sizes can overlap, and one application may not know it's actually nearing the end of physical memory


So what? Just set the f**ing heap sizes correctly and don't leave them at the defaults or overlapping. At least you can limit heap usage safely - you have much more control over how much RAM your applications consume, while you cannot do this in general with native apps. If you set the memory limit for a Java app low, it will probably run slower, but it continues to operate. If one of your native apps hits the memory limit, it gets killed immediately. Is that the stability you are talking about?
closed account (S6k9GNh0)
Wrong, you have plenty of control over how much RAM your native applications consume. You consume as much as you allocate, no more, and most likely less. The idea of a native application is to use as little memory as possible (which may also be achieved through a GC). You can handle a failed allocation however you want, probably by shutting the program down. You can also generally calculate how much memory a native application will use.

Java applications run out of memory as well. Garbage collection does not equal infinite memory. What happens when you begin to hit the maximum limit and memory can't be released and compacted any more?
rapidcoder wrote:
A simple std::vector is just as deterministic as Javolution vectors with concurrent GC.

It's as deterministic as I want it to be. Prefaulted and locked, it fulfills hard RT requirements on most systems. But in general, I'm talking about deterministic handling of resources, i.e. SBRM/RAII. That is the core market of C++, and it's incompatible with GC.
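
For anyone who hasn't met the acronym, a minimal SBRM/RAII sketch (a mutex here, but the same pattern handles files, sockets, and transactions):

#include <mutex>

std::mutex m;

void update_shared_state() {
    std::lock_guard<std::mutex> guard(m);   // resource acquired here
    // ... work that may return early or throw ...
}   // guard's destructor releases the mutex on every exit path,
    // at a deterministic point - no finalizer, no GC run required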

It's as deterministic as I want it to be. Prefaulted and locked, it fulfills hard RT requirements on most systems.

Dude, Java meets hard RT requirements. The maximum pause time caused by the GC is upper-bounded and depends mostly on memory bandwidth and heap size. Realtime is all about predictability, not speed. A 0.1 second hard realtime guarantee allows me to use a heap several hundred megabytes large. Additionally, if I refrain from allocating long-lived objects, the full GC is never run. Which makes it just as real-time as I want it to be. Region allocation in Java RT makes things even more flexible.


But in general, I'm talking about deterministic handling of resources, i.e. SBRM/RAII.


Which is totally overrated and was never a problem for the Java/.NET/Python/Ruby crowd. I've seen plenty of memory-leaking C/C++ applications, but I have yet to see a resource-leaking Java app. You just don't rely on the GC for managing resources other than memory. You do it by hand, exactly as in C++. RAII is nice, but try-with-resources or try-finally is just as nice and easy.


Wrong, you have plenty of control over how much RAM your native applications consume. You consume as much as you allocate, no more, and most likely less


Oh man, how is that different from GCed apps?


The idea of a native application is to use as little memory as possible


Yep, the idea. The practice is different. Ever heard of heap fragmentation?


You can also generally calculate how much memory a native application will use.


No, contrary to GCed apps, you can't, except in some trivial 10-line hello-world cases. You can only analyze it empirically. If you are unlucky, fragmentation can eat 80% of your memory. And you can't guarantee it won't. And you can't guarantee that when you call delete, the deleted memory is returned to the OS.

When using a GC, I can calculate exactly the maximum live memory size, add 10-50% to it to make the GC overhead negligible, and I'm done. I'm guaranteed it won't OOM. And I'm guaranteed that after the compacting GC runs (and I can force it manually), the memory returns to the OS.
If I don't allocate long-lived objects in Java, Java meets hard RT requirements.
It does? My C++ objects (including strings and vectors and even maps) generally live for tens to hundreds of milliseconds; I don't think that's "long-lived". Can I switch to Java and keep my guarantees? Will it sustain them for months/years of continuous operation as C++ does?

You can certainly do resource management by hand in most languages, but C++ is the only major language where it's automatic. Which is why there are no viable alternatives to C++ in so many industrial applications.

but C++ is the only major language where it's automatic.


It is not any more automatic than a try block in Java or C#. If you forget to delete a RAII object, the resource won't be automatically released. I'd call it semi-automatic. In Java I can also tell the GC to warn me about resources I forgot to release. In C++ there is no such possibility - you end up with a resource leak you don't know about until it crashes (and when it crashes in production, you are often left with a meaningless message instead of a stack trace).


Will it sustain them for months/years of continuous operation as C++ does?


Yes. Deallocating short-lived objects that don't survive a minor GC costs zero in Java. And in case you need long-lived or medium-lived objects, there is JRTS, or the systems from Azul, which don't pause at all. Some serious companies do algorithmic realtime trading systems in Java, and it is realtime enough.
Albatross wrote:
@moorecm
You obviously have never tried Fedora. No offense. :/


I have, and now avoid it, for that very reason.
If you forget to delete a RAII object, the resource won't be automatically released. I'd call it semi-automatic.

I haven't explicitly deleted a C++ object (in the code I'm paid for) even once in the last seven years. It is automatic. Or perhaps a better description would be "implicit".

I haven't explicitly deleted a C++ object (in the code I'm paid for) even once in the last seven years. It is automatic. Or perhaps a better description would be "implicit".


So basically you are either doing trivial things and keeping all your objects on the stack, or you are passing them excessively by value or by some kind of smart pointer. E.g. you can't pass an object up the stack in C++ without either copying it, using explicit heap allocation, or some kind of smart pointer. And smart pointers, like most STL features, are slow compared to raw, hand-optimised C code.
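
For reference, a rough sketch of the options under discussion (the Result type is made up); note that in C++11 the by-value return is typically moved or elided rather than deep-copied.

#include <memory>
#include <string>
#include <vector>

struct Result { std::vector<std::string> lines; };

// Option 1: return by value. In C++11 this is moved (or elided),
// so no deep copy of the vector is made.
Result make_by_value() {
    Result r;
    r.lines.push_back("hello");
    return r;
}

// Option 2: explicit heap allocation wrapped in a smart pointer,
// when the caller needs to own a heap object outright.
std::unique_ptr<Result> make_on_heap() {
    return std::unique_ptr<Result>(new Result());
}

int main() {
    Result r = make_by_value();
    std::unique_ptr<Result> p = make_on_heap();
    (void)r; (void)p;
}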