STL containers (e.g. vectors) & Boehm GC

Nov 13, 2008 at 7:53pm
Hello All

On my Linux/Debian/Sid/AMD64 system I would like to use some STL containers (like vectors or maps) allocated using the conservative Boehm garbage collector (i.e. using GC_malloc). http://www.hpl.hp.com/personal/Hans_Boehm/gc/

The point is that I want all internal data used by the container (e.g. the vector or the map) to also be allocated through GC_malloc.

I want to do that so that all my data is garbage collected and I do not have to bother delete-ing it manually.

I suspect that the allocator template argument of vector or map should be relevant, but I am not sure.

How can I achieve all that? Does anyone have an example of an allocator for Boehm's GC?

Boehm's GC is designed for that kind of stuff, and is rumored to work quite well in practice.

Regards.

--
Basile Starynkevitch --- http://starynkevitch.net/Basile/
Last edited on Nov 13, 2008 at 7:55pm
Nov 13, 2008 at 7:57pm
I believe that STL containers already handle allocation and deallocation for you...unless I am misunderstanding your question.
Nov 13, 2008 at 8:09pm
Since new ultimately calls malloc() and delete ultimately calls free(), you should be able
to replace malloc() and free() with the GC version.

man malloc_hook

The man page gives you fairly good information on how to go about doing that.

You can then let all your containers use their default allocators, knowing that they will use your new malloc() function.
Nov 13, 2008 at 8:13pm
The point is that any conservative GC (at least Boehm's one) does not call any destructors; it only frees dead memory.

My concern is more precisely this: if I allocate a vector v (using new(GC) for instance, which calls GC_malloc), it will internally call malloc (i.e. the ordinary ::new operator), not GC_malloc, for some internal data d (belonging to and pointed to by data inside v). When Boehm's GC later frees my v, the internally malloc-ated data d would not be freed.

Boehm's GC essentially disallows a GC_malloc-ed zone from pointing to an owned malloc-ed zone (because of course Boehm's GC does not call any C++ destructor implicitly; it implicitly calls only the equivalent of GC_free, not of delete).

So I want all internal allocation to be done through GC_malloc, because no destructors would be called.

In other words, I want to use STL with Boehm's GC without ever coding any delete.

It could happen that this is not achievable; this would mean that I either need to code all my own template classes for vectors and maps, or avoid C++. I definitely cannot live without real garbage collection (but I am able to code a GC when needed).

I am actually not sure I understand the role of allocators inside the STL.

--
Basile Starynkevitch
Last edited on Nov 13, 2008 at 8:18pm
Nov 13, 2008 at 8:22pm
malloc_hook is not entirely satisfactory. It modifies every malloc, including those done by the deepest routines in libc or libstdc++.

I only need to change the allocation inside STL containers (and only in some of my template instantiations). For instance, I want iostream-s or stdio FILE-s to still use malloc, not GC_malloc, but I want to have a vector of pointers to MyClass (which is GC_malloc-ed) such that the vector and any internal data, including the actual array, is GC_malloc-ed!

Thanks.

--
Basile Starynkevitch
Last edited on Nov 13, 2008 at 8:27pm
Nov 13, 2008 at 10:04pm
Allocators encapsulate memory management; you could write your own allocator to use something different from new. However, I don't quite understand what you mean by
"because no destructors would be called."
In what situation is a destructor called without you wanting it?

Also, don't confuse the heap with the free store. Regarding
"it will internally call malloc (ie ordinary ::new operator)":
this assumption holds for most implementations, but it is certainly not a C++ requirement. Also bear in mind that memory allocated by 'new' has to be freed via 'delete', while memory allocated by 'malloc' has to be returned to the system by 'free' (for exactly this reason: the heap and the free store need not be the same).
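
To make the "write your own allocator" suggestion concrete, here is a minimal sketch. It assumes C++11 or later (a C++03 allocator needs extra members such as rebind, construct and destroy). MallocAllocator and sum_with_custom_allocator are hypothetical names introduced for illustration, and std::malloc/std::free merely stand in for whatever memory source one prefers; the sketch does not attempt to call GC_malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator: takes raw storage from std::malloc
// instead of the default ::operator new.
template <class T>
struct MallocAllocator {
    using value_type = T;

    MallocAllocator() = default;
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        // Guard against overflow in n * sizeof(T).
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr && n != 0) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) noexcept { return false; }

// Usage: the second template argument of std::vector selects the allocator.
int sum_with_custom_allocator() {
    std::vector<int, MallocAllocator<int>> v;
    for (int i = 1; i <= 4; ++i) v.push_back(i);
    assert(v.size() == 4);
    int total = 0;
    for (int x : v) total += x;
    return total;  // 1 + 2 + 3 + 4
}
```

The container never names malloc or new itself; it only calls allocate and deallocate, which is why swapping the memory source is a local change.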
Nov 14, 2008 at 7:09am
More precisely, my question is: what kind of allocations does the allocator template argument of e.g. vector<> handle? It seems that it deals with allocation of the elements in the vector, not with allocation of the vector internals, such as the vector's internal array.

My point is that I wanted to use STL with GC_malloc only on data which does not use non-memory resources (this typically excludes vector<ofstream> for example). In that case, the only useful resource is memory, and I want it to be handled by Boehm's GC.

Recall that Boehm's GC is a conservative collector which marks all reachable memory zones (reachable from the stack and global data, by conservatively handling every word as a potential pointer) and frees (essentially through GC_free) all memory zones which are not live (i.e. not marked as such in the previous marking step). This can happen at any call to GC_malloc, etc.

And indeed, for simplicity, I assumed that operator ::new called the system malloc. This is true on Linux but is an implementation detail handled by Boehm's GC.

So if my vector (instantiated template) used only GC_malloc internally (i.e. new(GC) but not ::new) I would be happy. I won't have to call delete, and the memory resource is managed by Boehm's collector. Trust me, it works well. Of course, Boehm's GC does not call any C++ destructor (because it does not call any delete C++ operator, only GC_free!), but that is OK for my use: it only automatically reclaims memory which is not live anymore (for a conservative approximation of live).

However, if there is no way of instantiating the vector<> or map<> template such that every internal allocation (inside STL code for vector) is using GC_malloc (or new(GC) which wraps it) then I cannot use C++.
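
For what it's worth, the allocator argument does cover a container's internal storage: std::vector takes its element array from it, and std::map rebinds it to its internal node type. The sketch below illustrates this under some assumptions: C++11 or later, and CountingAllocator, g_blocks_requested, blocks_for_vector and blocks_for_map are hypothetical names made up for the example. std::malloc stands in for GC_malloc so the code does not depend on the collector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <new>
#include <utility>
#include <vector>

// Hypothetical counter: blocks requested through CountingAllocator.
static std::size_t g_blocks_requested = 0;

template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_blocks_requested;  // every internal allocation passes through here
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = std::malloc(n * sizeof(T));  // stand-in for GC_malloc
        if (p == nullptr && n != 0) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

struct MyObject {};

// Blocks requested by a vector of pointers: its internal array.
std::size_t blocks_for_vector() {
    const std::size_t before = g_blocks_requested;
    std::vector<MyObject*, CountingAllocator<MyObject*>> v;
    v.reserve(8);          // asks the allocator for the internal array
    v.push_back(nullptr);  // fits in the reserved array
    return g_blocks_requested - before;
}

// Blocks requested by a map: its tree nodes, via the rebound allocator.
std::size_t blocks_for_map() {
    const std::size_t before = g_blocks_requested;
    typedef CountingAllocator<std::pair<const int, MyObject*>> PairAlloc;
    std::map<int, MyObject*, std::less<int>, PairAlloc> m;
    m[1] = nullptr;
    m[2] = nullptr;
    assert(m.size() == 2);
    return g_blocks_requested - before;
}
```

What the allocator does not cover is the container object itself (the small header holding the pointers); that storage is chosen by whoever creates the container, for example with new(GC).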

Thanks for reading.

--
Basile Starynkevitch http://starynkevitch.net/Basile/
Last edited on Nov 14, 2008 at 7:12am
Nov 14, 2008 at 8:22am
If you free memory of a class without calling the destructor first, you invoke undefined behavior. You don't want to do that. As soon as a vector goes out of scope, the destructors are called (you still haven't said why this is a problem) and memory is freed automatically. So what exactly does vector do that you don't want done? Perhaps you should include a short piece of code demonstrating the effect you (don't) want to have.
Nov 14, 2008 at 3:52pm
I would just want something like


#include <vector>

class MyObject;
// MyObject instances are only allocated with new(GC) MyObject and are never explicitly delete-d

typedef std::vector<MyObject*> MyVector;
// MyVector instances are only allocated with new(GC) MyVector and are never explicitly delete-d


The point is that I never want to code any call to delete in my code. And I never allocate any instance of MyObject or MyVector on the stack, only using new(GC) which calls GC_malloc.
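
In the spirit of the snippet above, here is a hedged sketch of "allocate everything from one heap and never delete". Because it has to compile without the collector, a small static arena (demo::gc_malloc_stand_in, a hypothetical name) plays the role of GC_malloc, and placement new plays the role of new(GC); ArenaAllocator, demo::in_arena and make_vector_in_arena are likewise made up for illustration, and C++11 or later is assumed. Boehm's distribution also ships a gc_allocator.h header with a gc_allocator template meant for STL containers, which would be the natural thing to try in place of ArenaAllocator; the sketch does not use it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>
#include <vector>

namespace demo {
// Hypothetical stand-in for GC_malloc: a bump allocator over a static arena.
// Nothing is ever released here; a real collector would reclaim dead blocks.
alignas(std::max_align_t) unsigned char arena[1 << 16];
std::size_t used = 0;

void* gc_malloc_stand_in(std::size_t bytes) {
    const std::size_t align = alignof(std::max_align_t);
    if (bytes == 0) bytes = 1;
    if (bytes > sizeof(arena)) throw std::bad_alloc();
    const std::size_t rounded = (bytes + align - 1) / align * align;
    if (rounded > sizeof(arena) - used) throw std::bad_alloc();
    void* p = arena + used;
    used += rounded;
    return p;
}

// True if p points into the arena.
bool in_arena(const void* p) {
    const unsigned char* c = static_cast<const unsigned char*>(p);
    std::less<const unsigned char*> lt;
    return !lt(c, arena) && lt(c, arena + sizeof(arena));
}
}  // namespace demo

template <class T>
struct ArenaAllocator {
    using value_type = T;

    ArenaAllocator() = default;
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        return static_cast<T*>(demo::gc_malloc_stand_in(n * sizeof(T)));
    }

    // Deliberately a no-op: reclamation is left to the (stand-in) collector.
    void deallocate(T*, std::size_t) noexcept {}
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const ArenaAllocator<T>&, const ArenaAllocator<U>&) noexcept { return false; }

class MyObject {
public:
    int id = 0;
};

typedef std::vector<MyObject*, ArenaAllocator<MyObject*>> MyVector;

// The vector header, its internal array and the element all come from
// the same stand-in heap, and no delete appears anywhere.
MyVector* make_vector_in_arena() {
    void* header = demo::gc_malloc_stand_in(sizeof(MyVector));
    MyVector* v = new (header) MyVector();  // stands in for new(GC) MyVector
    void* obj = demo::gc_malloc_stand_in(sizeof(MyObject));
    v->push_back(new (obj) MyObject());     // stands in for new(GC) MyObject
    return v;
}
```

Since no destructor ever runs in this scheme, it only suits element types whose sole resource is memory, which matches the restriction stated earlier in the thread.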

I'm surprised you don't understand my concerns. It seems like you never coded anything in any garbage collected language.

My concern is that Boehm's collector won't free any malloc-ated memory (on my system, Linux, malloc is what ::new uses for memory allocation) pointed to by a zone which has been GC_malloc-ed. (Of course, when needed it would GC_free a GC_malloc-ed zone pointed to by another GC_malloc-ed zone.)

In a short sentence, I want to use C++ like Java or Lisp regarding memory allocation: just call new(GC) and never code any delete in my code. Of course, I admit the limitations of GC_malloc, mostly the need to have pointers to (or near) the start of allocated zones, which prohibits any multiple inheritance in C++ (and other internal pointers), and objects inside other objects (I would use pointers in that case).

The initial post contains a pointer to Boehm's collector. And there is a lot of literature on garbage collection. I suggest Jones & Lins' book.


Regards.

Last edited on Nov 14, 2008 at 4:12pm