an elementary C++ trick just dawned on me

Hey folks,

I just discovered the existence of this trick and I am so pleased:

Goal of the trick: do not allocate memory for large objects until they are used for the first time, while at the same time 1) avoiding writing too much code and 2) making it easy to transform your old code to use the new trick.

My solution:
Transform old code such as
int NumberVariables=5; 
Polynomial P1,P2; //here both P1 and P2 initialize their hash tables;
P1.Nullify(NumberVariables);

to
int NumberVariables=5; 
MemorySaving<Polynomial> P1,P2; //we initialize nothing until we actually use
P1.GetElement().Nullify(NumberVariables);//all we do is add an extra .GetElement() 


Implementation:

template <class Object>
class MemorySaving
{
private:
  Object* theValue;
public:
  Object& GetElement()
  { if (this->theValue==0)
      this->theValue=new Object;// you can add some error catching here, erased mine for clarity
    return *(this->theValue);
  };
  const Object& GetElementConst()const{ return *this->theValue;};//ouch discovered this is needed in a painful way
  bool IsZeroPointer(){return this->theValue==0;};
  void FreeMemory(){ delete this->theValue; this->theValue=0;};
  MemorySaving(){this->theValue=0;};
  ~MemorySaving(){this->FreeMemory();};
};



There's one shortcoming of my implementation: it works only for objects that have a no-argument constructor (due to my programming taste however this holds for all of my objects).


[Edit:] Just discovered in a painful way that I need an extra function for the solution to work fine...
Although I've never really had a situation where such a class would be useful, it's an interesting idea.

This seems like a mini smart pointer.

A few things:

1) Not really important, but you don't need a semicolon after function bodies.

2) This is not copyable, so you should forbid copying. If you do something like passing a MemorySaving to a function by value, it will explode because both copies will try to delete the same object.

3) No need to rename 'GetElement' to GetElementConst for the const version. Just give the const and non-const versions the same name. Fewer maintenance issues.

4) Typical smart pointer classes implement the * and -> operators so that you don't need the lengthy function call. Example:

  Object* operator -> () { return &GetElement(); }  // you can make const versions too
  Object& operator * ()  { return GetElement(); }


That way you can just do this:

P1->Nullify(NumberVariables);
// instead of the more verbose:
P1.GetElement().Nullify(NumberVariables);
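Points 2 to 4 above could be combined roughly as follows. This is only a sketch, not the original poster's final class: it assumes a C++11 compiler (for `= delete` and `nullptr`), and `Counter` and `DemoArrowAccess` are hypothetical names made up for the illustration. The const overload here simply asserts that the element exists, which is one possible choice among several.

```cpp
#include <cassert>

template <class Object>
class MemorySaving
{
private:
  Object* theValue;
public:
  MemorySaving() : theValue(nullptr) {}
  ~MemorySaving() { delete theValue; }

  // Point 2: forbid copying, so two wrappers can never delete the same object.
  MemorySaving(const MemorySaving&) = delete;
  MemorySaving& operator=(const MemorySaving&) = delete;

  // Point 3: the const and non-const versions share one name.
  Object& GetElement()
  { if (theValue == nullptr)
      theValue = new Object;      // allocate on first use
    return *theValue;
  }
  const Object& GetElement() const
  { assert(theValue != nullptr);  // a const wrapper cannot allocate lazily
    return *theValue;
  }

  // Point 4: pointer-like access through -> and *.
  Object* operator->() { return &GetElement(); }
  Object& operator*() { return GetElement(); }

  bool IsZeroPointer() const { return theValue == nullptr; }
};

// Hypothetical stand-in for a large object.
struct Counter
{ int value;
  Counter() : value(0) {}
  void Add(int n) { value += n; }
};

int DemoArrowAccess()
{ MemorySaving<Counter> c;                // nothing allocated yet
  c->Add(2);                              // first use allocates the Counter
  (*c).Add(3);
  const MemorySaving<Counter>& view = c;
  return view.GetElement().value;         // picks the const overload
}
```

With copying deleted, passing a MemorySaving by value becomes a compile error instead of a double delete at run time.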
const Object& GetElementConst()const{ return *this->theValue;}
You have to check whether theValue is NULL here, or it will cause some serious problems.
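One way to add such a check is sketched below. It assumes that throwing `std::logic_error` is an acceptable way to report the empty state (an assertion or an error code would work as well); the helper functions `EmptyReadIsRejected` and `WriteThenRead` are hypothetical and exist only to exercise the class.

```cpp
#include <cassert>
#include <stdexcept>

template <class Object>
class MemorySaving
{
private:
  Object* theValue;
  MemorySaving(const MemorySaving&);             // not copyable: declared, never defined
  MemorySaving& operator=(const MemorySaving&);
public:
  MemorySaving() : theValue(0) {}
  ~MemorySaving() { delete theValue; }
  Object& GetElement()
  { if (theValue == 0)
      theValue = new Object;
    return *theValue;
  }
  // Checked const access: report the empty state instead of dereferencing NULL.
  const Object& GetElementConst() const
  { if (theValue == 0)
      throw std::logic_error("MemorySaving: element was never created");
    return *theValue;
  }
};

// Returns true if reading through an empty wrapper is rejected.
bool EmptyReadIsRejected()
{ MemorySaving<int> empty;
  try
  { empty.GetElementConst();
  } catch (const std::logic_error&)
  { return true;
  }
  return false;
}

// Writes through the non-const accessor, then reads through the const one.
int WriteThenRead()
{ MemorySaving<int> m;
  m.GetElement() = 7;
  const MemorySaving<int>& view = m;
  assert(&view.GetElementConst() == &m.GetElement()); // both refer to the same int
  return view.GetElementConst();
}
```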

Thanks for the advice!

@2) I already discovered that on my own... almost caused me some nasty bugs.
@3) Good idea, will do! Feels a bit weird however that compilers allow it.
@4) Cannot judge whether I should do this. The benefit you mentioned is obvious, however it has a couple of minuses:
- It could be important to know whether you are using such a "smart pointer" or a regular one (for example if you are optimizing for speed). Calling .GetElement() every time will quietly remind you and forcefully save you from carbon-based memory failures.
- If another programmer reads your code, he will quietly be informed that you are using a "smart pointer".

Of course, I could implement the overloading. Then I could use my judgement which notation is more suitable... but then again, I've been programming long enough to know not to trust my judgement :)


Although I've never really had a situation where such a class would be useful, it's an interesting idea.


My situation is that I am writing a calculator parser. Originally, each node in the expression tree was a rational number. Then I started adding more and more data structures (polynomials, Weyl algebra elements, universal enveloping algebra elements, .... (the sizes of those rise exponentially with the size of their names ;) ).

Originally I was allocating an empty object of every type in each node. (Those include hash tables, etc.). Then I figured out allocating 1000 nodes was 500MB RAM... you get the idea...
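The effect described here can be illustrated with a small sketch. Everything in it is hypothetical: `HeavyObject` stands in for a type like a polynomial with its hash table, `Node` for an expression-tree node, and the sizes are made up. It counts how many heavy objects actually get constructed when 1000 nodes exist but only a few are touched.

```cpp
#include <cassert>
#include <vector>

template <class Object>
class MemorySaving
{
private:
  Object* theValue;
public:
  MemorySaving() : theValue(nullptr) {}
  ~MemorySaving() { delete theValue; }
  MemorySaving(const MemorySaving&) = delete;
  MemorySaving& operator=(const MemorySaving&) = delete;
  Object& GetElement()
  { if (theValue == nullptr)
      theValue = new Object;   // allocate on first use only
    return *theValue;
  }
};

// Hypothetical heavy type: allocates a table as soon as it is constructed.
struct HeavyObject
{ static int constructed;
  std::vector<int> table;
  HeavyObject() : table(1000) { ++constructed; }
};
int HeavyObject::constructed = 0;

// Hypothetical node: a cheap value plus two lazily created heavy members.
struct Node
{ double rationalValue;
  MemorySaving<HeavyObject> polynomial;
  MemorySaving<HeavyObject> weylElement;
  Node() : rationalValue(0) {}
};

// Creates 1000 nodes, uses the polynomial in the first nodesTouched of them,
// and returns how many HeavyObject instances were constructed along the way.
int HeavyObjectsBuiltFor1000Nodes(int nodesTouched)
{ assert(nodesTouched >= 0 && nodesTouched <= 1000);
  int before = HeavyObject::constructed;
  std::vector<Node> nodes(1000);
  for (int i = 0; i < nodesTouched; ++i)
    nodes[i].polynomial.GetElement().table[0] = i;
  return HeavyObject::constructed - before;
}
```

Without the wrapper, every node would construct both heavy members up front, i.e. 2000 tables for 1000 nodes regardless of how many are ever used.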

Looks like a perfect way to strip thread-safe objects from their thread-safety.
@rapidcoder: How is this code less thread safe than using bare pointers?

As the new - delete code is shared between threads (as is the program's memory), if you are using new/delete you lose thread safety automatically.

So, I generally agree with your comment, except that calling what I do "stripping" is misrepresenting my efforts: the new classes have very different names, so there is no ground for confusion.

The lack of thread safety will be seen from afar: better *know* that your code is non-thread safe, than think wrongly it is...

@Bazzy:

const Object& GetElementConst()const{ return *this->theValue;}
is needed if you pass your data structure as const.

If you call .GetElement() on a const object you are not allowed to modify internal structure and create your "smart pointer".

So, using .GetElementConst() on a const object already is tricky business.

Returning *this->theValue as Object& should be no problem: I think it works similarly to:
 
const Object* GetElementConst()const{ return this->theValue;}

If you try to dereference a NULL pointer, you'll get a segmentation fault

As the new - delete code is shared between threads (as is the program's memory), if you are using new/delete you lose thread safety automatically.


They are not guaranteed by the standard to be thread safe, but they usually are in a thread-aware program (just link your program with POSIX threads, and they immediately become thread-safe). Even if not, they can easily be made thread-safe by using the LD_PRELOAD trick.


I cannot imagine that you can make such a code thread-safe (and that is not the goal).

However, if we assume that new/delete are thread safe (I did use Windows before I switched to Ubuntu), do you have a thread-safe solution?
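For what it's worth, one possible answer is sketched below, assuming a C++11 standard library with `<mutex>` and `<thread>` (on some older toolchains this also needs linking with `-pthread`). The class name `ThreadSafeMemorySaving`, the `Tracked` type and the demo function are hypothetical. Note that this makes only the lazy creation safe: once a reference is handed out, concurrent use of the Object itself is still the caller's problem.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

template <class Object>
class ThreadSafeMemorySaving
{
private:
  Object* theValue;
  std::once_flag created;
public:
  ThreadSafeMemorySaving() : theValue(nullptr) {}
  ~ThreadSafeMemorySaving() { delete theValue; }
  ThreadSafeMemorySaving(const ThreadSafeMemorySaving&) = delete;
  ThreadSafeMemorySaving& operator=(const ThreadSafeMemorySaving&) = delete;

  Object& GetElement()
  { // std::call_once runs the lambda exactly once, even if many threads race here.
    std::call_once(created, [this] { theValue = new Object; });
    assert(theValue != nullptr);  // call_once returned normally, so the object exists
    return *theValue;
  }
};

// Hypothetical type that counts its own constructions.
struct Tracked
{ static std::atomic<int> constructed;
  Tracked() { ++constructed; }
};
std::atomic<int> Tracked::constructed(0);

// Lets eight threads race for the first use and returns how many
// Tracked objects were constructed as a result.
int ConstructionsWithEightThreads()
{ int before = Tracked::constructed.load();
  ThreadSafeMemorySaving<Tracked> shared;
  std::vector<std::thread> workers;
  for (int i = 0; i < 8; ++i)
    workers.emplace_back([&shared] { shared.GetElement(); });
  for (std::thread& w : workers)
    w.join();
  return Tracked::constructed.load() - before;
}
```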

@Bazzy

If you try to dereference a NULL pointer, you'll get a segmentation fault


Since the return type is Object& I don't think that the compiler should make a pointer dereference when seeing
return *this->theValue; - as far as I understand C++ syntax (which is not too far) the return type is a pointer, so no dereference operation should be carried out - I expect
(&* this->theValue);
to be transformed to
this->theValue;,
rather than to
&(*(this->theValue));
(the second transformation will indeed cause a segmentation fault as you say).

[Edit]: I just tested the following code:
MemorySaving<int> tempI;
if( &tempI.GetElementConst()==0)
  std::cout<< "all is good";

and it ran without any hiccups. That is, with the gcc compiler on Ubuntu.
Object& is a reference to Object.
*this->theValue is dereferencing this->theValue

&tempI.GetElementConst() is getting the address of theValue, which is NULL ( 0 ).

Try the following code:
MemorySaving<int> tempI;
std::cout << tempI.GetElementConst();
It uses the dereferenced NULL, causing a segmentation fault.
I think that the dereferencing in your example is happening after the return statement. If you try
MemorySaving<int> tempI;
std::cout << &(tempI.GetElementConst());

it works just fine.

[Edit]: However, the fact that it works with the brackets around tempI.GetElementConst() is a bit puzzling to me. Does it not contradict a statement I made above?

&(tempI.GetElementConst()) or &tempI.GetElementConst() ( which are the same ) will return the address of the reference returned by GetElementConst, which is NULL. If you try to use the reference itself, you'll get the segmentation fault because you dereference NULL, that happens in *this->theValue; where * is the dereference operator and this->theValue is NULL
http://www.cplusplus.com/doc/tutorial/pointers/
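A side note on the tests above: creating a reference from a dereferenced null pointer is undefined behavior in C++, so the fact that taking its address happened to print 0 with gcc is not something to rely on. If the caller needs to ask "is there a value yet?", returning a pointer states that directly. The sketch below assumes a hypothetical accessor named `GetElementOrNull` and two hypothetical helpers, `ValueOr` and `DemoValueOr`.

```cpp
#include <cassert>

template <class Object>
class MemorySaving
{
private:
  Object* theValue;
  MemorySaving(const MemorySaving&);             // not copyable: declared, never defined
  MemorySaving& operator=(const MemorySaving&);
public:
  MemorySaving() : theValue(0) {}
  ~MemorySaving() { delete theValue; }
  Object& GetElement()
  { if (theValue == 0)
      theValue = new Object;
    return *theValue;
  }
  // Hypothetical accessor: hands back the pointer itself, which is NULL
  // while nothing has been created. No dereference happens here.
  const Object* GetElementOrNull() const { return theValue; }
};

// Reads the value if it exists, otherwise returns the fallback.
int ValueOr(const MemorySaving<int>& m, int fallback)
{ const int* p = m.GetElementOrNull();
  return p != 0 ? *p : fallback;
}

int DemoValueOr()
{ MemorySaving<int> m;
  assert(m.GetElementOrNull() == 0);  // nothing allocated yet
  int first = ValueOr(m, -1);         // falls back, since m is empty
  m.GetElement() = 42;                // first real use allocates
  return first + ValueOr(m, -1);      // -1 + 42
}
```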
My situation is that I am writing a calculator parser. Originally, each node in the expression tree was a rational number. Then I started adding more and more data structures (polynomials, Weyl algebra elements, universal enveloping algebra elements, .... (the sizes of those rise exponentially with the size of their names ;) ).

Originally I was allocating an empty object of every type in each node. (Those include hash tables, etc.). Then I figured out allocating 1000 nodes was 500MB RAM... you get the idea...


Why would you be building nodes before you have something to put in them?

I mean... what's the point of putting empty objects in the expression tree?
Well, I allocate in advance an upper bound for the number of nodes that I will need.

Even if I didn't do that, it would be all the same: a typical expression could easily have 1000 nodes, so I do need to save up on RAM.

By the way, managing a tree of expressions is by no means trivial. For example, a rational number also happens to be a polynomial, and a polynomial is a Weyl algebra element. So, you often need to convert an expression within a node from one type to another, which means in effect you have to do lots of new/delete every time you make a type conversion.

So, I originally implemented the parser the way I did to minimize memory management (empty objects = no new/delete worries).

This post in fact describes my retrofit to fix the unexpected memory consumption (it wasn't expected because in the beginning I did not know I would need to implement such complicated data types).