Improving Performance when working with vector STL containers

Hello everyone,

I'm working on some code and I need to improve the performance of some functions that deal with vector objects.

Up to now I found there's a big difference between passing a complete vector object as argument to a given function (which is called several times in my code) or passing its pointer. The prototypes could look like:

void function1(vector <MyClass> p); (slow performance)
void function2(vector <MyClass> *p); (fast performance)

and I call these functions like

function1(p);
function2(&p);

I guess that in the first case I'm passing all the vector object to the function, while in the second option I'm just passing the pointer to its first element. Am I right?

Now I'm wondering if the same applies when working with vectors of pointers. Is there any difference in performance between declaring these two functions?

void function1(vector <MyClass*> p);
void function2(vector <MyClass*> *p);

Does this depend on how much memory is required for MyClass object?

Thanks in advance.
When passing anything by value, a copy of it has to be made.

For containers, this likely means copying the container itself as well as everything inside the container. This can be very expensive.

Typically you want to pass complex objects like containers, strings, etc by const reference, rather than by value:

void function1(const vector<MyClass>& p);  // note the &

// or this if a vector of pointers:
void function1(const vector<MyClass*>& p);  // again note the & 
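To make the cost visible, here is a minimal sketch that counts element copies. `Tracked`, `size_by_value` and the other names are made up for illustration and are not from the original post; the count assumes a standard-conforming `std::vector` copy constructor, which copies each element once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    int value = 0;
    Tracked() = default;
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked& operator=(const Tracked&) = default;
};
int Tracked::copies = 0;

// By value: the caller's vector is copied into p, element by element.
std::size_t size_by_value(std::vector<Tracked> p) { return p.size(); }

// By const reference: p is just another name for the caller's vector.
std::size_t size_by_const_ref(const std::vector<Tracked>& p) { return p.size(); }

// Returns the number of element copies caused by one by-value call.
int copies_when_passed_by_value(std::size_t n) {
    std::vector<Tracked> v(n);
    Tracked::copies = 0;
    size_by_value(v);
    return Tracked::copies;
}

// Returns the number of element copies caused by one const-reference call.
int copies_when_passed_by_const_ref(std::size_t n) {
    std::vector<Tracked> v(n);
    Tracked::copies = 0;
    size_by_const_ref(v);
    return Tracked::copies;
}
```

With a vector of 1000 elements, the by-value call should report 1000 copies and the const-reference call none, which is the difference the original poster measured as "slow" versus "fast".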
You really should start with doing what's right, then optimising that if it's not fast enough.

The C Programming Language can only pass parameters to a function by value. To achieve the same effect as pass by reference you explicitly pass the address of a variable by value.

C++ is different. As well as pass by value, there's also pass by reference. If you want the parameter changed, you pass by reference (or pass the address by value). If you don't want the parameter modified, you pass objects by const reference.

Clearly it's cheaper to copy a vector of pointers than it is to copy a vector of objects with non-trivial copy constructors. But it sounds like you shouldn't be copying the vector in the first place.
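As a rough illustration of what copying a vector of pointers does and does not copy, here is a small sketch. `Item` and `bump_all` are hypothetical names, not part of the original question. The copy duplicates the pointers only, so the cost does not depend on the size of the pointed-to objects, and the copy still refers to the caller's objects.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type standing in for MyClass.
struct Item {
    int value;
};

// The vector of pointers is passed by value: the pointers are copied,
// the Items they point to are not.
void bump_all(std::vector<Item*> p) {
    for (Item* item : p) {
        ++item->value;   // modifies the caller's Items through the copied pointers
    }
    p.clear();           // only empties the local copy of the vector
}

// Returns true if the caller's vector is untouched but its Items were modified.
bool copy_is_shallow() {
    Item a{1};
    Item b{2};
    std::vector<Item*> v{&a, &b};
    bump_all(v);
    return v.size() == 2 && a.value == 2 && b.value == 3;
}
```

So a by-value copy here is cheaper than copying the objects themselves, but it is still an allocation plus one pointer copy per element that a const reference would avoid.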
So, clearly I do not want to pass it by value since it is too costly, and I've understood (in fact it was quite straightforward) that the same applies to vectors of pointers. Thanks.

On the other hand, I had noticed the use of "const", but its real meaning when working with objects wasn't really clear to me... so let me see if I understood correctly:

If I do use something like:

void function1(const vector<MyClass>& p);

I'm passing p without copying its value but making its memory address constant. So that means I can modify the value of p inside the function, but its memory address cannot be modified. Right?

But what I was proposing...

void function2(vector <MyClass> *p)

I'm still passing its reference, but it is not safe because its memory address could be modified (and one is usually interested in that).

Is that right?

No, the const qualifier just means that you cannot modify the object. This is a guarantee to the caller and allows passing of temporary objects as a parameter.
An object's memory address never changes after it has been created.
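A small sketch of those two points, with hypothetical function names: a const reference parameter refuses modification at compile time, accepts a temporary, and refers to the very same object (same address) as the caller's variable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only access: the function promises not to modify the caller's vector.
std::size_t count_positive(const std::vector<int>& p) {
    // p.push_back(1);   // would not compile: p is const
    std::size_t n = 0;
    for (int x : p) {
        if (x > 0) ++n;
    }
    return n;
}

// True if the reference parameter refers to the object at the given address.
bool same_object(const std::vector<int>& p, const std::vector<int>* address) {
    return &p == address;
}

// Inside the function, p is the caller's object itself, not a copy.
bool refers_to_callers_object() {
    std::vector<int> v{1, 2, 3};
    return same_object(v, &v);
}
```

A call such as `count_positive(std::vector<int>{-1, 2, 3})` compiles because a temporary can bind to a const reference; with a non-const `vector<int>&` parameter the same call would be rejected.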
Thanks again,

So if I do not want to modify the object the correct way to pass it to a function is:

void function1(const vector<MyClass>& p);

rather than:

void function2(vector <MyClass> p);

cause in this last case a copy of the whole object is created.

But what if I do want to modify the object inside the function? Then I think I have 2 options (equivalent?):

option1:
void function1(vector <MyClass> &p);

or option2:
void function2(vector <MyClass> *p);

but then I have to dereference the object every time I use it inside the function, for example, to access the "i" element:
(*p)[i]

kbw, is that what you mean when passing by reference (option1) or pass by the address by value (option2)?

Are they really equivalent? So, for me option1 looks more "clean" than option2 but maybe there is something I'm missing?
They are the same, except that with option 2 (like you said) you have to dereference the pointer before you use it.
There are several reasons to choose Option 2. The most common would be that you may not have an object at all, and so you can pass a NULL pointer. There is no NULL reference. If there is no reason to require a pointer, I would go with the reference.
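For example, a sketch along these lines (the function names are invented for illustration) shows a pointer parameter where "no vector" is a valid input, next to a reference parameter that always requires an object:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pointer parameter: nullptr means "no vector supplied", so it must be checked.
std::size_t count_or_zero(const std::vector<int>* p) {
    if (p == nullptr) {
        return 0;
    }
    return p->size();
}

// Reference parameter: the caller must always supply an object.
void append_zero(std::vector<int>& p) {
    p.push_back(0);
}

// Modifies a vector through the reference version, then reads it through the pointer version.
std::size_t size_after_append() {
    std::vector<int> v{1, 2};
    append_zero(v);
    return count_or_zero(&v);
}
```

`count_or_zero(nullptr)` is a legal call, whereas there is no way to call `append_zero` without an actual vector.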

but then I have to dereference the object every time I use it inside the function, for example, to access the "i" element:
(*p)[i]


Once you have determined that the pointer is valid you can then create a reference from it

void func(type *p) {
  // check that p is valid before dereferencing it
  if (!p) return;
  type &t = *p;

  // now you can use t. or t[] instead of p-> or (*p)[]
}
closed account (S6k9GNh0)
er, that's rather pointless... Generally you wouldn't do that. You'd either require the caller to pass a reference himself, or use the pointer directly, not convert it into a reference... I've seen people who have said *always pass by reference, passing by pointers is deprecated*. I couldn't feel more disagreement but I can see the point of view.
It's not pointless at all. As exiledAussie says, there are some instances where it is better to use pointers over references. Let's assume we have one of these situations. Do you know what the syntax is like for calling operators on pointers to objects? It's bloody awful.
closed account (S6k9GNh0)
Why would you pass a pointer instead of a reference if you're just going to cast it to a reference? Generally, it's your job to make sure the data is good before passing to a function that relies on that data. A pointer can be checked in the function but this often doesn't happen in C and they leave it up to you to make sure the data is valid.

http://ideone.com/IPnnL

http://ideone.com/UnYsn
Why would you pass a pointer instead of a reference if you're just going to cast it to a reference?


You will be casting to a reference to use it, after you have done whatever you need to with it as a pointer. If you don't need it as a pointer, then pass a reference.

Two reasons for passing via pointer off the top of my head

1) The function accepts a no-data, which is different to a default constructed object.
2) The function wants to take over (or be involved in) the memory management of the argument.

Given that you have either of these situations, or any other where passing via pointer is better, you may still want to cast to a reference to use the object.

I admit I had neglected to consider (*x)[i], in my head I was thinking x->operator[](y), which is why I think the syntax is awful, and almost always cast to reference. However, even considering your syntax, I still think casting to a reference is better as it is much much easier to read without having the (* ) everywhere, but that is just preference I guess.
Thanks to all of you! It's great to see such a discussion! :)

I have to add something else about reference/pointer syntax.

In my case, the vector container stores objects which have some pointer as a member. Then, in this case I've realized that the syntax can get really awful...

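class MyClass {
public:
      type *p;   // pointer member, public so function1 can reach it
};
...
void function1(vector<MyClass> *m);
...
vector<MyClass> m;
function1(&m);   // pass the address of the vector
...
void function1(vector<MyClass> *m) {
...
(*(*m)[i].p)
...
}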


I know I could dereference the object first, but this is just to illustrate how messy it can get!
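For comparison, here is a sketch of the same access written both ways. `Holder` is a hypothetical stand-in for MyClass with an `int*` member, and both function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MyClass: an object holding a pointer member.
struct Holder {
    int* p;
};

// Pointer parameter: the vector must be dereferenced before indexing.
void set_via_pointer(std::vector<Holder>* m, std::size_t i, int value) {
    *(*m)[i].p = value;
}

// Reference parameter: only the pointer member needs dereferencing.
void set_via_reference(std::vector<Holder>& m, std::size_t i, int value) {
    *m[i].p = value;
}

// Both forms write through the same pointer member to the same int.
bool both_forms_write_the_same_target() {
    int target = 0;
    std::vector<Holder> m{Holder{&target}};
    set_via_pointer(&m, 0, 1);
    bool first_ok = (target == 1);
    set_via_reference(m, 0, 2);
    return first_ok && target == 2;
}
```

The two functions do the same thing; the reference version just drops one level of `(* )` from every access.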

I've decided (for now :) ) to pass it by reference. It could be interesting to keep these situations in mind:
1) The function accepts a no-data, which is different to a default constructed object.
2) The function wants to take over (or be involved in) the memory management of the argument.

Given that you have either of these situations, or any other where passing via pointer is better, you may still want to cast to a reference to use the object.


...but I do agree with computerequip...

Why would you pass a pointer instead of a reference if you're just going to cast it to a reference? Generally, it's your job to make sure the data is good before passing to a function that relies on that data. A pointer can be checked in the function but this often doesn't happen in C and they leave it up to you to make sure the data is valid.
Effective STL by Scott Meyers describes some more good ways to improve your STL code.


Just for Information:

For serious performance work you may want to use a memory allocator other than the default malloc from libc. The STL containers go through their own allocator interface, but for production applications a custom allocator can give a good boost.

http://developers.sun.com/solaris/articles/multiproc/multiproc.html
http://developers.sun.com/solaris/articles/solaris_memory.html

- read the last section.

On multi-processor machines with multiple threads, these allocators can give 100% more performance compared to malloc, because malloc's locking is not that well optimized.
Thanks a lot!

The Hoard memory allocator seems to be a sweet and easy option to try (once I finish the application I'm working on).

Has anyone used it before?
I'm developing with XCode IDE (Mac). Has anyone tested the Hoard on it?
Hoard is commercial.. be careful.

It gives a good boost over libc on Solaris if the application has many threads and is designed properly. But Solaris 9 and up come with libumem, which nearly matches Hoard, so there is a free option. Don't expect it to do much if you have a single-threaded application.

It compiles well on VS 2010, but I didn't test it on Mac. I don't have one. :(