Function returns an object

Hi, what happens at the return of this piece of code?

vector<int> MyFunction(...){
  vector<int> vect;
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
  return vect;
}

//Calling my MyFunction
auto returned_vect = MyFunction(...);


At the return, is vect copied into returned_vect? If it is, how can I avoid the copy?

Thanks!

What you're referring to is whether copy elision is happening.

It's mandatory in some cases, and an optional compiler optimization in other cases.
https://en.cppreference.com/w/cpp/language/copy_elision

A language expert can probably give a better description, but that article above explains it in depth.

I think in this case, it's covered by
Non-mandatory elision of copy/move (since C++11) operations
...
In a return statement, when the operand is the name of a non-volatile object with automatic storage duration, which isn't a function parameter or a catch clause parameter, and which is of the same class type (ignoring cv-qualification) as the function return type. This variant of copy elision is known as NRVO, "named return value optimization".

This NRVO is a compiler optimization and not mandatory. But I might be wrong, and I encourage any one else to also reply.

Studying the assembly produced can also help demystify things.

You can also do things like make the constructor have side-effects...
// Example program
#include <iostream>
#include <string>
#include <vector>

using std::vector;

struct Thing {
    Thing()
    {
        std::cout << ".\n";
    }
};

vector<Thing> MyFunction(){
  vector<Thing> vect;
  
  for (int i = 0; i < 5; i++)
  {
    vect.push_back({});   
  }

  return vect;
}


int main()
{
    //Calling my MyFunction
    auto returned_vect = MyFunction();
}

The program only prints 5 dots, even with optimization turned off, so it's not doing any unnecessary copying.
A simple way to be sure that no copy is going to happen is passing a reference:
(note the & before the parameter name!)
void MyFunction(vector<int> &vect, ...) {
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
}

//Calling my MyFunction
vector<int> vect;
MyFunction(vect, ...);


...either that, or have the function allocate a vector on the heap and return a pointer:
vector<int> *MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return vect;
}

//Calling my MyFunction
vector<int> *vect = MyFunction(...);

// Don't leak the memory!
delete vect;
@kigar64551

Just my 2 cents' worth, I don't wish to argue per se: you sound way more experienced/knowledgeable than me, a back yard basher :+) Welcome to the forum btw, I have enjoyed reading your posts so far :+)

If not using copy elision, I would much prefer returning a reference using the out parameter as in your first code snippet.

I am not much of a fan of using new with a vector, which already puts its data on the heap. One reason is that std::vector is an RAII container: using new kind of negates that.

In the past I have said that C++ compilers seem to implement references as const pointers wrapped up in some TMP, no one disagreed then, that might mean the statement was correct. I am yet to see an example of how to do it otherwise. I am aware that the C++ standard does not specify how to implement things, that is left up to the vendors. So using a reference and a pointer as per these two examples might be the same thing.

I wonder which compilers don't implement NRVO, it seems that g++ and clang++ do; I am not sure* that I can test the MS one, I am a Linux guy :+) * Not sure if it's possible to install MS compiler into Visual Studio Code on Linux?

The other thing with this is it depends on which standard one is compiling against. I thought I would mention that, because I always use the latest versions of everything, I sometimes forget about the poor souls still stuck with c++14, or heaven forbid c++11 :+|
> I wonder which compilers don't implement NRVO

The Microsoft compiler does implement non-mandatory copy elision in release builds;
in debug builds, only mandatory copy elision is performed.

Mainstream implementations consistently perform this optimisation where it is permitted;
to the extent that a warning may be generated if it is disabled by a programming construct.

std::vector<int> baz()
{
    std::vector<int> vec { 0, 1, 2, 3, 4, 5, 6 } ; 

    return std::move(vec) ; // copy elision is disabled
    // *** warning *** : moving a local object in a return statement prevents copy elision
    //                   note: remove std::move call here
}

http://coliru.stacked-crooked.com/a/62e18a3159419583


Note that passing a vector by reference to try and avoid copy construction (first default construct and then assign to it) tends to be sub-optimal, at least post C++11.
Thanks for the link. I think it makes sense that NRVO is doable and not mandatory.

I think you're suggesting that this code printing 5 dots means there's no copy.
However, printing 5 dots just means that 5 Thing objects were created during the push_back.
Then it copies the 5 objects to the container vect.
Whether there's a copy at the return is still unknown.

// Example program
#include <iostream>
#include <string>
#include <vector>

using std::vector;

struct Thing {
    Thing()
    {
        std::cout << ".\n";
    }
};

vector<Thing> MyFunction(){
  vector<Thing> vect;
  
  for (int i = 0; i < 5; i++)
  {
    vect.push_back({});   
  }

  return vect;
}


int main()
{
    //Calling my MyFunction
    auto returned_vect = MyFunction();
}
void MyFunction(vector<int> &vect, ...) {
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
}


Yeah, it's a bit annoying when I have to declare
 
vector<int> returned_vect; 

before calling the function


----
2nd method

vector<int> *MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return vect;
}

//Calling my MyFunction
vector<int> *vect = MyFunction(...);
delete vect;


I'm a novice and coming from C, I like pointers. However, I find my code very messy and confusing when there are references and pointers :(,

Would this work?
vector<int>& MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return *vect;
}


Then when/how is the object deleted/destroyed?
Thanks.




> I find my code very messy and confusing when there are references and pointers

Why do you want to use either references or pointers for this?
Why not:

#include <iostream>
#include <string>
#include <vector>

struct Thing  {

    Thing()
    {
        std::cout << ".\n"; // side effect of default constructor
    }
};

std::vector<Thing> MyFunction() {

  // return prvalue of a vector containing 5 default constructed objects
  return std::vector<Thing>( 5 ) ; // mandatory copy elision
}

int main() {

    const auto vec = MyFunction() ;
}
If you're worried about a copy happening on the function return value for say a vector, then in the function display the .data() value and also for the returned value from the function.

Consider:

#include <vector>
#include <iostream>

std::vector<int> MyFunction() {
	std::vector<int> vect (10, 5);

	std::cout << &vect << "  " << vect.data() << '\n';
	return vect;
}

int main() {
	const auto returned_vect{ MyFunction() };

	std::cout << &returned_vect << "  " << returned_vect.data() << '\n';
}


which for me displays:


000000000018F880  0000000000205740
000000000018F880  0000000000205740


where the address of returned_vect is the same as vect in the function - and their .data() values.

Now consider using move():

#include <vector>
#include <iostream>
#include <utility> // std::move

std::vector<int> MyFunction() {
	std::vector<int> vect (10, 5);

	std::cout << &vect << "  " << vect.data() << '\n';
	return std::move(vect);
}

int main() {
	const auto returned_vect{ MyFunction() };

	std::cout << &returned_vect << "  " << returned_vect.data() << '\n';
}


which for me displays:


000000000018FD70  00000000002E5740
000000000018FD88  00000000002E5740


where the addresses of .data() are the same but the address of the vector variables are different.
Would this work?
vector<int>& MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return *vect;
}


Then when/how is the object deleted/destroyed?

That is exactly the problem with this "solution". There would be no straight-forward way to destroy (or even know that you are supposed to destroy) the heap allocated object for the caller of your function!

In theory, the caller could use the & operator to get a pointer from the reference, just like you can get a pointer from a "local" object, and then it can be delete'd. But that "workaround" has a smell to me :-[

Yeah, it's a bit annoying when I have to declare
vector<int> returned_vect;
before calling the function

But it's extremely common to have functions that take a reference (or pointer (or iterator)) to an existing container object and then perform some kind of operation on that existing container.


After all, I think you can just return a std::vector object by value and let [N]RVO do its job ;-)
Well if you want to play with memory pointers (why??), then use unique_ptr so that you don't need to be concerned about freeing memory.

#include <vector>
#include <iostream>
#include <memory>
#include <numeric>

auto MyFunction() {
	auto vect {std::make_unique<std::vector<int>>(10)};

	std::iota(vect->begin(), vect->end(), 0);
	return vect;
}

int main() {
	auto v {MyFunction()};

	for (const auto& vv : *v)
		std::cout << vv << ' ';

	std::cout << '\n';
}



0 1 2 3 4 5 6 7 8 9

jeff - right, I should have added a copy ctor to my example as well.
[deleted]
> The addresses of internal Thing and returned Thing are different.
> What exactly should be done at the return to ensure mandatory copy elision?

To see the effect of copy elision, return the original local object; not a second one (line #47).
This is non-mandatory copy elision (copy elision that is permitted, but not required by the standard);
all mainstream implementations do perform this optimisation.

To see copy elision in action, run this (modified) code:

#include <iostream>
#include <string>
#include <vector>

using namespace std;
using std::vector;

struct Thing {
    Thing(int init = 0) : a(init)
    {
        std::cout << "Object created " << hex << (this) << endl;
    }

    Thing(const Thing &t) {
        std::cout << "Copy from :" << hex << &t << " to : " << this << " a = " << t.a << endl;
        this->a = t.a;
    }

    int a;
};

vector<Thing> MyFunction()
{
  vector<Thing> vect;
  vect.reserve(5) ; // **** reserve enough storage for 5 objects (to avoid later reallocation)

  for (int i = 0; i < 5; i++)
  {
    // vect.push_back(Thing(i));
    vect.emplace_back(i) ; // *** directly construct the Thing object inside the vector, in situ
  }

  cout << "Vector inside function " << hex << &vect << endl;
  return vect; // permissible copy elision
}

Thing MyThingFunction(int i){

    auto t = Thing(i); // non cv-qualified local object with automatic storage duration
    cout << "Thing inside function " << hex << &t << endl;
    // return Thing(i); // *** don't do this (instantiate a second object)
    return t ; // **** return the local object t: permissible copy elision
}

Thing MyThingFunction_2(int i) {

    return MyThingFunction(i) ; // mandatory copy elision: return prvalue of the same type
}


vector<Thing> MyFunction_2() {

   return MyFunction() ; // mandatory copy elision: return prvalue of the same type
}

int main( /* int argc, char** argv */ )
{

    auto returned_thing  = MyThingFunction_2(1); // *** MyThingFunction_2 => MyThingFunction
    cout << "Thing outside function " << hex << &returned_thing << " a = " << returned_thing.a << endl;

    cout << endl << endl << endl;
    auto returned_vect   = MyFunction_2(); // *** MyFunction_2 => MyFunction
    cout << "Vector outside function " << hex << &returned_vect << endl;

    // *** added ***
    std::cout << "objects in the vector are at: " ;
    for( const auto& th : returned_vect ) std::cout << std::addressof(th) << ' ' ;
    std::cout << '\n' ;

    //vector<Thing> referenced_vect;
    //MyFunction1(referenced_vect);
}


http://coliru.stacked-crooked.com/a/767501ea3abf125a
Yeah, realized my mistake after posting.
Thanks very much for your code, JLBorges!
It's clearer now.