Function returns an object

Dec 4, 2021 at 7:17pm
Hi, what happens at the return of this piece of code?

vector<int> MyFunction(...){
  vector<int> vect;
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
  return vect;
}

//Calling my MyFunction
auto returned_vect = MyFunction(...);


At the return, is there a copy of vect to returned_vect? If there is, how can I avoid the copy?

Thanks!

Dec 4, 2021 at 8:10pm
What you're referring to is whether copy elision is happening.

It's mandatory in some cases, and an optional compiler optimization in other cases.
https://en.cppreference.com/w/cpp/language/copy_elision

A language expert can probably give a better description, but that article above explains it in depth.

I think in this case, it's covered by
Non-mandatory elision of copy/move (since C++11) operations
...
In a return statement, when the operand is the name of a non-volatile object with automatic storage duration, which isn't a function parameter or a catch clause parameter, and which is of the same class type (ignoring cv-qualification) as the function return type. This variant of copy elision is known as NRVO, "named return value optimization".

This NRVO is a compiler optimization and not mandatory. But I might be wrong, and I encourage anyone else to also reply.
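
The NRVO wording quoted above can be probed with a small sketch. Tracker and the function names below are hypothetical, made up for illustration. The type counts copy and move constructions: a named local of the return type is eligible for NRVO, while a function parameter is not, but in both cases the language tries a move before a copy, so the copy constructor should not run.

```cpp
#include <cassert>

// Hypothetical helper type: counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Named local of the return type: eligible for NRVO (not guaranteed).
Tracker return_local() {
    Tracker t;
    return t;
}

// Function parameter: not eligible for NRVO, but implicitly moved on return.
Tracker return_parameter(Tracker t) {
    return t;
}

// Usage sketch: neither call should invoke the copy constructor.
void demo_no_copies() {
    Tracker a = return_local();
    Tracker b = return_parameter(Tracker{});
    assert(Tracker::copies == 0);
}
```

Whether return_local performs zero moves or one depends on whether the compiler applies NRVO, which is why the sketch only checks the copy counter.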

Studying the assembly produced can also help demystify things.

You can also do things like make the constructor have side-effects...
// Example program
#include <iostream>
#include <string>
#include <vector>

using std::vector;

struct Thing {
    Thing()
    {
        std::cout << ".\n";
    }
};

vector<Thing> MyFunction(){
  vector<Thing> vect;
  
  for (int i = 0; i < 5; i++)
  {
    vect.push_back({});   
  }

  return vect;
}


int main()
{
    //Calling my MyFunction
    auto returned_vect = MyFunction();
}

The program only prints 5 dots, even with optimization turned off, so it's not doing any unnecessary copying.
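
Note that the dots only count default constructions. To also see copies and moves, the example could be extended along these lines. This is a sketch: CountedThing, make_things and demo_return_by_value are hypothetical names, and reserve/emplace_back are used so that filling the vector does not itself copy or move elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant of Thing that also counts copies and moves.
struct CountedThing {
    static int copies;
    static int moves;
    CountedThing() = default;
    CountedThing(const CountedThing&) { ++copies; }
    CountedThing(CountedThing&&) noexcept { ++moves; }
};
int CountedThing::copies = 0;
int CountedThing::moves = 0;

std::vector<CountedThing> make_things(std::size_t n) {
    std::vector<CountedThing> vect;
    vect.reserve(n);            // avoid reallocation while filling
    for (std::size_t i = 0; i < n; ++i) {
        vect.emplace_back();    // construct each element in place
    }
    return vect;                // NRVO, or at worst a move of the vector
}

// Usage sketch: returning the vector does not copy its elements.
std::size_t demo_return_by_value() {
    std::vector<CountedThing> returned_vect = make_things(5);
    assert(CountedThing::copies == 0);
    return returned_vect.size();
}
```

With this, the copy counter stays at zero after the call: returning the vector by value either elides the copy (NRVO) or moves the vector, and neither touches the individual elements.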
Last edited on Dec 4, 2021 at 8:18pm
Dec 4, 2021 at 8:17pm
A simple way to be sure that no copy is going to happen is passing a reference:
(note the & before the parameter name!)
void MyFunction(vector<int> &vect, ...) {
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
}

//Calling my MyFunction
vector<int> vect;
MyFunction(vect, ...);


...either that, or have the function allocate a vector on the heap and return a pointer:
vector<int> *MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return vect;
}

//Calling my MyFunction
vector<int> *vect = MyFunction(...);

// Don't leak the memory!
delete vect;
Last edited on Dec 4, 2021 at 8:51pm
Dec 5, 2021 at 3:06am
@kigar64551

Just my 2 cents worth, I don't wish to argue per se: you sound way more experienced/knowledgeable than me, a back yard basher :+) Welcome to the forum btw, I have enjoyed reading your posts so far :+)

If not using copy elision, I would much prefer returning a reference using the out parameter as in your first code snippet.

Using new with a vector, which already puts its data on the heap, I am not so much of a fan. One reason is that std::vector is a RAII container: using new kind of negates that.

In the past I have said that C++ compilers seem to implement references as const pointers wrapped up in some TMP, no one disagreed then, that might mean the statement was correct. I am yet to see an example of how to do it otherwise. I am aware that the C++ standard does not specify how to implement things, that is left up to the vendors. So using a reference and a pointer as per these two examples might be the same thing.

I wonder which compilers don't implement NRVO; it seems that g++ and clang++ do. I am not sure that I can test the MS one, as I am a Linux guy :+) (I don't know whether it's possible to install the MS compiler into Visual Studio Code on Linux.)

The other thing with this is it depends on which standard one is compiling against. I thought I would mention that, because I always use the latest versions of everything, I sometimes forget about the poor souls still stuck with c++14, or heaven forbid c++11 :+|
Dec 5, 2021 at 4:09am
> I wonder which compilers don't implement NRVO

The Microsoft compiler does implement non-mandatory copy elision in release builds;
in debug builds, only mandatory copy elision is performed.

Mainstream implementations consistently perform this optimisation where it is permitted;
to the extent that a warning may be generated if it is disabled by a programming construct.

#include <utility>
#include <vector>

std::vector<int> baz()
{
    std::vector<int> vec { 0, 1, 2, 3, 4, 5, 6 } ; 

    return std::move(vec) ; // copy elision is disabled
    // *** warning *** : moving a local object in a return statement prevents copy elision
    //                   note: remove std::move call here
}

http://coliru.stacked-crooked.com/a/62e18a3159419583


Note that passing a vector by reference to try and avoid copy construction (first default construct and then assign to it) tends to be sub-optimal, at least post C++11.
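
One way to read that remark, as a minimal sketch with hypothetical function names: with an out-parameter the caller first default-constructs a vector and the function then has to deal with whatever it already contains, whereas returning by value builds the result once and also lets the caller declare it const.

```cpp
#include <cassert>
#include <vector>

// Out-parameter style: the caller constructs the vector first, and the
// function must decide what to do with any existing contents.
void fill_squares(std::vector<int>& out, int n) {
    assert(n >= 0);
    out.clear();
    for (int i = 0; i < n; ++i) {
        out.push_back(i * i);
    }
}

// Return-by-value style: the result is built once and handed back.
std::vector<int> make_squares(int n) {
    assert(n >= 0);
    std::vector<int> result;
    for (int i = 0; i < n; ++i) {
        result.push_back(i * i);
    }
    return result;
}

// Usage sketch: both styles produce the same values.
bool demo_same_result(int n) {
    std::vector<int> a;                           // cannot be const: filled later
    fill_squares(a, n);
    const std::vector<int> b = make_squares(n);   // can be const
    return a == b;
}
```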
Dec 5, 2021 at 8:48am
Thanks for the link. I think it makes sense that NRVO is doable and not mandatory.

I think you're suggesting that this code printing 5 dots means there's no copy.
However, printing 5 dots just means that 5 Thing objects were created during the push_back.
Then it copies the 5 objects to the container vect.
Whether there's a copy at the return is still unknown.

// Example program
#include <iostream>
#include <string>
#include <vector>

using std::vector;

struct Thing {
    Thing()
    {
        std::cout << ".\n";
    }
};

vector<Thing> MyFunction(){
  vector<Thing> vect;
  
  for (int i = 0; i < 5; i++)
  {
    vect.push_back({});   
  }

  return vect;
}


int main()
{
    //Calling my MyFunction
    auto returned_vect = MyFunction();
}
Dec 5, 2021 at 8:58am
void MyFunction(vector<int> &vect, ...) {
  //do something to populate the vector here
  while(..){
    vect.push_back(something);
  }
}


Yeah, it's a bit annoying when I have to declare
 
vector<int> returned_vect; 

before calling the function


----
2nd method

vector<int> *MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return vect;
}

//Calling my MyFunction
vector<int> *vect = MyFunction(...);
delete vect;


I'm a novice coming from C, and I like pointers. However, I find my code very messy and confusing when there are both references and pointers :(

Would this work?
vector<int>& MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return *vect;
}


Then when/how is the object deleted/destroyed?
Thanks.




Last edited on Dec 5, 2021 at 8:59am
Dec 5, 2021 at 9:35am
> I find my code very messy and confusing when there are references and pointers

Why do you want to use either references or pointers for this?
Why not:

#include <iostream>
#include <string>
#include <vector>

struct Thing  {

    Thing()
    {
        std::cout << ".\n"; // side effect of default constructor
    }
};

std::vector<Thing> MyFunction() {

  // return prvalue of a vector containing 5 default constructed objects
  return std::vector<Thing>( 5 ) ; // mandatory copy elision
}

int main() {

    const auto vec = MyFunction() ;
}
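
The "mandatory copy elision" comment above refers to C++17 and later, where returning a prvalue constructs the object directly in its final destination. A minimal sketch that counts copies and moves through a chain of two functions (Probe and the function names are hypothetical, and the zero count is only guaranteed when compiling as C++17 or later):

```cpp
#include <cassert>

// Hypothetical helper type: counts how often it is copied or moved.
struct Probe {
    static int copies;
    static int moves;
    int value;
    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
    Probe(Probe&& other) noexcept : value(other.value) { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

Probe make_probe(int v) {
    return Probe(v);          // prvalue: initializes the caller's object directly
}

Probe forward_probe(int v) {
    return make_probe(v);     // still a prvalue of the same type
}

// Usage sketch: returns the total number of copy/move constructions.
int demo_prvalue_chain() {
    const Probe p = forward_probe(42);
    assert(p.value == 42);
    return Probe::copies + Probe::moves;
}
```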
Dec 5, 2021 at 10:33am
If you're worried about a copy happening when a function returns, say, a vector, print the address of the vector and its .data() value inside the function, then print the same two values for the object returned to the caller.

Consider:

#include <vector>
#include <iostream>

std::vector<int> MyFunction() {
	std::vector<int> vect (10, 5);

	std::cout << &vect << "  " << vect.data() << '\n';
	return vect;
}

int main() {
	const auto returned_vect{ MyFunction() };

	std::cout << &returned_vect << "  " << returned_vect.data() << '\n';
}


which for me displays:


000000000018F880  0000000000205740
000000000018F880  0000000000205740


Both the address of the vector object and its .data() value are the same inside the function and in main, so vect and returned_vect are the same object and nothing was copied or moved.

Now consider using move():

#include <vector>
#include <iostream>
#include <utility> // std::move

std::vector<int> MyFunction() {
	std::vector<int> vect (10, 5);

	std::cout << &vect << "  " << vect.data() << '\n';
	return std::move(vect);
}

int main() {
	const auto returned_vect{ MyFunction() };

	std::cout << &returned_vect << "  " << returned_vect.data() << '\n';
}


which for me displays:


000000000018FD70  00000000002E5740
000000000018FD88  00000000002E5740


Here the .data() addresses are the same, but the addresses of the vector objects differ.
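
The likely explanation: return std::move(vect) disables NRVO, so a distinct vector object is move-constructed, and the move constructor adopts the existing element buffer instead of copying it. A minimal sketch of just that move step (move_reuses_buffer is a hypothetical name; the buffer check reflects how mainstream standard libraries behave with the default allocator):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns true if move-constructing a vector reuses the source's heap buffer.
bool move_reuses_buffer() {
    std::vector<int> source(10, 5);
    const int* buffer_before = source.data();

    std::vector<int> target(std::move(source));   // a new vector object ...
    assert(target.size() == 10);
    return target.data() == buffer_before;        // ... that adopts the old buffer
}
```

So the move is cheap (a few pointer-sized members change hands), but plain return vect; is still preferable because it lets the compiler skip even that.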
Last edited on Dec 5, 2021 at 10:34am
Dec 5, 2021 at 1:04pm
Would this work?
vector<int>& MyFunction(...) {
  vector<int> *vect = new vector<int>();
  //do something to populate the vector here
  while(..){
    vect->push_back(something);
  }
  return *vect;
}


Then when/how is the object deleted/destroyed?

That is exactly the problem with this "solution". There would be no straightforward way for the caller of your function to destroy the heap-allocated object (or even to know that they are supposed to destroy it)!

In theory, the caller could use the & operator to get a pointer from the reference, just like you can get a pointer from a "local" object, and then it can be delete'd. But that "workaround" has a smell to me :-[
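
For illustration only, that workaround could look like the sketch below (make_on_heap and demo_manual_cleanup are hypothetical names). It is shown to make the ownership problem visible, not as a recommendation.

```cpp
#include <cassert>
#include <vector>

// Hypothetical function in the style asked about: heap-allocates and
// returns a reference, so ownership is invisible in the signature.
std::vector<int>& make_on_heap(int n) {
    std::vector<int>* vect = new std::vector<int>();
    for (int i = 0; i < n; ++i) {
        vect->push_back(i);
    }
    return *vect;
}

// Usage sketch: the caller must bind a reference (not a copy) and
// remember to delete through the address of that reference.
int demo_manual_cleanup() {
    std::vector<int>& ref = make_on_heap(3);
    assert(!ref.empty());
    const int last = ref.back();
    delete &ref;   // easy to forget
    return last;
}
```

Writing auto v = make_on_heap(3); instead would copy the vector into v and leave the heap object with no remaining handle, which is a leak.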

Yeah, it's a bit annoying when I have to declare
vector<int> returned_vect;
before calling the function

But it's extremely common to have functions that take a reference (or pointer (or iterator)) to an existing container object and then perform some kind of operation on that existing container.


After all, I think you can just return a std::vector object by value and let [N]RVO do its job ;-)
Last edited on Dec 5, 2021 at 1:37pm
Dec 5, 2021 at 1:52pm
Well if you want to play with memory pointers (why??), then use unique_ptr so that you don't need to be concerned about freeing memory.

#include <vector>
#include <iostream>
#include <memory>
#include <numeric>

auto MyFunction() {
	auto vect {std::make_unique<std::vector<int>>(10)};

	std::iota(vect->begin(), vect->end(), 0);
	return vect;
}

int main() {
	auto v {MyFunction()};

	for (const auto& vv : *v)
		std::cout << vv << ' ';

	std::cout << '\n';
}



0 1 2 3 4 5 6 7 8 9

Last edited on Dec 5, 2021 at 1:53pm
Dec 5, 2021 at 4:32pm
jeff - right, I should have added a copy ctor to my example as well.
Dec 7, 2021 at 5:18am
[deleted]
Last edited on Dec 7, 2021 at 5:22am
Dec 7, 2021 at 5:49am
> The addresses of internal Thing and returned Thing are different.
> What exactly should be done at the return to ensure mandatory copy elision?

To see the effect of copy elision, return the original local object; not a second one (line #47).
This is non-mandatory copy elision (copy elision that is permitted, but not required by the standard);
all mainstream implementations do perform this optimisation.

To see copy elision in action, run this (modified) code:

#include <iostream>
#include <string>
#include <vector>

using namespace std;
using std::vector;

struct Thing {
    Thing(int init = 0) : a(init)
    {
        std::cout << "Object created " << hex << (this) << endl;
    }

    Thing(const Thing &t) {
        std::cout << "Copy from :" << hex << &t << " to : " << this << " a = " << t.a << endl;
        this->a = t.a;
    }

    int a;
};

vector<Thing> MyFunction()
{
  vector<Thing> vect;
  vect.reserve(5) ; // **** reserve enough storage for 5 objects (to avoid later reallocation)

  for (int i = 0; i < 5; i++)
  {
    // vect.push_back(Thing(i));
    vect.emplace_back(i) ; // *** directly construct the Thing object inside the vector, in situ
  }

  cout << "Vector inside function " << hex << &vect << endl;
  return vect; // permissible copy elision
}

Thing MyThingFunction(int i){

    auto t = Thing(i); // non cv-qualified local object with automatic storage duration
    cout << "Thing inside function " << hex << &t << endl;
    // return Thing(i); // *** don't do this (instantiate a second object)
    return t ; // **** return the local object t: permissible copy elision
}

Thing MyThingFunction_2(int i) {

    return MyThingFunction(i) ; // mandatory copy elision: return prvalue of the same type
}


vector<Thing> MyFunction_2() {

   return MyFunction() ; // mandatory copy elision: return prvalue of the same type
}

int main( /* int argc, char** argv */ )
{

    auto returned_thing  = MyThingFunction_2(1); // *** MyThingFunction_2 => MyThingFunction
    cout << "Thing outside function " << hex << &returned_thing << " a = " << returned_thing.a << endl;

    cout << endl << endl << endl;
    auto returned_vect   = MyFunction_2(); // *** MyFunction_2 => MyFunction
    cout << "Vector outside function " << hex << &returned_vect << endl;

    // *** added ***
    std::cout << "objects in the vector are at: " ;
    for( const auto& th : returned_vect ) std::cout << std::addressof(th) << ' ' ;
    std::cout << '\n' ;

    //vector<Thing> referenced_vect;
    //MyFunction1(referenced_vect);
}


http://coliru.stacked-crooked.com/a/767501ea3abf125a
Last edited on Dec 7, 2021 at 5:51am
Dec 7, 2021 at 9:05am
Yeah, realized my mistake after posting.
Thanks very much for your code, JLBorges!
It's clearer now.
Last edited on Dec 7, 2021 at 9:06am