deep copy logic c++

I just want to make sure I understand the idea behind deep copy correctly. I'm just a first-year CS student.

For example, if object A = {1, 2, 3}, and we create a new object B = {} with no elements.

B = A (using the assignment operator) or B(A) (using the copy constructor). Now B = {1, 2, 3} because we copied the elements of object A into object B.

Now we add a new element, 4, to object A so that A = {1, 2, 3, 4}. Does that mean object B now has to contain 4 as well (i.e., B = {1, 2, 3, 4})?


Please consider that our data structure is an array. I know we cannot just set elements to an object, but imagine we already added those elements using public functions of the class like insert.

Does that mean object B now have to contain 4 as well

No, and it has nothing to do with whether or not B is a deep copy of A. A copied object reflects the state of the original object only at the time of copying; it does not capture any later changes to the original. For that we'd need a reference:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> A {1, 2, 3};
    auto B = A;  // B is a std::vector<int>: an independent copy of A
    auto& C = A; // C is a std::vector<int>&: another name for A itself

    A.push_back(4);

    std::cout << "A: ";
    for (const auto& elem : A) std::cout << elem << " "; // 1 2 3 4

    std::cout << "\nB: ";
    for (const auto& elem : B) std::cout << elem << " "; // 1 2 3

    std::cout << "\nC: ";
    for (const auto& elem : C) std::cout << elem << " "; // 1 2 3 4
}

OK, yeah, you are right. For object B to keep the same elements as object A, B has to be a reference to A.

@gunnerfunner, my professor specifically wants me to implement a deep copy using my copy constructor for one purpose: if we manipulate one object, then the other one gets changed too. Does that mean I can make it work using only the & sign? Is there nothing else that can be done in the case of a deep copy?



Two separate things are going on here: (a) the copy ctor and (b) references.
You can leave the copy ctor as is (or even rely on the compiler-generated one, as the struct in the following example does), but for a variable to reflect subsequent changes to the original, it must be a reference to the original rather than a copy:
#include <iostream>
#include <string>

struct DeepCopy
{
    std::string m_name;
};

int main()
{
    DeepCopy a{"James"};
    auto& b = a; // b is a reference to a; no copy constructor runs here
    a.m_name = "Jane";
    std::cout << b.m_name << "\n"; // prints "Jane"
}
Ok, so there are some terminology issues here. An object has member values:

struct A
{
  int x, y;
};

A a;

Now, if I make a copy of a, all the member values get copied, just as we expect:

A b = a;
assert( b.x == a.x );
assert( b.y == a.y );


However, for objects that hold references or pointers to other data, the compiler cannot assume that you want the referenced data duplicated; it copies the reference (or pointer) itself but does nothing with the data being referred to.

struct B
{
  int* x;
};

int y = 7;
B b;
b.x = &y;

B c = b;

assert( b.x == c.x );  // both point to y

This is where the idea of a deep copy comes in. Sometimes you need dynamically allocated memory to store your data, and your object is the sole owner of that memory.

#include <cstring>  // strlen, strcpy

struct string
{
  char*  data;
  size_t length;

  string( const char* s )
  {
    length = strlen( s );
    data = new char[ length + 1 ];
    strcpy( data, s );
  }
};

string s = "Hello world!";

So, now what happens if we do our normal copy?

string z = s;

assert( s.data == z.data );  // hey, this is true, so no failure 

We call this a shallow copy. Both 's' and 'z' point to the same internal data.

This causes some interesting behavior:

string s = "Hello world!";
string z = s;
z.data[0] = '\'';

std::cout << s.data << "\n";  // prints: 'ello world!


It is worse than that though. It is a serious problem:

struct string
{
  char*  data;
  size_t length;

  string( const char* s )
  {
    length = strlen( s );
    data = new char[ length + 1 ];
    strcpy( data, s );
  }
  
  ~string()
  {
    delete[] data;
  }
};
bool is_longer_than_5( string s )
{
  return s.length > 5;
}
string greeting = "Hello world!";

if (is_longer_than_5( greeting ))
  std::cout << greeting.data << "\n";  // WARNING WILL ROBINSON!

*This might not actually fail on your compiler because it is so simple that the code doesn't really have an opportunity to clobber the memory first.

When you copied greeting into the argument s, you did a shallow copy. When s was destroyed at the end of the function, its destructor deleted the buffer that greeting still points to.

Hence, the std::cout line is trying to access invalid memory, and greeting's own destructor will later delete the same buffer a second time. Both are undefined behavior; the most likely outcome is a crash due to invalid memory access.

If you had made a deep copy, this could be avoided:

struct string
{
  char*  data;
  size_t length;

  string( const char* s = "" )
  {
    length = strlen( s );
    data = new char[ length + 1 ];
    strcpy( data, s );
  }

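  string( const string& s ) : data( nullptr ), length( 0 ) { copy( s ); }
  string& operator = ( const string& s ) { copy( s ); return *this; }

  void copy( const string& s )
  {
    char* fresh = new char[ s.length + 1 ];  // allocate our own buffer first
    strcpy( fresh, s.data );                 // copy the characters, not the pointer
    delete[] data;                           // release the old buffer (a null pointer is fine)
    data = fresh;
    length = s.length;
  }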
  
  ~string()
  {
    delete[] data;
  }
};

Now when you create a copy of a string, it also copies the referenced data.

string s = "Hello world!";
string z = s;
z.data[9] = 'm';
z.data[10] = 's';

std::cout << s.data << "\n";  // prints "Hello world!"
std::cout << z.data << "\n";  // prints "Hello worms!" 
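
The same idea can be applied to the array from the original question. What follows is only a minimal sketch: the class name IntArray and the members insert, size and at are made up for illustration and are not taken from the assignment. It shows a copy constructor and a copy assignment operator that each allocate a separate buffer, so that later inserts into one object do not show up in the other.

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>    // std::size_t

// Hypothetical class for illustration only; names are assumptions.
class IntArray
{
  int*        data;
  std::size_t count;
  std::size_t capacity;

public:
  IntArray() : data( new int[ 4 ] ), count( 0 ), capacity( 4 ) { }

  // Copy constructor: allocate a separate buffer and copy the elements into it.
  IntArray( const IntArray& other )
    : data( new int[ other.capacity ] ), count( other.count ), capacity( other.capacity )
  {
    std::copy( other.data, other.data + other.count, data );
  }

  // Copy assignment: build the new buffer first, then release the old one.
  // Doing it in this order also keeps self-assignment (a = a) safe.
  IntArray& operator = ( const IntArray& other )
  {
    int* fresh = new int[ other.capacity ];
    std::copy( other.data, other.data + other.count, fresh );
    delete[] data;
    data     = fresh;
    count    = other.count;
    capacity = other.capacity;
    return *this;
  }

  ~IntArray() { delete[] data; }

  // Append a value, doubling the buffer when it is full.
  void insert( int value )
  {
    if (count == capacity)
    {
      std::size_t bigger = capacity * 2;
      int* fresh = new int[ bigger ];
      std::copy( data, data + count, fresh );
      delete[] data;
      data     = fresh;
      capacity = bigger;
    }
    data[ count++ ] = value;
  }

  std::size_t size() const { return count; }
  int at( std::size_t i ) const { return data[ i ]; }  // no bounds check in this sketch
};
```

With this sketch, if A holds 1, 2, 3 and B is copy-constructed from A, a later A.insert( 4 ) leaves B.size() at 3, which matches the answer given at the top of the thread.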

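Implementing a deep copy is required when you know that you are the sole owner of some referenced data, and it brings in the Rule of Three: a class that needs a custom destructor, copy constructor, or copy assignment operator almost always needs all three. https://en.wikipedia.org/wiki/Rule_of_three_(C%2B%2B_programming)
These days it is the Rule of Five, which adds the move constructor and the move assignment operator. For your assignment, you need only worry about the Three.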
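
For the curious, here is a rough sketch of what all five members might look like on a string-like struct. The name my_string is made up so that it does not clash with the struct above, and the copy-then-swap assignment is just one common way to write it, not the only one. It needs C++11 or later.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstring>   // std::strlen, std::strcpy, std::strcmp
#include <utility>   // std::swap, std::move

// Hypothetical struct for illustration only.
struct my_string
{
  char*       data;
  std::size_t length;

  my_string( const char* s = "" )
    : data( new char[ std::strlen( s ) + 1 ] ), length( std::strlen( s ) )
  {
    std::strcpy( data, s );
  }

  // 1. Copy constructor: deep copy into a freshly allocated buffer.
  my_string( const my_string& s )
    : data( new char[ s.length + 1 ] ), length( s.length )
  {
    std::strcpy( data, s.data );
  }

  // 2. Copy assignment: make a deep copy, then swap buffers with it.
  //    The temporary's destructor releases our old buffer.
  my_string& operator = ( const my_string& s )
  {
    my_string tmp( s );
    std::swap( data, tmp.data );
    std::swap( length, tmp.length );
    return *this;
  }

  // 3. Move constructor: take over the buffer instead of copying it.
  //    The source is left with a null pointer, which is safe to destroy.
  my_string( my_string&& s ) noexcept
    : data( s.data ), length( s.length )
  {
    s.data   = nullptr;
    s.length = 0;
  }

  // 4. Move assignment: release our buffer, then take over the source's.
  my_string& operator = ( my_string&& s ) noexcept
  {
    if (this != &s)
    {
      delete[] data;
      data     = s.data;
      length   = s.length;
      s.data   = nullptr;
      s.length = 0;
    }
    return *this;
  }

  // 5. Destructor: delete[] on a null pointer is a no-op.
  ~my_string() { delete[] data; }
};
```

Note that in this sketch a moved-from my_string has a null data pointer, so it can be destroyed or assigned to but should not be printed.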

This was all typed in off the top of my head. Typos may have occurred.
Hope this helps.