Lazy copying - deep copy problem

Hi,
I have a class which is using lazy copying - when a copy constructor is called, it creates shallow copy and when one method is called it creates a deep copy and add some more data.

I'm stuck in part where I should create a deep copy from that shallow copy.

Deep copy should look like this:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
  
  m_count = copy.m_count;
  k = copy.k;
  m_record = new TRecord * [k*SIZE];
  char * temp;
  for(int i=0;i<m_count;i++) {
    m_record[i] = new TRecord;
    temp = new char[12];
    strcpy(temp, copy.m_record[i]->Id);
    m_record[i]->Id = temp;

    temp = new char[strlen(copy.m_record[i]->Name) + 1];
    strcpy(temp, copy.m_record[i]->Name);
    m_record[i]->Name = temp;

    temp = new char[strlen(copy.m_record[i]->Surname) + 1];
    strcpy(temp, copy.m_record[i]->Surname);
    m_record[i]->Surname = temp;

    m_record[i]->p_count = copy.m_record[i]->p_count;
    m_record[i]->k = copy.m_record[i]->k;

    m_record[i]->Place = new TPlace * [m_record[i]->k*10];
    for(int j=0;j<m_record[i]->p_count;j++) {

      m_record[i]->Place[j] = new TPlace;

      temp = new char[strlen(copy.m_record[i]->Place[j]->Date) + 1];
      strcpy(temp, copy.m_record[i]->Place[j]->Date);
      m_record[i]->Place[j]->Date = temp;

      temp = new char[strlen(copy.m_record[i]->Place[j]->Street) + 1];
      strcpy(temp, copy.m_record[i]->Place[j]->Street);
      m_record[i]->Place[j]->Street = temp;

      temp = new char[strlen(copy.m_record[i]->Place[j]->City) + 1];
      strcpy(temp, copy.m_record[i]->Place[j]->City);
      m_record[i]->Place[j]->City = temp;
    }
  }


but I don't know how to implement it that method. I have tried to create a temporary object and fill it with *this
temp.m_count = this->m_count;...
and at the end
*this=temp;
but it doesn't work.

How should I do it? Thanks for your replies.
Last edited on
How should I do it?


The easiest, safest, and best way to do it would be to not use new or char arrays or pointers at all, but instead use std::vector and std::string.

1
2
3
4
5
6
7
8
9
struct TRecord
{
    std::string Name;     // better than   char* Name
    std::string Surname;  // better than   char* Surname
    // ... etc
};

//.. in your owner class:
std::vector<TRecord>  m_record;  // MUCH better than TRecord** m_record 



If you do that, all copies will be deep copies and you will not have to write any copy constructor or assignment operator, as the compiler provided versions will take care of all of it for you. You also will not have to worry about memory leaks or anything (currently you are leaking tons of memory because you are not deleting anything).
I'm forbidden to use STL and string in this task.

Why I don't create deep copy in first place? Because there are so many copies and just few of them are changed and they take so much memory.

I'm not leaking anything. Don't worry, everything is deleted in destructor.
Last edited on
I'm forbidden to use STL and string in this task.


Ah... of course. Acadamia loves to teach people the wrong way to do things.

Why I don't create deep copy in first place? Because there are so many copies and just few of them are changed and they take so much memory.


Oye... so you want to mix deep and shallow copies? That's atrocious. If your class is telling you to do this, it is teaching you really bad habits.

All copies should be deep copies. If you want to have a shallow copy... don't copy it. Instead use a pointer to refer to the original object:

1
2
3
4
5
6
MyClass a;

MyClass b = a;  // should be a DEEP copy!

// don't want a deep copy?  use a pointer!
MyClass* c = &a;  // not a deep copy. 


I'm not leaking anything. Don't worry, everything is deleted through destructor.


- Unless the code you posted is from the copy constructor, it leaks memory.

- If you have shallow copies and are deleting them in the destructor, you will cause your program to crash because you'll be deleting the same data multiple times.


You're in real ugly territory here, and the best way to do it right is to clearly define who owns what memory. This becomes very complicated when you "share" ownership (ie: shallow copy) because then memory has no clear owner and the task of cleaning it up becomes awkward and error prone.

My advice:

1) Don't have anything do shallow copies. Ever. The closest thing to shallow copies you should ever do are "copy on write" semantics, but even that is usually not a good idea.

2) Memory should be owned and managed by whichever class defines it. In this case, since TRecord has a bunch of char*'s, then the TRecord class should be the one responsible for allocating/destroying that data:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
struct TRecord  // or you can make this a class with private members if you want encapsulation
{
    char* Name;

    // default ctor (to make sure the pointer is always valid
    TRecord()
    {
        Name = new char[1];
        strcpy( Name, "" );
    }

    // copy ctor (to deep copy)
    TRecord(const TRecord& copy)
    {
        Name = new char[ strlen(copy.Name) + 1 ];  // thanks LowestOne for correction
        strcpy( Name, copy.Name );
    }

    // assignment operator (to deep copy on assignment)
    //   note there are much better ways to implement this, I'm just showing this because
    //   it's the easiest way to illustrate.  Again, the "best" way to implement this is to
    //   NOT implement it and let std::string worry about this crap.
    TRecord& operator = (const TRecord& copy)
    {
        if(&copy == this) return *this;  // to prevent self-assignment

        delete[] Name;  // old data must be deleted before we make it point to new data
        Name = new char[ strlen(copy.Name) +1 ];  // thanks LowestOne for correction
        strcpy( Name, copy.Name );
    }

    // dtor
    ~TRecord()
    {
        delete[] Name;
    }
};



If you have that set up, the "owner" class which has the array of TRecords and can just copy the TRecords in full without having to copy each individual member.


Likewise, you should also do this with TPlace... so then TRecord does not have to be responsible for copying TPlace's members.


3) Don't allocate pointers to pointers just for a single object. This made me want to vomit:

1
2
3
m_record = new TRecord * [k*SIZE];
//...
   m_record[i] = new TRecord;  // BARF! 


There is no reason for m_record[i] to be a pointer here. You are just making it needlessly complicated and slowing it down by doing extra allocation and cleanup. Just make your array and array of objects instead of an array of pointers.

I also don't know why you are allocating k*SIZE elements, but then only assigning m_count elements. This raises a few questions:
- What is k and why does it exist?
- What is the difference between SIZE and m_count?

Unless this is a size/capacity thing where you are overallocating in anticipation of future growth... but even then...

Anyway I went on a tangent here. The point is... allocate an array of object, not of pointers. What's more, since those objects all take care of their own copying, assigning them is easy:

1
2
3
4
5
6
7
  m_count = copy.m_count;
  k = copy.k;
  m_record = new TRecord[k*SIZE];  // allocate objects, not pointers.
       // m_record should be a TRecord*, not a TRecord**

  for(int i = 0; i < k*SIZE; ++i)
    m_record[i] = copy.m_record[i];  // let the class's assignment operator do the actual work 
Last edited on
- Unless the code you posted is from the copy constructor, it leaks memory.

No, my copy constructor does shallow copy.
- If you have shallow copies and are deleting them in the destructor, you will cause your program to crash because you'll be deleting the same data multiple times.

I have a flag where it says if the object is original or copy. If it is original, destructor destroys it.

I did deep copies for everything, but it took too much memory and that's why I have to to this this way.

I really appreciate your help, but I really have to do this this way :-/
Last edited on
Hi Disch, just going to point this out:
Name = new char[ strlen(copy.Name) + 1];

Anyway, maybe you can help me. I've read about move constructors, but the context was odd, can you explain:
TRecord(const TRecord&& copy)

The purpose, as I understand, is when we know that the old "Name" isn't going to be used anymore we can do this something like this:
1
2
  Name = copy.Name;
  copy.Name = NULL;

Thanks

Edit: If there was a move ctor, how would I get that called rather than just the copy ctor?
Last edited on
I have a flag where it says if the object is original or copy. If it is original, destructor destroys it.


Then the code you posted has leaks. You'd have to check to see if the data was original or not, and it it was, delete it before reallocating.

But of course even that is problematic because you can't delete something that has a copy or else the copy's pointers will go bad.

Example:

1
2
3
4
5
MyClass a;  // the original
MyClass b = a;  // a shallow copy (refers to a's data)

MyClass c;
a = DeepCopy(c);  // Huge problem.  What happens now? 


This is a "you can't win" situation.

- If you delete a's data, then the b object will go bad because it's just a shallow copy of a and all the data it points to no longer exists.

- If you don't delete a's data, then you leak memory because b's dtor won't delete it either.

The only way around this is to do some kind of reference counting memory management (like std::shared_ptr... which of course you can't use).


Again... this is really, really ugly territory you're in. You really should not be doing it this way. It's a minefield and you have to be extremely careful to get it right.

I did deep copies for everything, but it took too much memory and that's why I have to to this by this way.


I find that very hard to believe.

Even if each record is 1000 bytes (based on what I see here, it's not) and if you have 100,000 records ... that's only 100 MB of RAM.

That's nothing. On a machine with only 2 GB of RAM, that's only 5% of available memory.


What's more likely is that you were using tons of memory because your code was full of memory leaks.



EDIT:

Another reason why you might have been using too much memory is this:

 
m_record = new TRecord * [k*SIZE];


What is SIZE? If it's sizeof(TRecord), then you are doing it wrong, and allocating waaaaaaaaaay too much memory.

new is not malloc. You do not specify the object size. The number you give it is the number of elements you want... not the number of bytes.
Last edited on
This is a "you can't win" situation.

Ok, you are right. So I could do it using reference counting, but still I don't how to do those deep copies.

Even if each record is 1000 bytes (based on what I see here, it's not) and if you have 100,000 records ... that's only 100 MB of RAM.

It is a homework and I upload it on server, which tests it and it is limited.

What is SIZE?

SIZE is defined size of the array (1000). In tests program calls much much more than 1000 (if it was more than 1000 I realloc it).
Last edited on
So I could do it using reference counting,


You could but you really shouldn't. This is yet another added layer of complexity to an already overly complex program. The more things like this you add, the more confusing the code will be, and the more bugs/memory leaks you'll have.

The simplest solution here really is the best solution. And the simplest solution is to not do any shallow copies.

but still I don't how to do those deep copies.


I illustrated it in my previous post with TRecord. If you have specific questions about what I'm doing there, you can ask them and I can clarify... but I don't know how else to further explain it.

It is a homework and I upload it on server, which tests it and it is limited.


I find it very hard to believe you are exceeding the limit unless you have memory leaks. I really do think that leaks are your problem here.

How many records is the test running, and what are your memory limits? Do you know?

SIZE is defined size of the array (1000)


Then what is k? And why do you multiply SIZE by k?

What's the difference between k*SIZE and m_count?
@ LowestOne

I didn't see your reply until just now.

Thanks for the correction. But your question has nothing to do with this thread. To prevent this thread from being drailed, please repost your question in a new thread.

If you want me to respond to your new thread, PM me a link to it.
Last edited on
Topic archived. No new replies allowed.