I want to try to write my own custom string class (for learning purposes) and I've been doing some research on it. Some std::string implementations use (or used to use) the Copy-On-Write idiom and I was thinking of doing the same, but I also read that it's not so great when multiple threads have to access the same instance. Should I stay away from CoW?
For my class, my idea is this...
Each String will contain:
- A Node* which contains the string's data.
- A String* that comes after it in the sequence.
- A String* that is the last String in the sequence.
Each Node contains:
- A char* which is the data.
- An int which keeps track of how many times that data is referenced.
So each node is managed by the string. A node is deleted only when its reference count reaches 0. When two strings are added together, rather than creating a new string and copying the data into it, the result simply links the two strings with a pointer. Only when the string is cast to a char* is all the data copied. My second thought was to use a linked list of single chars, but that seemed kind of wasteful...
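To make the idea concrete, here is a rough sketch of the layout described above. It is only an illustration under some assumptions: the member names (node_, next_, c_str_copy, use_count) are made up, the "last String" pointer is left out (the tail is found by walking the links), appending links a copy of the right-hand chain rather than the original String object, and nothing here is thread-safe.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Shared, reference-counted character buffer (single-threaded only).
struct Node {
    char* data;
    std::size_t len;
    int refs;

    explicit Node(const char* s)
        : data(new char[std::strlen(s) + 1]), len(std::strlen(s)), refs(1) {
        std::memcpy(data, s, len + 1);
    }
    ~Node() { delete[] data; }
};

class String {
public:
    String(const char* s = "") : node_(new Node(s)), next_(nullptr) {}

    // Copying shares the Node and duplicates only the small link objects.
    String(const String& other)
        : node_(other.node_),
          next_(other.next_ ? new String(*other.next_) : nullptr) {
        ++node_->refs;
    }

    String& operator=(String other) {  // copy-and-swap
        std::swap(node_, other.node_);
        std::swap(next_, other.next_);
        return *this;
    }

    ~String() {
        delete next_;  // recursive; fine for a sketch, risky for long chains
        if (--node_->refs == 0) delete node_;
    }

    // Concatenation links a copy of rhs's chain; no characters are copied.
    String& operator+=(const String& rhs) {
        String* tail = this;
        while (tail->next_) tail = tail->next_;
        tail->next_ = new String(rhs);
        return *this;
    }

    std::size_t size() const {
        std::size_t n = 0;
        for (const String* s = this; s; s = s->next_) n += s->node_->len;
        return n;
    }

    // Flattening is the only place characters are copied.
    // The caller owns the returned buffer and must delete[] it.
    char* c_str_copy() const {
        char* out = new char[size() + 1];
        char* p = out;
        for (const String* s = this; s; s = s->next_) {
            std::memcpy(p, s->node_->data, s->node_->len);
            p += s->node_->len;
        }
        *p = '\0';
        return out;
    }

    int use_count() const { return node_->refs; }

private:
    Node* node_;    // shared data for this piece
    String* next_;  // owned link to the next piece, or nullptr
};
```

With this layout, a += b only bumps a reference count and allocates a link, while something like indexing or searching has to walk the chain.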
If it is for learning purposes only, I wouldn't worry about thread safety -- just worry about the single threaded case.
That is a fairly complicated memory management scheme. While it makes concatenation simple, I think you will find that it complicates every other string manipulation function. You can optimize away some of the overhead by using expression templates to store intermediate results.
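As a rough illustration of the expression-template idea, here is a minimal sketch. The names Str and Concat are hypothetical, and Str wraps std::string purely to keep the example short; in your own class the buffer would be whatever storage you choose. The point is that a + b + c builds a lightweight expression object, and the characters are copied once, into a buffer of the final size, when the result is assigned.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Lightweight node describing "lhs followed by rhs"; holds references only.
template <class L, class R>
struct Concat {
    const L& lhs;
    const R& rhs;

    std::size_t size() const { return lhs.size() + rhs.size(); }

    // Writes both halves into out and returns the position after them.
    char* write(char* out) const { return rhs.write(lhs.write(out)); }
};

// Hypothetical string type; std::string is used as storage for brevity.
class Str {
public:
    Str(const char* s) : buf_(s) {}

    // Evaluating an expression: one allocation, one pass over the pieces.
    template <class L, class R>
    Str(const Concat<L, R>& e) {
        buf_.resize(e.size());
        e.write(&buf_[0]);
    }

    std::size_t size() const { return buf_.size(); }

    char* write(char* out) const {
        std::memcpy(out, buf_.data(), buf_.size());
        return out + buf_.size();
    }

    const char* c_str() const { return buf_.c_str(); }

private:
    std::string buf_;
};

// a + b builds an expression node instead of a new string.
inline Concat<Str, Str> operator+(const Str& a, const Str& b) {
    return {a, b};
}

// (a + b) + c nests the previous expression on the left.
template <class L, class R>
Concat<Concat<L, R>, Str> operator+(const Concat<L, R>& a, const Str& b) {
    return {a, b};
}
```

Usage would look like Str d = a + b + c;. One caveat: the expression object only holds references, so it must be consumed in the same full expression. Writing auto e = a + b + c; and using e later would leave it pointing at destroyed temporaries.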