Here is a problem for which I can't quite see a cleaner solution than a tedious one.
I'm working on a networking library. At times I need to send over data stored in a struct. The struct consists of fixed-size data types as well as dynamically-sized data such as a few strings and a map. Naturally, I would want to make it easy for myself - just pump the block of data in a package and send away.
The problem is of course that the data is dynamic. I can't immediately find out the size of the struct, because its size covers the fixed-size types and a few pointers, but not the data behind those pointers. Also, none of the dynamic data will be located anywhere near the object's pointer, so I need a way to gather all that together in a snugly fit memory block. At the same time, at the receiving end I would only know the size and the data struct. A new object would need to be created into which this information is smoothly pasted.
With a fixed-size object, I wouldn't see any problem - all data is neatly kept together and easily memcpy'd into a package, and later on, into a newly created object. But for dynamic data structures, I would somehow have to take down the dynamic arrays and sequence them, then build them up again at the remote end of the connection.
I can't imagine I'm the first with such a problem, so I assume a specific kind of class has already been created to solve all this. Can anyone give me any pointers? (sic)
What I would do is write a serialize() method that returns a std::string. If this object contains other objects or pointers to objects, they should have a serialize() method too (ideally, the method would take a pointer to a string and write to it, instead of returning a local string).
You will need to write a few things, such as a function that will write a 32-bit integer as a string of bytes. Also some extra info, like the size of the object or string that's coming next, etc.
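To make that concrete, here is a minimal sketch of length-prefixed serialization into a std::string. The Message struct and all function names (write_u32, write_string, serialize and so on) are made up for illustration, and big-endian byte order with 32-bit length prefixes is just one possible choice, not something the thread prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical message type standing in for the struct described above.
struct Message {
    std::uint32_t id;
    std::string name;
    std::map<std::string, std::uint32_t> values;
};

// Append a 32-bit integer as four bytes, most significant byte first.
void write_u32(std::string& out, std::uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<char>((v >> shift) & 0xFF));
}

// Append a string as a 32-bit length followed by its raw bytes.
void write_string(std::string& out, const std::string& s) {
    assert(s.size() <= 0xFFFFFFFFu);  // length must fit in the prefix
    write_u32(out, static_cast<std::uint32_t>(s.size()));
    out.append(s);
}

// Read a 32-bit integer at pos and advance pos; false if the input is too short.
bool read_u32(const std::string& in, std::size_t& pos, std::uint32_t& v) {
    if (pos > in.size() || in.size() - pos < 4) return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v = (v << 8) | static_cast<unsigned char>(in[pos + i]);
    pos += 4;
    return true;
}

// Read a length-prefixed string at pos and advance pos.
bool read_string(const std::string& in, std::size_t& pos, std::string& s) {
    std::uint32_t len = 0;
    if (!read_u32(in, pos, len)) return false;
    if (in.size() - pos < len) return false;
    s.assign(in, pos, len);
    pos += len;
    return true;
}

// Flatten the whole message, dynamic parts included, into one buffer.
void serialize(const Message& m, std::string& out) {
    write_u32(out, m.id);
    write_string(out, m.name);
    write_u32(out, static_cast<std::uint32_t>(m.values.size()));
    for (const auto& kv : m.values) {
        write_string(out, kv.first);
        write_u32(out, kv.second);
    }
}

// Rebuild a message from a buffer; false on truncated or trailing data.
bool deserialize(const std::string& in, Message& m) {
    std::size_t pos = 0;
    std::uint32_t count = 0;
    if (!read_u32(in, pos, m.id)) return false;
    if (!read_string(in, pos, m.name)) return false;
    if (!read_u32(in, pos, count)) return false;
    m.values.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        std::string key;
        std::uint32_t value = 0;
        if (!read_string(in, pos, key) || !read_u32(in, pos, value)) return false;
        m.values[key] = value;
    }
    return pos == in.size();
}
```

The sender calls serialize() and ships the resulting bytes; the receiver hands the received bytes to deserialize() and gets a freshly built object back. Writing integers byte by byte, rather than memcpy'ing them, keeps the format independent of the byte order of either machine.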
There are also some serialization libraries available, if you prefer to use them. I think Boost has one.
Hmm, didn't know about the serialization possibilities within Boost. I've looked at those, but in retrospect I need to firmly enforce the smallest possible size on a message without compression, for a number of differently formatted messages.
So I'll go for my own serialization methods. I'd much rather use a buffer than a string, but you probably suggested a string because it lends itself to easy resizing? Are there any objections to storing non-ASCII characters in a string if the string is not null-terminated?
Think of strings as std::vector<char>s, but with more capabilities. In other words, yes, you can store nuls without problems, and std::string::size() returns the number of characters stored, not the distance to the first nul.
I like to use strings as file buffers for writing because I can then pass them to write() in a single line: file.write(str.c_str(),str.size());
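With a vector you'd have to pass &v[0] (or v.data() since C++11) instead, which works as long as the vector isn't empty but is a bit clumsier.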
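A small sketch of the point about nuls and size(), in case it helps. The function names (make_binary_buffer, write_buffer, demo_embedded_nuls) are invented for this example; only the std::string and std::ostream calls are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Build a four-byte buffer with a nul and a non-ASCII byte in the middle.
std::string make_binary_buffer() {
    std::string buf;
    buf.push_back('A');
    buf.push_back('\0');
    buf.push_back(static_cast<char>(0xFF));
    buf.push_back('B');
    return buf;
}

// Write the whole buffer, embedded nuls included, to any output stream.
void write_buffer(std::ostream& out, const std::string& buf) {
    out.write(buf.data(), static_cast<std::streamsize>(buf.size()));
}

// Checks the two claims: size() counts every byte, strlen() stops at the nul.
void demo_embedded_nuls() {
    const std::string buf = make_binary_buffer();
    assert(buf.size() == 4);
    assert(std::strlen(buf.c_str()) == 1);
}
```

The thing to watch out for is any C-style function that takes only a char pointer: it will stop at the first nul, so always pass the size along with the pointer.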
One smaller concern though - I'm wondering if this resizing - in particular, string growth - means that in the background, there is a continuous process going on of reserving new memory, copying the entire string, then dropping the old one. It seems rather an inefficient approach if I decide to add only one or two bytes to a string in multiple steps.
On a related note, I haven't been able to find the maximum size of a string. Is there one?
string::capacity() returns the number of characters the string can hold without reallocating. You can call string::reserve() to make the string allocate room for at least a given number of characters up front. As for the maximum size, string::max_size() reports it.
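For example, if you know roughly how big a message will get, reserving first means the appends that follow never have to reallocate. This is only a sketch; fill_after_reserve and largest_possible_string are hypothetical helper names, while reserve(), capacity() and max_size() are the standard members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Append n bytes one at a time after reserving room for all of them.
// Returns true if the capacity never changed, i.e. no reallocation happened.
bool fill_after_reserve(std::size_t n) {
    std::string buf;
    buf.reserve(n);                    // capacity() is now at least n
    assert(buf.capacity() >= n);
    const std::size_t cap = buf.capacity();
    for (std::size_t i = 0; i < n; ++i)
        buf.push_back(static_cast<char>(i & 0xFF));
    return buf.size() == n && buf.capacity() == cap;
}

// Upper bound on the length of any std::string on this implementation.
std::size_t largest_possible_string() {
    return std::string().max_size();
}
```

Note that reserve() only changes the capacity, not the size: the string is still empty afterwards, and you add the data with push_back() or append() as usual.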
D'oh! I feel like a noob. I thought I'd read the string documentation.
I know I'm going off-topic in my questions, so if the standard here is to open a new topic, let me know...
Do you happen to know what rules the string class applies when deciding the string capacity? I assume it has to do with the initial size allocation, if one was passed.
The string is initialised with 11 characters but its capacity is reported as 15 in 'a possible output'? This suggests that the capacity may be set to some value which is higher than the length to which it was originally set...
From an efficiency point of view, this makes sense to me - it's too tedious to copy data around on every size change.
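If you want to see what your own implementation does, you can record the capacity while appending one character at a time. A minimal sketch, with capacity_steps as a made-up name; the exact numbers are implementation-defined, so treat whatever it reports as an observation about your compiler's library rather than a rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Append n characters one by one and record every distinct capacity seen.
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::string s;
    std::vector<std::size_t> steps;
    steps.push_back(s.capacity());     // capacity of an empty string
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        assert(s.capacity() >= s.size());
        if (s.capacity() != steps.back())
            steps.push_back(s.capacity());
    }
    return steps;
}
```

On the common standard libraries the capacity grows by a constant factor each time, so a large number of single-character appends causes only a small number of reallocations. The only thing the standard itself promises is that capacity() is never smaller than size().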