Deserialization problem

Hi!

I'm getting a strange error when trying to deserialize an object. The strange thing about it is that serializing/deserializing works perfectly in one application, but another which uses the exact same code (same files even) gets an access violation exception.

I'm sure I've done something wrong, but I can't figure out what. And I'm seriously confused as to how the same code can work in one application while it fails in another.

Here is the function that fails:

void Model::Deserialize(std::ifstream* _pFile)
{
	Parts.clear();
	Name.clear();
	Materials.clear();
	UINT sizebuf = 0;
	_pFile->read((char*)&sizebuf, sizeof(UINT));
	_pFile->read((char*)&Name, sizebuf);
	_pFile->read((char*)&sizebuf, sizeof(UINT));

	for(int i = 0; i<sizebuf; i++)
	{
		UINT forsizebuf = 0;
		CUSTOMMATERIAL* pMaterial = new CUSTOMMATERIAL;
		_pFile->read((char*)&forsizebuf, sizeof(UINT));
		_pFile->read((char*)pMaterial, forsizebuf);
		Materials.push_back(*pMaterial);
	}

	_pFile->read((char*)&sizebuf, sizeof(UINT));

	for(int i = 0; i<sizebuf; i++)
	{
		Part* pPart = new Part;
		Parts.push_back(*pPart->Deserialize(_pFile));
	}
}


And the serialization function:

void Model::Serialize(std::ofstream* _pFile)
{
	UINT sizebuf = (UINT)sizeof(Name);
	_pFile->write((char*)&sizebuf, sizeof(UINT));
	_pFile->write((char*)&Name, sizeof(Name));

	sizebuf = (UINT)Materials.size();
	_pFile->write((char*)&sizebuf, sizeof(UINT));
	for(std::vector<CUSTOMMATERIAL>::iterator itr = Materials.begin(); itr != Materials.end(); ++itr)
	{
		sizebuf = (UINT)sizeof((*itr));
		_pFile->write((char*)&sizebuf, sizeof(UINT));
		_pFile->write((char*)&(*itr), sizebuf);
	}

	sizebuf = (UINT)Parts.size();
	_pFile->write((char*)&sizebuf, sizeof(UINT));
	for(std::vector<Part>::iterator itr = Parts.begin(); itr != Parts.end(); ++itr)
	{
		(*itr).Serialize(_pFile);
	}
}


To be more specific, this is the part where the exception gets thrown:

_pFile->read((char*)&sizebuf, sizeof(UINT));

for(int i = 0; i<sizebuf; i++)
	{
		UINT forsizebuf = 0;
		CUSTOMMATERIAL* pMaterial = new CUSTOMMATERIAL;
		_pFile->read((char*)&forsizebuf, sizeof(UINT));
		_pFile->read((char*)pMaterial, forsizebuf);
		Materials.push_back(*pMaterial);                   // Exception
	}


This piece of code first reads from the file how many materials there are, and then loops that many times, creating a new material each loop and filling it with the data from the file. CUSTOMMATERIAL is a struct containing a few strings, a couple of ints, and a Direct3D material structure, which in turn contains only floats.

This is my first attempt at serialization. I thought it was going well until I tried deserializing into another application. I've gone over the code of both applications several times, and I've been unable to find any differences between the apps that would cause this piece of code to crash in only one of them.

Any help would be appreciated.
1. Name looks suspicious - line 5 of the deserializer (Name.clear();) implies that it's a std::string, but line 9 (_pFile->read((char*)&Name, sizebuf);) treats it like a char[] - if it really is a std::string, you need to read() the characters into a character array first and then copy them into Name - otherwise the read overwrites the string object's internals and the std::string has no idea how much space to allocate - it might be running past that section on luck (or bad luck, depending on how you look at it) at the moment

2. you are right about that vector block - it's a mess on both sides
2a. first note that a vector's elements are stored in one chunk of contiguous memory - this is guaranteed by the standard (since C++03) and was what compilers did even before that - that means you can write the number of elements, as you have done, and then write the entire block of the array instead, doing something like this:
sizebuf = (UINT)sizeof(Materials[0]) * Materials.size();
_pFile->write((char*)&sizebuf, sizeof(UINT));
_pFile->write((char*)&(Materials[0]), sizebuf);

in other words, you do not need to iterate with a loop - of course, this assumes that the elements of Materials are plain structs without pointers or strings in them!

2b. now, reading it out into a vector will be a little tricky, but also doable - think about it a little before trying it out - you want to think of it in terms of a Materials[] (contiguous array) first and then adapt it to a vector

3. same applies to your Parts vector

4. I highly recommend that you comment out most of your serialize and deserialize code and test it a little bit at a time, adding bits at a time simultaneously on both ends - that's almost the only way you can ensure that all the parts work as you expect
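also, please read the documentation on read() carefully - a low-level read (POSIX read(), for example) may not deliver all the bytes at once

in other words, say you want to read 100 bytes - such a read may first return only 23 bytes - you would then have to call it additional times (carefully, with the right arguments) until the entire 100 bytes are read - std::ifstream::read instead keeps going until it has the full count or hits the end of the file, and in the short case it sets failbit and gcount() reports how many bytes arrived, so check the stream after every read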
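
A minimal sketch of checking a read, assuming a hypothetical helper named ReadExact (it is not part of the original code) and using an in-memory stream in place of the file:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original code): read exactly `count`
// bytes into `dst`, or return false if the stream ends or fails first.
bool ReadExact(std::istream& in, void* dst, std::size_t count)
{
    in.read(static_cast<char*>(dst), static_cast<std::streamsize>(count));
    // A short read sets failbit; gcount() reports the bytes actually read.
    return !in.fail() && static_cast<std::size_t>(in.gcount()) == count;
}

// Small self-check using a string stream instead of a real file.
void ReadExactExample()
{
    std::istringstream in(std::string("abcd"), std::ios::binary);
    char buf[4] = {};
    assert(ReadExact(in, buf, sizeof(buf)));   // all 4 bytes are available
    assert(buf[0] == 'a' && buf[3] == 'd');
    char extra = 0;
    assert(!ReadExact(in, &extra, 1));         // nothing left: reported as failure
}
```

Returning a bool from every read makes it possible to stop deserializing at the first short read instead of carrying on with a half-filled buffer.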
Thanks for the tips! I think I can see why it went wrong now. I've obviously handled the strings wrong. Is there not a way to allocate space in a string, though? (just curious, using a buffer works too)

2a is a great tip, but I suppose I'll have to do it manually anyway since the Material struct does contain strings. The Parts class contains strings as well and also other objects, so I'll have to treat that manually too.

I suppose I need to read up a bit on memory management so that I stop making these mistakes.
np - the trick is to be certain that you understand how the compiler computes sizeof() for your objects (in fact, if you understand this thoroughly, not only will serialization be easier to do, but you will also understand how to properly use forward declarations in header files to decrease the number of header dependencies, but that's a different story...)
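
As a small illustration of the sizeof() point, assuming Name is a std::string (the thread suggests this but never confirms it): sizeof measures the string object itself, not the text it holds, which is why writing sizeof(Name) bytes does not store the name. The function name below is made up for the example.

```cpp
#include <cassert>
#include <string>

// sizeof is a compile-time property of the type: it covers the string
// object's own bookkeeping (pointer, length, capacity and so on), never
// the length of the text the string currently holds.
void SizeofExample()
{
    std::string shortName = "cube";
    std::string longName(10000, 'x');   // 10000 characters of text

    assert(sizeof(shortName) == sizeof(longName));
    assert(sizeof(longName) == sizeof(std::string));
    assert(longName.size() == 10000);   // size() is the number to serialize
}
```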

in general, the fastest way to stream objects is to have fixed size objects - since you control both the serializer and the deserializer, you can be as creative as you like in how you manage to do this
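
One way to get a fixed-size object is to replace the strings with fixed-capacity character arrays. The sketch below assumes a hypothetical MaterialRecord struct (its fields are invented for illustration and are not the real CUSTOMMATERIAL):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical fixed-size record: no std::string and no pointers, so its
// bytes can be written and read back directly.
struct MaterialRecord
{
    char         name[32];      // fixed-capacity, NUL-terminated name
    std::int32_t shaderId;
    float        diffuse[4];
};
static_assert(std::is_trivially_copyable<MaterialRecord>::value,
              "raw byte I/O is only valid for trivially copyable types");

void WriteRecord(std::ostream& out, const MaterialRecord& r)
{
    out.write(reinterpret_cast<const char*>(&r), sizeof(r));
}

bool ReadRecord(std::istream& in, MaterialRecord& r)
{
    in.read(reinterpret_cast<char*>(&r), sizeof(r));
    return !in.fail();
}

void FixedSizeExample()
{
    MaterialRecord a = {};
    std::strncpy(a.name, "brushed_steel", sizeof(a.name) - 1);
    a.shaderId = 7;
    a.diffuse[0] = 0.5f;

    // An in-memory stream stands in for the file here.
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    WriteRecord(file, a);

    MaterialRecord b = {};
    assert(ReadRecord(file, b));
    assert(std::strcmp(b.name, "brushed_steel") == 0);
    assert(b.shaderId == 7 && b.diffuse[0] == 0.5f);
}
```

The byte layout of such a struct depends on the compiler's padding and the machine's byte order, so this only round-trips safely between programs built with compatible settings.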

however, keep in mind that the more you "chunk" your serialization, the faster it will run - 2a. is an example of chunking
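
A sketch of chunking in both directions, assuming a hypothetical plain-data Vertex struct as the element type (it would not work for elements containing strings). The count goes out first, then the whole vector as one block; reading reverses that by resizing the vector and filling its storage with a single read.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Hypothetical plain-data element; stands in for a struct with no strings.
struct Vertex { float x, y, z; };
static_assert(std::is_trivially_copyable<Vertex>::value,
              "block I/O is only valid for trivially copyable elements");

// Write the element count, then all elements as one contiguous block.
void WriteBlock(std::ostream& out, const std::vector<Vertex>& v)
{
    const std::uint32_t count = static_cast<std::uint32_t>(v.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (count != 0)
        out.write(reinterpret_cast<const char*>(v.data()),
                  static_cast<std::streamsize>(count * sizeof(Vertex)));
}

// Read the count, size the vector, then fill its storage in one read.
// Note: a real loader should sanity-check `count` before resizing.
bool ReadBlock(std::istream& in, std::vector<Vertex>& v)
{
    std::uint32_t count = 0;
    in.read(reinterpret_cast<char*>(&count), sizeof(count));
    if (in.fail()) return false;
    v.resize(count);
    if (count != 0)
        in.read(reinterpret_cast<char*>(v.data()),
                static_cast<std::streamsize>(count * sizeof(Vertex)));
    return !in.fail();
}

void BlockExample()
{
    std::vector<Vertex> src = { {1.0f, 2.0f, 3.0f}, {4.0f, 5.0f, 6.0f} };
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    WriteBlock(file, src);

    std::vector<Vertex> dst;
    assert(ReadBlock(file, dst));
    assert(dst.size() == 2 && dst[1].z == 6.0f);
}
```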

variable length strings are one of the things you need to watch out for - one trick is to write each one out in two parts - the length of the string (n) and then n bytes (the chunk of characters) - reading it in will be a little inefficient - read it into a local variable char buff[256]; (or whatever length you know your maximum string will be) and then copy the results into your string data member
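
A sketch of the two-part scheme, with one substitution worth naming: instead of a fixed char buff[256], it uses std::string::resize to make the string allocate n characters and then reads straight into it, which also answers the earlier question about allocating space in a string. WriteString and ReadString are hypothetical names, not functions from the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write the length (n), then n bytes of text.
void WriteString(std::ostream& out, const std::string& s)
{
    const std::uint32_t n = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(s.data(), n);
}

// Read the length, let the string allocate n characters, then read into it.
bool ReadString(std::istream& in, std::string& s)
{
    std::uint32_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof(n));
    if (in.fail()) return false;
    s.resize(n);                 // this is the "allocate space" step
    if (n != 0)
        in.read(&s[0], n);       // &s[0] is writable, contiguous storage (C++11)
    return !in.fail();
}

void StringExample()
{
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    WriteString(file, "Teapot");
    WriteString(file, "");

    std::string first, second = "stale";
    assert(ReadString(file, first) && first == "Teapot");
    assert(ReadString(file, second) && second.empty());
}
```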

alternatively, you can dynamically allocate a char* buffer once you have deserialized n - you could compare n against the size of your current buffer and reallocate a larger one if n is bigger - just remember to delete[] this memory when you are done - either way, you will have to copy the contents of this buffer into your string data member