is it safe to use memcpy to serialise/deserialise PODs?

Hi

I need to serialise and deserialise some POD structs. I was planning to use something like the following:

#include <iostream>
#include <cstring>

using namespace std;

struct settings
{
   char name[240];
   int age;
   bool isMale;
};

// Copies the raw bytes of toSerialise (padding included) into writeBuffer,
// which must hold at least sizeof(settings) bytes.
void serialise(const settings& toSerialise, char* writeBuffer)
{
    memcpy(writeBuffer, &toSerialise, sizeof(toSerialise));
}

// Overwrites toDeserialise with sizeof(settings) bytes read from serialised.
void deserialise(const char* serialised, settings& toDeserialise)
{
    memcpy(&toDeserialise, serialised, sizeof(toDeserialise));
}

int main()
{
    settings original{"bob", 92, true};
    char serialised[sizeof(original)];
    serialise(original, serialised);

    settings copy;
    deserialise(serialised, copy);

    cout << copy.name << endl;
    cout << copy.age << endl;
    cout << copy.isMale << endl;

    return 0;
}


Is this safe? I've seen conflicting comments online: some say it's OK, others say it's unsafe because the padding may differ between compilers and architectures.

Thanks
> is it safe to use memcpy to serialise/deserialise PODs?

Yes, provided the serialisation and deserialisation are done on the same machine with the same program (or with two programs built with the same build tools and identical compiler options).
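
To make that condition concrete, here is a minimal sketch of the same round trip with the assumption checked at compile time. The struct is the one from the question; paddingBytes() is a hypothetical helper added only to show that part of the copied image is padding whose size and position are chosen by the compiler, which is why the bytes are only meaningful to a program built the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct settings
{
    char name[240];
    int age;
    bool isMale;
};

// memcpy-ing an object is only well-defined for trivially copyable types.
static_assert(std::is_trivially_copyable<settings>::value,
              "settings must be trivially copyable to be copied with memcpy");

// Copies the object representation, padding bytes included, into buffer.
// buffer must have room for at least sizeof(settings) bytes.
void serialise(const settings& s, unsigned char* buffer)
{
    assert(buffer != nullptr);
    std::memcpy(buffer, &s, sizeof s);
}

// Rebuilds a settings object from bytes produced by serialise() in a
// program built with the same compiler, options and target.
void deserialise(const unsigned char* buffer, settings& s)
{
    assert(buffer != nullptr);
    std::memcpy(&s, buffer, sizeof s);
}

// Hypothetical helper: bytes of settings that belong to no member.
// The result is implementation-specific and may differ between builds.
constexpr std::size_t paddingBytes()
{
    return sizeof(settings)
         - (sizeof(settings::name) + sizeof(settings::age) + sizeof(settings::isMale));
}
```

If paddingBytes(), sizeof(int) or the byte order differs on the reading side, the same buffer decodes to something else, so this only covers the same-build case described above.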

In general, strongly favour using a text representation for the data.
The Importance of Being Textual: http://www.catb.org/~esr/writings/taoup/html/ch05s01.html
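
As a rough illustration of what a text representation could look like for the struct in the question, the sketch below writes one field per line. The format and the function names toText/fromText are my own assumptions, not anything prescribed by the linked article, and it assumes the name contains no newline.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct settings
{
    char name[240];
    int age;
    bool isMale;
};

// Writes one field per line: name, age, then 1 or 0 for isMale.
std::string toText(const settings& s)
{
    // A name containing a newline would break this simple format.
    assert(std::string(s.name).find('\n') == std::string::npos);
    std::ostringstream out;
    out << s.name << '\n' << s.age << '\n' << (s.isMale ? 1 : 0) << '\n';
    return out.str();
}

// Parses text produced by toText(). Returns false, leaving s untouched,
// if the text is malformed or the name does not fit in settings::name.
bool fromText(const std::string& text, settings& s)
{
    std::istringstream in(text);
    std::string name;
    int age = 0;
    int male = 0;
    if (!std::getline(in, name) || !(in >> age >> male))
        return false;
    if (name.size() >= sizeof s.name || (male != 0 && male != 1))
        return false;
    name.copy(s.name, name.size());   // std::string::copy adds no terminator
    s.name[name.size()] = '\0';
    s.age = age;
    s.isMale = (male == 1);
    return true;
}
```

The output does not depend on padding, sizeof(int) or byte order, and it can be read and edited by hand; the cost is the conversion work on every read and write.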
There are multiple copy tools in <algorithm> (std::copy and std::copy_n, for example) that do nearly the same thing. I do not know if they are always safe, though; you can mess up anything if you play at the byte level.
You can go in reverse too: allocate an array of bytes and place your object's fields at offsets into it, so the data is always in serialised form from the bottom up and the alignment of the objects is under your direct control.
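
One way to read the "array of bytes plus offsets" idea is sketched below. The class name SettingsRecord and the offsets are hypothetical choices for illustration; the point is that the layout is fixed by hand, so no compiler padding is involved. The age is still stored in the machine's native byte order, which this sketch does not address.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hand-chosen layout (an assumption for this sketch): no padding anywhere.
constexpr std::size_t nameOffset = 0;    // 240 bytes, null-terminated
constexpr std::size_t nameSize   = 240;
constexpr std::size_t ageOffset  = 240;  // 4 bytes, native byte order
constexpr std::size_t maleOffset = 244;  // 1 byte, 0 or 1
constexpr std::size_t recordSize = 245;

// The object is its own serialised form: every field lives in bytes_.
class SettingsRecord
{
public:
    void setName(const char* name)
    {
        const std::size_t len = std::strlen(name);
        assert(len < nameSize);
        // std::copy_n from <algorithm>; copies the terminator as well.
        std::copy_n(name, len + 1, bytes_.begin() + nameOffset);
    }
    const char* name() const
    {
        return reinterpret_cast<const char*>(bytes_.data() + nameOffset);
    }

    // memcpy avoids any alignment requirement at the chosen offset.
    void setAge(std::int32_t age)
    {
        std::memcpy(bytes_.data() + ageOffset, &age, sizeof age);
    }
    std::int32_t age() const
    {
        std::int32_t value;
        std::memcpy(&value, bytes_.data() + ageOffset, sizeof value);
        return value;
    }

    void setMale(bool male) { bytes_[maleOffset] = male ? 1 : 0; }
    bool isMale() const { return bytes_[maleOffset] != 0; }

    // The bytes to write to a file or socket; recordSize of them.
    const unsigned char* data() const { return bytes_.data(); }

private:
    std::array<unsigned char, recordSize> bytes_{};
};
```

Writing the record out is then just a matter of writing recordSize bytes from data(), with no separate serialisation step.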

Binary is far superior to text in performance and space. Repeatedly converting numeric values to text and back is extremely slow, and when done in bulk it kills large systems. Been there, done that: databases and the tools that work with them can spend 2-3 days doing exactly this just to copy data from one system to another, even on relatively small systems (we only had about 100 million records).

If you do not care about performance and space, go text, but it is in the top 5 things that make software so darn slow on the supercomputers we call desktops these days.

Another example: I found some SHA code in C++ online and more than tripled its speed by getting rid of the number-to-text conversion (in this case, hex text for the final output) in favor of a lookup table from the numeric value to the text value.
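
The code in question is not shown, so the following is only a guess at the general shape of such a change: a hex encoder that indexes a small table of digit characters instead of going through stream or printf-style formatting for each byte. The function name toHex and the 16-entry table are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts count bytes to lowercase hex text using a 16-entry lookup table.
std::string toHex(const unsigned char* bytes, std::size_t count)
{
    static const char digits[] = "0123456789abcdef";
    assert(bytes != nullptr || count == 0);

    std::string out;
    out.reserve(count * 2);                      // two hex digits per byte
    for (std::size_t i = 0; i < count; ++i)
    {
        out.push_back(digits[bytes[i] >> 4]);    // high nibble
        out.push_back(digits[bytes[i] & 0x0F]);  // low nibble
    }
    return out;
}
```

A 256-entry table of two-character strings would be another variant of the same idea; whether either gives a threefold speed-up depends on what the original code was doing.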
> Binary is far superior to text in performance and space.
> if you do not care about performance and space, go text.

A performance requirement is a design constraint;
in the vast majority of cases, it is a gross blunder to treat performance as the overriding design goal.