I'm working on a second version of my serializer. I want to be able to de/serialize some classes that I don't have control over. For example,
class Foo{
int x, y;
std::string z;
};
class A : public Serializable{
Foo foo;
//...
};
The serializer will have full access to A, but not to Foo. However, the serializer will work by processing the sources with libclang as a prebuild step to then generate code. So, since Foo is fully visible, it's fairly easy to generate something like
namespace generated{
struct Foo{
int x;
int y;
std::string z;
};
}
void serialize(const Foo &foo){
auto &casted = (const generated::Foo &)foo;
serialize(casted.x);
serialize(casted.y);
serialize(casted.z);
}
void deserialize(Foo &foo, Source &src){
auto &casted = (generated::Foo &)foo;
deserialize(casted.x, src);
deserialize(casted.y, src);
deserialize(casted.z, src);
}
Now, my question is, ignoring the types of members and stuff like polymorphic classes, what sort of things should I look out for in the class to ensure that I can replicate its memory layout in a struct?
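As a rough sketch of what the generator could emit next to each mirror struct, here are some compile-time sanity checks. The helper names `same_size_and_alignment` and `both_standard_layout` are made up for illustration. The checks are necessary conditions only: matching size and alignment does not prove that the members sit at the same offsets, and it does not make the cast between unrelated types well-defined.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Stand-in for the third-party class from the question (all members private).
class Foo {
    int x, y;
    std::string z;
};

namespace generated {
// Hypothetical mirror that the code generator would emit.
struct Foo {
    int x;
    int y;
    std::string z;
};
} // namespace generated

// Hypothetical helper: a mismatch here means the mirror is definitely wrong.
template <class Original, class Mirror>
constexpr bool same_size_and_alignment() {
    return sizeof(Original) == sizeof(Mirror)
        && alignof(Original) == alignof(Mirror);
}

// Hypothetical helper: only when both types are standard-layout does the
// standard pin down the member order and forbid padding before the first member.
template <class Original, class Mirror>
constexpr bool both_standard_layout() {
    return std::is_standard_layout<Original>::value
        && std::is_standard_layout<Mirror>::value;
}

// Fail the build early if the mirror is obviously off.
static_assert(same_size_and_alignment<Foo, generated::Foo>(),
              "generated::Foo does not match Foo in size/alignment");
```

Whether `both_standard_layout<Foo, generated::Foo>()` holds depends on whether the library's `std::string` is itself standard-layout, so the generator would probably want to report that case rather than assume it.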
By not insisting on lay-out compatibility between non-standard-layout classes, the standard leaves the door open for layout optimisations.
For instance, the layout of this non-standard-layout class could be optimised:
struct A {
char c1 = 'a';
double d1 = 0;
protected:
char c2 = 'b';
double d2 = 1;
char c3 = 'c';
}; // not standard-layout
The specification of standard-layout is quite liberal. It is possible that std::string or std::vector are implemented as standard-layout types.
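One way to check this for a particular compiler and library is to ask `std::is_standard_layout` directly. The sketch below uses the class `A` from the example above plus a hypothetical all-public twin `B`; the two `constexpr` variables for `std::string` and `std::vector` are implementation-dependent, so no particular value is assumed for them.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// The class from the example: data members with mixed access control.
struct A {
    char c1 = 'a';
    double d1 = 0;
protected:
    char c2 = 'b';
    double d2 = 1;
    char c3 = 'c';
};

// Hypothetical twin with the same members, all public.
struct B {
    char c1 = 'a';
    double d1 = 0;
    char c2 = 'b';
    double d2 = 1;
    char c3 = 'c';
};

static_assert(!std::is_standard_layout<A>::value,
              "mixed access control rules out standard-layout");
static_assert(std::is_standard_layout<B>::value,
              "uniform access control, no bases, no virtuals");

// The answers for library types depend on the standard library in use.
constexpr bool string_is_standard_layout =
    std::is_standard_layout<std::string>::value;
constexpr bool vector_is_standard_layout =
    std::is_standard_layout<std::vector<int>>::value;
```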
I can see valid arguments for making both standard-layout, and for making neither standard-layout. Having just one and not the other seems nonsensical and arbitrary.
Plus, don't most classes have either only private or only public data members? Like you said, std::string and std::vector might be implemented like that. Why wouldn't one want their layout to be optimizable? I don't know, it just seems like it leaves a lot of classes on the wrong side of the category for no reason.
> what sense does it make to make mixed accessibility a sufficient condition for such optimizations?
The primary idea behind standard layout is to support interoperability with C (and other programming languages which provide interoperability with C's memory model). Ergo, the same-access-control rule: explicitly disable non-C-compatible optimisations for standard-layout types, but allow them for other types.
Yeah, that's obvious. The question is, why is the requirement that all members must have the same accessibility and not, say, that they must all be public?
> why is the requirement that all members must have the same accessibility and not,
> say, that they must all be public?
It came directly from this C++11 requirement:
Nonstatic data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified. https://timsong-cpp.github.io/cppwp/n3337/class.mem#14
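To see the ordering guarantee in action, here is a small sketch with a hypothetical struct `SameAccess` whose members all share one access level. Because it is standard-layout, `offsetof` can be used on it, and the offsets must increase in declaration order (the exact values still depend on the ABI's padding).

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical example type: every data member has the same access control.
struct SameAccess {
    char c1;
    double d1;
    char c2;
    double d2;
    char c3;
};

static_assert(std::is_standard_layout<SameAccess>::value,
              "offsetof is only unconditionally supported for standard-layout types");

// Later members must have higher addresses than earlier ones.
inline bool declaration_order_preserved() {
    return offsetof(SameAccess, c1) < offsetof(SameAccess, d1)
        && offsetof(SameAccess, d1) < offsetof(SameAccess, c2)
        && offsetof(SameAccess, c2) < offsetof(SameAccess, d2)
        && offsetof(SameAccess, d2) < offsetof(SameAccess, c3);
}
```

With mixed access control the same comparison between members of different access levels is not guaranteed by the quoted wording, although compilers in practice tend to keep declaration order anyway.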
Note: This relaxed specification offers more flexibility. For instance, the standard specifies that std::mutex, std::condition_variable etc., even if they are opaque types, must be standard layout types.
Maybe we will get a way to explicitly request that we want the ordering that gives the smallest size, and other optimizations like packing, in the future. But it wouldn't necessarily always be most efficient. In some rare cases the programmer might actually know what he's doing and carefully order the members to avoid "cache misses" or "false sharing".
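A small sketch of both sides of that trade-off, using made-up types. `LargestFirst` is what a size-minimising reordering of `DeclaredOrder` might look like, and `PaddedCounter` shows the opposite case where the programmer deliberately spends space to keep two hot members on separate cache lines. The 64-byte line size is an assumption, not something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>

// Members in the order a programmer might naturally write them.
struct DeclaredOrder {
    char c1;
    double d1;
    char c2;
    double d2;
    char c3;
};

// Same members, sorted by alignment so less padding is needed.
struct LargestFirst {
    double d1;
    double d2;
    char c1;
    char c2;
    char c3;
};

// Assumed cache-line size; real hardware may differ.
constexpr std::size_t kAssumedCacheLine = 64;

// Over-aligned on purpose so that neighbouring counters never share a line.
struct alignas(kAssumedCacheLine) PaddedCounter {
    long value = 0;
};

// Two counters that different threads might update independently.
struct Counters {
    PaddedCounter produced;
    PaddedCounter consumed;
};
```

Comparing `sizeof(DeclaredOrder)` with `sizeof(LargestFirst)` shows how much padding the declaration order costs on a given ABI, while `Counters` is intentionally much larger than two `long`s.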