Hacking member accessibility

I'm working on a second version of my serializer. I want to be able to de/serialize some classes that I don't have control over. For example,
class Foo{
    int x, y;
    std::string z;
};

class A : public Serializable{
    Foo foo;
//...
};
The serializer will have full access to A, but not to Foo. However, the serializer will work by processing the sources with libclang as a prebuild step to then generate code. So, since Foo is fully visible, it's fairly easy to generate something like
namespace generated{
struct Foo{
    int x;
    int y;
    std::string z;
};
}

void serialize(const Foo &foo){
    auto &casted = (const generated::Foo &)foo;
    serialize(casted.x);
    serialize(casted.y);
    serialize(casted.z);
}

void deserialize(Foo &foo, Source &src){
    auto &casted = (generated::Foo &)foo;
    deserialize(casted.x, src);
    deserialize(casted.y, src);
    deserialize(casted.z, src);
}

Now, my question is, ignoring the types of members and stuff like polymorphic classes, what sort of things should I look out for in the class to ensure that I can replicate its memory layout in a struct?
> what sort of things should I look out for in the class to ensure that I can
> replicate its memory layout in a struct?

It should be a standard layout class type.
https://en.cppreference.com/w/cpp/language/classes#Standard-layout_class
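
As a rough sketch of how the generated code could guard itself, the checks below fail the build when the original class or its mirror stops being standard-layout, or when their size or alignment diverge. The two Foo types here are simplified stand-ins (a double instead of std::string) and Mixed is a hypothetical counter-example; none of this is the actual generator output, and matching size and alignment is only a necessary condition, not proof of identical layout.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the third-party class: every data member is private.
class Foo {
    int x, y;
    double z;
};

namespace generated {
// Mirror that a (hypothetical) generator would emit: same member
// types in the same order, but publicly accessible.
struct Foo {
    int x;
    int y;
    double z;
};
}

// Counter-example: mixed access control, so not standard-layout.
class Mixed {
public:
    int a;
private:
    int b;
};

static_assert(std::is_standard_layout<Foo>::value,
              "original must be standard-layout");
static_assert(std::is_standard_layout<generated::Foo>::value,
              "mirror must be standard-layout");
static_assert(!std::is_standard_layout<Mixed>::value,
              "mixed access control is not standard-layout");
static_assert(sizeof(Foo) == sizeof(generated::Foo), "size mismatch");
static_assert(alignof(Foo) == alignof(generated::Foo), "alignment mismatch");
```

Emitting such asserts next to each generated mirror would at least turn a silent layout mismatch into a compile error.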
Gotcha. The same-accessibility requirement is kind of annoying, but I guess it'll have to do.
> The same-accessibility requirement is kind of annoying

I read somewhere that access specifiers don't affect memory layout in practice. I'm fairly sure they don't affect layout in GCC or Clang.
I guess it's the usual excessive caution.
By not insisting on lay-out compatibility between non-standard-layout classes, the standard leaves the door open for layout optimisations.
For instance, the lay-out for this non-standard-layout class could be optimised:
struct A { char c1 = 'a' ; double d1 = 0 ; protected: char c2 = 'b' ; double d2 = 1 ; char c3 = 'c' ; } ; // not standard-layout

The specification of standard-layout is quite liberal. It is possible that std::string or std::vector are implemented as standard-layout types.

http://coliru.stacked-crooked.com/a/b5ea98363833be71
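
To make the potential saving concrete, here is a minimal sketch that compares the class as declared with a hand-reordered version (largest members first). HandSorted is a hypothetical name, and this is not necessarily what the linked example does. The exact sizes are implementation-specific; on a typical 64-bit target where double is 8-byte aligned they would be 40 and 24 bytes.

```cpp
#include <cassert>
#include <cstddef>

// The class from the post, in declaration order.
struct AsDeclared {
    char c1 = 'a';
    double d1 = 0;
protected:
    char c2 = 'b';
    double d2 = 1;
    char c3 = 'c';
};

// Same members, reordered by hand so the doubles come first and the
// chars share what would otherwise be padding.
struct HandSorted {
    double d1 = 0;
    double d2 = 1;
    char c1 = 'a';
    char c2 = 'b';
    char c3 = 'c';
};

// Holds on mainstream ABIs; not something the standard guarantees.
static_assert(sizeof(HandSorted) <= sizeof(AsDeclared),
              "reordering should not make the object larger");
```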
But... what sense does it make to make mixed accessibility a sufficient condition for such optimizations? Why optimize this class
struct A{
    char c1 = 'a';
    double d1 = 0;
protected:
    char c2 = 'b';
    double d2 = 1;
    char c3 = 'c';
};
and not this one
struct A{
protected:
    char c1 = 'a';
    double d1 = 0;
    char c2 = 'b';
    double d2 = 1;
    char c3 = 'c';
};
? I can see valid arguments for making both standard-layout, and for making neither standard-layout. Having just one and not the other seems nonsensical and arbitrary.

Plus, don't most classes have either only private or only public data members? Like you said, std::string and std::vector might be implemented like that. Why wouldn't one want their layout to be optimizable? I don't know, it just seems like it leaves a lot of classes on the wrong side of the category for no reason.
> what sense does it make to make mixed accessibility a sufficient condition for such optimizations?

The primary idea behind standard layout is to support interoperability with C (and other programming languages which provide interoperability with C's memory model). Ergo, the same-access-control rule: explicitly disable non-C-compatible optimisations for standard-layout types, but allow them for other types.
Yeah, that's obvious. The question is, why is the requirement that all members must have the same accessibility and not, say, that they must all be public?
> why is the requirement that all members must have the same accessibility and not,
> say, that they must all be public?

It came directly from this C++11 requirement:
Nonstatic data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified.
https://timsong-cpp.github.io/cppwp/n3337/class.mem#14


Note: This relaxed specification offers more flexibility. For instance, the standard specifies that std::mutex, std::condition_variable etc., even if they are opaque types, must be standard layout types.
That doesn't shed any light on the rationale.
JLBorges wrote:
For instance, the lay-out for this non-standard-layout class could be optimised:
struct A { char c1 = 'a' ; double d1 = 0 ; protected: char c2 = 'b' ; double d2 = 1 ; char c3 = 'c' ; } ; // not standard-layout

When you say "optimized", do you mean by the compiler, or by the programmer?

I don't think the above struct can be optimized by the compiler as much as in the link you posted:

JLBorges wrote:
http://coliru.stacked-crooked.com/a/b5ea98363833be71

I think the best you could hope for would be ...

 
struct { char c1; char c2; double d1; double d2; char c3; }

... because the order of the members with the same access control has to be preserved.
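
A small sketch to check that claim: the interleaved ordering keeps c1 before d1 (the public group) and c2 before d2 before c3 (the protected group), and it is no larger than the declaration order. Both structs are written by hand with all-public members so that offsetof is well-defined; the names Declared and Interleaved are made up for illustration. On a typical 64-bit target the sizes would be 40 and 32 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Declaration order of the class under discussion (access specifiers
// dropped so that offsetof can be applied).
struct Declared { char c1; double d1; char c2; double d2; char c3; };

// The ordering suggested above: the two access groups interleaved,
// with the relative order inside each group preserved.
struct Interleaved { char c1; char c2; double d1; double d2; char c3; };

// Former public group: c1 still precedes d1.
static_assert(offsetof(Interleaved, c1) < offsetof(Interleaved, d1),
              "public group order changed");
// Former protected group: c2, d2, c3 still in that order.
static_assert(offsetof(Interleaved, c2) < offsetof(Interleaved, d2) &&
              offsetof(Interleaved, d2) < offsetof(Interleaved, c3),
              "protected group order changed");
// Holds on mainstream ABIs; not something the standard guarantees.
static_assert(sizeof(Interleaved) <= sizeof(Declared),
              "interleaving should not make the object larger");
```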

In practice I don't think any compiler made use of this optimization opportunity because it's necessary to keep a consistent ABI.

In C++23 the compiler will no longer be allowed to do this optimization.

P1847 wrote:
We propose to remove the implementation's license of member reordering in case access control is mixed.
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1847r4.pdf

wg21bot wrote:
Adopted 2021-06.
https://github.com/cplusplus/papers/issues/600

Maybe we will get a way to explicitly request that we want the ordering that gives the smallest size, and other optimizations like packing, in the future. But it wouldn't necessarily always be most efficient. In some rare cases the programmer might actually know what he's doing and carefully order the members to avoid "cache misses" or "false sharing".
I mean, that's just sensible IMO. That rule made no sense.