float operator[](int i) const {
    if(i >= 0 && i <= 2)
        return (&x)[i];
    //... some error handler
}
So basically it's part of a vector class whose members you are supposed to access with v[i]. The question is: what is this return (&x)[i]? I understand it works for x, but how can you access y and z this way?
PS: this is legit code from the pbrt project, as far as I know.
If the object representation of Vector has the same layout as an array of 3 float (this is the usual case: members x, y and z are allocated without any padding between them), we can access it as if it were an array of 3 float.
Some compilers may generate tighter code for indexed array element access (with the construct (&x)[i]).
For instance, with g++ 6.1, x86_64, -O3:
(&x)[i]:
float Vector::operator[]( int i ) const
{
    // confirm that the object layout allows accessing x, y, z as if they were in an array of three floats
    static_assert( offsetof( Vector, y ) == sizeof(float) && offsetof( Vector, z ) == sizeof(float) * 2, "unexpected layout" ) ;
    if( i >= 0 && i <= 2 ) return (&x)[i] ;
    else return 0 ; // else ... some error handler
    /*
        cmpl $2, %edx
        ja .L4
        movslq %edx, %rdx
        movss (%rcx,%rdx,4), %xmm0
        ret
    .L4:
        pxor %xmm0, %xmm0
        ret
    */
}
float Vector::operator[]( int i ) const
{
    switch(i)
    {
        case 0 : return x ;
        case 1 : return y ;
        case 2 : return z ;
        default: return 0 ; // else ... some error handler
    }
    /*
        cmpl $1, %edx
        je .L8
        cmpl $2, %edx
        je .L9
        testl %edx, %edx
        je .L13
        pxor %xmm0, %xmm0
        ret
    .L13:
        movss (%rcx), %xmm0
        ret
    .L9:
        movss 8(%rcx), %xmm0
        ret
    .L8:
        movss 4(%rcx), %xmm0
        ret
    */
}
I wasn't aware it's a valid hack. Can I safely assume that, as long as you define the variables in a class or struct in a way that reads like Type a, b, c;, they will always be allocated contiguously?
I did a test outside a class and, obviously, it doesn't work that way. Can someone elaborate a bit more? Is there any official document on this?
Or is it just some compiler-specific optimization? I've been away from C/C++ for a while, but I don't recall being able to do anything like the examples above, and it seems quite popular because other projects do it as well.
> can I safely assume that as long as in a class or struct, you define variables in such way that reads like Type a, b, c;
> they will always be allocated contiguously?
AFAIK, that is not guaranteed by the standard.
In practice, though, for a standard-layout type, adjacently declared member objects with the same access control and the same natural alignment are allocated immediately after one another, in order of declaration.
The static_assert in the snippet posted earlier asserts that this assumption is true.
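One way to make that assumption explicit, as a minimal sketch: Vec3 below is a hypothetical stand-in for the Vector class, and component and check_component are names made up for illustration (this is not pbrt's code). The static_asserts reject any layout that has padding between the members, and the accessor copies the i-th float out of the object's bytes with memcpy, so it never indexes past x.

```cpp
#include <cassert>      // assert
#include <cstddef>      // offsetof
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for the Vector class discussed above.
struct Vec3 {
    float x, y, z;
};

// Compilation fails if the layout is not "three floats in a row".
static_assert(std::is_standard_layout<Vec3>::value, "offsetof needs a standard-layout type");
static_assert(offsetof(Vec3, x) == 0, "x is expected first");
static_assert(offsetof(Vec3, y) == sizeof(float), "unexpected padding before y");
static_assert(offsetof(Vec3, z) == 2 * sizeof(float), "unexpected padding before z");

// Reads component i by copying it out of the object's bytes.
// Returns 0 for an out-of-range index, mirroring the snippet above.
inline float component(const Vec3& v, int i) {
    if (i < 0 || i > 2) return 0.0f;
    float out;
    std::memcpy(&out,
                reinterpret_cast<const unsigned char*>(&v) + i * sizeof(float),
                sizeof(float));
    return out;
}

// Small self-check; call it from main() if desired.
inline void check_component() {
    const Vec3 v{1.0f, 2.0f, 3.0f};
    assert(component(v, 1) == 2.0f);
}
```

If a compiler ever inserted padding, the build would stop at the static_assert instead of silently reading the wrong bytes.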
I was just casually reading some code and didn't really intend to test it; I was simply puzzled by that line.
Putting the Vector code aside, I now can't understand why this simple test prints such peculiar output. Basically it just swaps y and z. Why is that?
int x, y, z;
//int x = 1, y = 2, z = 3;
x = 1, y = 2, z = 3;
printf("x = %d, y = %d, z = %d\n", x, y, z);
printf("x = %d, y = %d, z = %d\n", (&x)[0], (&x)[1], (&x)[2]);
it prints
x = 1, y = 2, z = 3
x = 1, y = 3, z = 2
Of course no one writes that, but why are y and z swapped?
So, back to the original post: for the Vector example, can I say it is safe to do this across compilers (define several members of the same type all at once and assume they are allocated consecutively)? Is it just that the code is not very readable that way, but nothing is wrong with it?
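On the local-variable test: x, y and z there are three unrelated objects, so the compiler may place them on the stack in any order (or keep them in registers), and (&x)[1] reads past a single int, which the language leaves undefined. That is why the output can show y and z swapped. A minimal sketch of the same test with an actual array, where indexing is well defined (nth and print_xyz are names made up for illustration):

```cpp
#include <cassert>
#include <cstdio>

// Returns element i of a three-element array; i must be 0, 1 or 2.
inline int nth(const int (&v)[3], int i) {
    assert(i >= 0 && i <= 2);
    return v[i];
}

// Prints the three elements the same way the test above does.
inline void print_xyz() {
    const int v[3] = {1, 2, 3};
    std::printf("x = %d, y = %d, z = %d\n", nth(v, 0), nth(v, 1), nth(v, 2));
}
```

Array elements are guaranteed to be contiguous and in index order, which is the guarantee that separate local variables lack.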
It's a reasonable assumption to make for standard layout classes where you can be confident no padding will be inserted, but as you've stated it I would have to agree with helios - it isn't safe at all to make that unqualified assumption.
This kind of check can be safely used in all situations where the macro offsetof can be used.
offsetof is well-documented: the macro can be used for any non-static member objects of any standard layout type, even if the address-of operator is overloaded for the class and/or the type of the members.
In the example below, A is a standard layout type, so offsetof can be used to determine the offset (in bytes) of each of its non-static member objects.
There is no need to special-case on a class by class basis; if it is a standard layout type, offsetof can be used to determine the offset (in bytes) of any non-static member object.
struct B
{
    int v ;
    operator int() const { return v ; }
    private: void* operator& () const { return nullptr ; }
};
struct A
{
    B a {0} ;
    B b {1} ;
    B c {2} ;
    private:
        void foo();
    public:
        B d {3} ;
        B e {4} ;
        int operator[] ( std::size_t pos ) const ;
    private: void* operator& () const { return nullptr ; }
};
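A sketch of how every member's offset could be checked individually, rather than only the first and the last. The struct Five and the functions offsets_increase and check_offsets are hypothetical names used for illustration. The expectation that each int sits exactly sizeof(int) after the previous one is an assumption about the compiler, and it is precisely what the static_asserts test.

```cpp
#include <cassert>      // assert
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical struct: five public data members, with a change of
// access specifier (for a member function only) in between.
struct Five {
    int a, b, c;
private:
    void helper();   // declared only; a member function adds no data
public:
    int d, e;
};

static_assert(std::is_standard_layout<Five>::value, "offsetof requires standard layout");

// One assertion per member, so reordering or padding anywhere is caught.
static_assert(offsetof(Five, a) == 0 * sizeof(int), "a is misplaced");
static_assert(offsetof(Five, b) == 1 * sizeof(int), "b is misplaced");
static_assert(offsetof(Five, c) == 2 * sizeof(int), "c is misplaced");
static_assert(offsetof(Five, d) == 3 * sizeof(int), "d is misplaced");
static_assert(offsetof(Five, e) == 4 * sizeof(int), "e is misplaced");
static_assert(sizeof(Five) == 5 * sizeof(int), "unexpected trailing padding");

// Weaker run-time form of the same idea: offsets follow declaration order.
inline bool offsets_increase() {
    return offsetof(Five, a) < offsetof(Five, b)
        && offsetof(Five, b) < offsetof(Five, c)
        && offsetof(Five, c) < offsetof(Five, d)
        && offsetof(Five, d) < offsetof(Five, e);
}

// Small self-check; call it from main() if desired.
inline void check_offsets() {
    assert(offsets_increase());
}
```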
I understand that, but your check only ensures that &last_member = &first_member + sum_of_sizes_of_all_members_but_one. When there are access modifiers in the middle of the data members, the compiler is free to reorder their layout somewhat:
A::a
A::d
A::b
A::c
A::e
> the compiler is free to reorder their layout somewhat:
Only in C++98, where there was no concept of standard layout types, and layout rules for POD structs explicitly specified access control region.
In C++11 and later, the standard layout rule is simpler and stricter:
Non-static data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object
Note that there is no mention of "same access control region".
Yes, the point of something like this would not be apparent to every programmer.
Nevertheless, there are programmers who would not find this to be utterly pointless; they would be able to use it to their advantage.
Therefore, for std::complex<>, the IS specifies this requirement:
If z is an lvalue expression of type cv std::complex<T> then:
— the expression reinterpret_cast<cv T(&)[2]>(z) shall be well-formed,
— reinterpret_cast<cv T(&)[2]>(z)[0] shall designate the real part of z, and
— reinterpret_cast<cv T(&)[2]>(z)[1] shall designate the imaginary part of z.
Moreover, if a is an expression of type cv std::complex<T>* and the expression a[i] is well-defined for an integer expression i, then:
— reinterpret_cast<cv T*>(a)[2*i] shall designate the real part of a[i], and
— reinterpret_cast<cv T*>(a)[2*i + 1] shall designate the imaginary part of a[i].
The requirement is there even though some programmers would find it an utterly pointless one that needlessly constrains implementations.
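The std::complex requirement quoted above can be exercised directly. A minimal sketch follows; the function names are made up for illustration, and only the reinterpret_cast expressions come from the quoted wording.

```cpp
#include <cassert>   // assert
#include <complex>   // std::complex
#include <cstddef>   // std::size_t

// Real part of z, read through the array view the standard guarantees.
inline double real_via_array(std::complex<double>& z) {
    return reinterpret_cast<double(&)[2]>(z)[0];
}

// Imaginary part of z, read the same way.
inline double imag_via_array(std::complex<double>& z) {
    return reinterpret_cast<double(&)[2]>(z)[1];
}

// Imaginary part of a[i], treating an array of complex numbers
// as a flat array of doubles (real, imag, real, imag, ...).
inline double imag_of_element(std::complex<double>* a, std::size_t i) {
    return reinterpret_cast<double*>(a)[2 * i + 1];
}

// Small self-check; call it from main() if desired.
inline void check_complex_layout() {
    std::complex<double> z(1.5, -2.5);
    assert(real_via_array(z) == z.real());
    assert(imag_via_array(z) == z.imag());
}
```

This is the kind of access that libraries working on interleaved real/imaginary buffers rely on, and here the guarantee comes from the standard itself rather than from a layout assumption.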