operator[] return (&x)[i];

I came across this code and couldn't understand part of it.

First, the class has public members x, y and z, set up by a constructor like this:

  Vector(float _x=0, float _y=0, float _z=0)
		: x(_x), y(_y), z(_z) {
		...
	}

Then there's an operator[] overload:

	float operator[](int i) const {
		if(i >= 0 && i <= 2)
			return (&x)[i];
		//... some error handler
	}

So basically it's part of a vector class where you are supposed to access each component with v[i]. My question is: what does return (&x)[i] do? I understand how it works for x, but how does it reach y and z?

P.S. As far as I know, this is real code from the pbrt project.

JLBorges (13770)

If the object representation of Vector has the same layout as an array of 3 float (this is the usual case: members x, y and z are allocated without any padding between them), we can access it as if it is an array of 3 float.
Some compilers may generate tighter code for indexed array element access (with the construct (&x)[i]).

For instance, with g++ 6.1, x86_64, -O3:

(&x)[i]:

float Vector::operator[]( int i ) const
{
    // confirm that the object layout allows accessing x, y, z as if they were in an array of three floats
    static_assert( offsetof( Vector, y ) == sizeof(float) && offsetof( Vector, z ) == sizeof(float) * 2, "unexpected layout" ) ;

    if( i >= 0 && i <= 2 ) return (&x)[i] ;
    else return 0 ; // else ... some error handler

    /*
        cmpl	$2, %edx
        ja	.L4
        movslq	%edx, %rdx
        movss	(%rcx,%rdx,4), %xmm0
        ret
    .L4:
        pxor	%xmm0, %xmm0
        ret
    */
}

switch-case:

float Vector::operator[]( int i ) const
{
    switch(i)
    {
        case 0 : return x ;
        case 1 : return y ;
        case 2 : return z ;
        default: return 0 ; // else ... some error handler
    }

    /*
        cmpl	$1, %edx
        je	.L8
        cmpl	$2, %edx
        je	.L9
        testl	%edx, %edx
        je	.L13
        pxor	%xmm0, %xmm0
        ret
    .L13:
        movss	(%rcx), %xmm0
        ret
    .L9:
        movss	8(%rcx), %xmm0
        ret
    .L8:
        movss	4(%rcx), %xmm0
        ret
     */
}

weee (13)

I wasn't aware this is a valid hack. Can I safely assume that whenever a class or struct declares its variables as Type a, b, c; they will always be laid out contiguously?

I did a test outside a class and, obviously, it doesn't work that way. Can someone elaborate a bit more? Is there any official documentation on this?

Or is it just some compiler-specific optimization? I've been away from C/C++ for a while, but I don't recall ever being able to do what the examples above do, and it seems quite popular because other projects use it as well.

JLBorges (13770)

> I safely assume that as long as in a class or struct, you define variables in such way that reads like Type a, b, c;
> they will always be assigned continuously?

AFAIK, that is not guaranteed by the standard.

In practice, though, for a standard-layout type, adjacently declared member objects with the same access control and natural alignment are allocated immediately after one another, in order of declaration.

The static_assert in the snippet posted earlier asserts that this assumption is true.
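The same idea can be written as a stand-alone, compile-time-only check. This is a minimal sketch, not the pbrt code: Vec3 is a made-up stand-in for the Vector class above, and the checks only describe the layout; they do not by themselves make (&x)[i] blessed by the standard.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for the Vector class discussed above.
struct Vec3
{
    float x, y, z;
};

// offsetof is only unconditionally supported for standard-layout types.
static_assert(std::is_standard_layout<Vec3>::value, "Vec3 must be standard layout");

// x sits at the start, with no padding between x, y and z ...
static_assert(offsetof(Vec3, x) == 0, "x is not at offset 0");
static_assert(offsetof(Vec3, y) == sizeof(float), "padding between x and y");
static_assert(offsetof(Vec3, z) == 2 * sizeof(float), "padding between y and z");

// ... and no padding after z either.
static_assert(sizeof(Vec3) == 3 * sizeof(float), "trailing padding");
```

If any of these assumptions fails on some compiler or platform, the build stops instead of producing code that reads the wrong bytes.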

weee (13)

> Would you mind showing us your experiment?

just like this

<code>
#include <cstdio>

int main()
{
    int x, y, z;
    //int x = 1, y = 2, z = 3;
    x = 1, y = 2, z = 3;

    printf("x = %d, y = %d, z = %d\n", x, y, z);
    printf("x = %d, y = %d, z = %d\n", (&x)[0], (&x)[1], (&x)[2]);

    return 0;
}
</code>

strangely it prints
x = 1, y = 2, z = 3
x = 1, y = 3, z = 2

I'm using the Visual C++ 2015 command-line compiler. Using the commented-out line instead makes no difference in this sample, so what happened here?

cire (8284)

> I safely assume that as long as in a class or struct, you define variables in such way that reads like Type a, b, c;

Not realizing your assumption in the test code seems like an odd way of testing it.

weee (13)

I was just casually reading some code and didn't really intend to test it out; I was simply puzzled by that line.

Putting the Vector code aside, I now can't understand why the simple test code prints such peculiar output. Basically it just swaps y and z. Why is that?

weee (13)

int x, y, z;
//int x = 1, y = 2, z = 3;
x = 1, y = 2, z = 3;

printf("x = %d, y = %d, z = %d\n", x, y, z);
printf("x = %d, y = %d, z = %d\n", (&x)[0], (&x)[1], (&x)[2]);

it prints
x = 1, y = 2, z = 3
x = 1, y = 3, z = 2

Of course no one writes code like that, but why are y and z swapped?
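For what it's worth, nothing obliges a compiler to place separate local variables next to each other or in declaration order, and indexing past x with (&x)[1] is undefined behavior, so any output is possible. A minimal sketch that only inspects addresses, without reading through (&x)[i], is below; byte_distance and print_local_layout are hypothetical helper names made up for this illustration.

```cpp
#include <cstddef>   // std::ptrdiff_t
#include <cstdint>   // std::uintptr_t
#include <cstdio>    // std::printf

// Hypothetical helper: signed distance in bytes from one address to another.
// The subtraction is done on integers, so no pointer arithmetic is performed
// between unrelated objects.
std::ptrdiff_t byte_distance(const void* from, const void* to)
{
    const std::uintptr_t a = reinterpret_cast<std::uintptr_t>(from);
    const std::uintptr_t b = reinterpret_cast<std::uintptr_t>(to);
    return b >= a ? static_cast<std::ptrdiff_t>(b - a)
                  : -static_cast<std::ptrdiff_t>(a - b);
}

// Prints where three separate locals ended up relative to each other.
// The result depends on the compiler and its options.
void print_local_layout()
{
    int x = 1, y = 2, z = 3;

    std::printf("&x = %p, &y = %p, &z = %p\n",
                static_cast<void*>(&x), static_cast<void*>(&y), static_cast<void*>(&z));
    std::printf("&y - &x = %lld bytes, &z - &x = %lld bytes\n",
                static_cast<long long>(byte_distance(&x, &y)),
                static_cast<long long>(byte_distance(&x, &z)));
}
```

Calling print_local_layout() from main shows how a particular build happened to arrange x, y and z; a different compiler or optimization level may give a different order, or keep some of them in registers when their addresses are not taken.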

weee (13)

OK, thanks all.

So, back to the original post: for the Vector example, can I say it is safe to do this across compilers (define several members of the same type all at once and assume they are laid out consecutively)? Is it just that the code is not very readable that way, but there's nothing wrong with it?

helios (17575)

> can I say it is safe to do so across the compilers (define several members with the same type all at once and assume they are assigned consecutively)

No, this is not safe at all.

cire (8284)

It's a reasonable assumption to make for standard layout classes where you can be confident no padding will be inserted, but as you've stated it I would have to agree with helios - it isn't safe at all to make that unqualified assumption.

http://en.cppreference.com/w/cpp/concept/StandardLayoutType

JLBorges (13770)

This can be checked at compile time, and array-like access used if and only if the compiler can verify that it is safe.

For instance:

#include <cstddef>
#include <type_traits>
#include <stdexcept>

struct A
{
    int a = 0 ;
    int b = 1 ;
    int c = 2 ;
    int d = 3 ;
    int e = 4 ;

    int operator[] ( std::size_t pos ) const ;
};

namespace
{
    // **** assert: members a, b, c, d, e are of the same unqualified type
    constexpr bool is_array_like = offsetof(A,e) == offsetof(A,a) + sizeof(A::a) + sizeof(A::b) + sizeof(A::c) + sizeof(A::d) ;
    using is_array_like_flag = std::conditional < is_array_like, std::true_type, std::false_type >::type ;

    inline int get( const A& obj, std::size_t pos, std::true_type ) // array like
    {
        if( pos > 4 ) throw std::out_of_range( "out of range" ) ;
        return std::addressof( obj.a )[pos] ;
    }


    inline int get( const A& obj, std::size_t pos, std::false_type ) // not array like
    {
        // select the member by name; no assumption about layout is made here
        switch(pos)
        {
            case 0 : return obj.a ;
            case 1 : return obj.b ;
            case 2 : return obj.c ;
            case 3 : return obj.d ;
            case 4 : return obj.e ;
            default: throw std::out_of_range( "out of range" ) ;
        }
    }
}

int A::operator[] ( std::size_t pos ) const { return get( *this, pos, is_array_like_flag{} ) ; }

helios (17575)

I think it's dangerous to give a check like that to a newbie without mentioning for what kinds of classes/structs it can be used.

JLBorges (13770)

This kind of check can be safely used in all situations where the macro offsetof can be used.

offsetof is well-documented: the macro can be used for any non-static member objects of any standard layout type, even if the address-of operator is overloaded for the class and/or the type of the members.

helios (17575)

What if A is this?

struct A
{
    int a = 0 ;
    int b = 1 ;
    int c = 2 ;
private:
    void foo();
public:
    int d = 3 ;
    int e = 4 ;

    int operator[] ( std::size_t pos ) const ;
};

JLBorges (13770)

> What if A is this?

A is a standard layout type, offsetof can be used to determine the offset (in bytes) of a non-static member object.

There is no need to special-case on a class by class basis; if it is a standard layout type, offsetof can be used to determine the offset (in bytes) of any non-static member object.

Q: What if A is this?

struct B
{
    int v ;
    operator int() const { return v ; }
    private: void* operator& () const { return nullptr ; }
};

struct A
{
    B a {0} ;
    B b {1} ;
    B c {2} ;
private:
    void foo();
public:
    B d {3} ;
    B e {4} ;

    int operator[] ( std::size_t pos ) const ;

    private: void* operator& () const { return nullptr ; }
};

A: standard layout type; offsetof can be used to determine the offset of any non-static member object.
http://coliru.stacked-crooked.com/a/09251dc451f81457

helios (17575)

I understand that, but your check only ensures that &last_member = &first_member + sum_of_sizes_of_all_members_but_one. When there are access modifiers in the middle of the data members, the compiler is free to reorder their layout somewhat:
A::a
A::d
A::b
A::c
A::e

JLBorges (13770)

> the compiler is free to reorder their layout somewhat:

Only in C++98, where there was no concept of standard layout types, and layout rules for POD structs explicitly specified access control region.

In C++11 and later, the standard layout rule is simpler and stricter:

Non-static data members of a (non-union) class with the same access control are allocated so that later members have higher addresses within a class object

Note that there is no mention of "same access control region".
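A quick way to see which side of the rule a given class falls on is to ask the compiler. The sketch below uses two made-up structs (SameAccess and MixedAccess are illustrative names, not from the thread): one with an access specifier between data members that all stay public, and one whose data members really differ in access control.

```cpp
#include <type_traits>  // std::is_standard_layout

// All non-static data members are public; only a member function is private.
struct SameAccess
{
    int a = 0;
    int b = 1;
private:
    void foo();   // declared only; never called in this sketch
public:
    int c = 2;
};

// Non-static data members with different access control.
struct MixedAccess
{
    int a = 0;
    int get_b() const { return b; }
private:
    int b = 1;
};

// Same access control for every data member: standard layout.
static_assert(std::is_standard_layout<SameAccess>::value,
              "SameAccess should be standard layout");

// Mixed access control among data members: not standard layout,
// so offsetof is not unconditionally supported for it.
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "MixedAccess should not be standard layout");
```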

helios (17575)

Alright, I have no objections, then. Other than finding this utterly pointless.

JLBorges (13770)

Yes, the point of something like this would not be apparent to every programmer.
Nevertheless, there are programmers who would not find this to be utterly pointless; they would be able to use it to their advantage.

Therefore, for std::complex<>, the International Standard (IS) specifies this requirement:

If z is an lvalue expression of type cv std::complex<T> then:
— the expression reinterpret_cast<cv T(&)[2]>(z) shall be well-formed,
— reinterpret_cast<cv T(&)[2]>(z)[0] shall designate the real part of z, and
— reinterpret_cast<cv T(&)[2]>(z)[1] shall designate the imaginary part of z.

Moreover, if a is an expression of type cv std::complex<T>* and the expression a[i] is well-defined for an integer expression i, then:
— reinterpret_cast<cv T*>(a)[2*i] shall designate the real part of a[i], and
— reinterpret_cast<cv T*>(a)[2*i + 1] shall designate the imaginary part of a[i].

Even though there would be programmers who find this to be an utterly pointless requirement that needlessly constrains implementations.
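The quoted requirement can be exercised directly. The following is a small sketch of that array-oriented access for std::complex<double>; the function names real_via_array, imag_via_array and component_at are made up for the illustration.

```cpp
#include <complex>
#include <cstddef>   // std::size_t

// Real part, read through the array view the standard requires to work.
double real_via_array(const std::complex<double>& z)
{
    return reinterpret_cast<const double(&)[2]>(z)[0];
}

// Imaginary part, read the same way.
double imag_via_array(const std::complex<double>& z)
{
    return reinterpret_cast<const double(&)[2]>(z)[1];
}

// Treats an array of complex numbers as interleaved doubles:
// element i has its real part at 2*i and its imaginary part at 2*i + 1.
// The caller must make sure a[i] is a valid element.
double component_at(const std::complex<double>* a, std::size_t i, bool imaginary)
{
    return reinterpret_cast<const double*>(a)[2 * i + (imaginary ? 1 : 0)];
}
```

This is the kind of access that lets an array of std::complex<double> be handed to numeric code that expects interleaved real and imaginary doubles.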
