Pointer to non-static member variable?

Aug 10, 2012 at 3:51am
I was looking at a problem on Stack Overflow, and came across the following:

class A
{
public:
        A(int _a, char _b, char _c, char _t):a(_a), b(_b), c(_c), t(_t){}

private:
        char t;
        int a;
public:
        char b;
        char c;

        static void print(){

                int A::*pa = &A::a;
                char A::*pb = &A::b;
                char A::*pc = &A::c;
                char A::*pt = &A::t;
                printf("data member : %p, %p, %p, %p\n", pa, pb, pc, pt);
        }
};


with results (g++ 4.7):
data member : 00000004, 00000008, 00000009, 00000000


I don't think I've come across this syntax; what are pa,pb,pc, and pt? It seems like the print function outputs the offset of each member variable from the beginning of A, but pa outputs 0x00000004, when 0x00000001 would be expected (given t is a char). Anyone have a reference to a good description of this functionality?
Last edited on Aug 10, 2012 at 3:52am
Aug 10, 2012 at 4:28am
what are pa,pb,pc, and pt?

They are pointers to members (here, pointers to non-static data members).

It seems like the print function outputs the offset of each member variable from the beginning of A,

That is correct.

when 0x00000001 would be expected (given t is a char).

That is not correct: the offset of a cannot be 1 because a is an int, which, on your system and almost everyone else's, is aligned to a 4-byte boundary (its memory address must be divisible by 4).

Aug 10, 2012 at 4:30am
I've never seen this syntax before either. But it looks like you are correct in your assessment of it outputting the offset of each member.

but pa outputs 0x00000004, when 0x00000001 would be expected (given t is a char)


Structure padding. It's putting the int on the next 4-byte boundary. That is very typical.
Last edited on Aug 10, 2012 at 4:30am
Aug 10, 2012 at 4:59am
Thanks guys - have a couple interviews coming up, trying to cram as much obscure knowledge as I can :)
Aug 10, 2012 at 2:57pm
OK, here are a couple of interview questions:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#include <cstdio>

struct A
{
    int i, j, k, l ;
};

int main()
{
    int A::*points_to_member_at_offset_0 = &A::i ;
    int A::*does_not_point = nullptr ;
    std::printf( "%ld\n", points_to_member_at_offset_0 ) ;
    std::printf( "%ld\n", does_not_point ) ;
}

void foo( A& a, int A::*pm, int v )
{
    if( pm != nullptr )
         a.*pm += v ;
}


1. If a particular implementation prints 0 for line 12 (the first printf call), what could it print for line 13 (the second printf call)? How would this compiler evaluate the condition in the if statement on line 18?

2. If a particular implementation prints 0 for line 13, what could it print for line 12? How would it evaluate the dereference operator on line 19 (a.*pm += v)?
Last edited on Aug 10, 2012 at 3:00pm
Aug 10, 2012 at 5:23pm
I'm pretty sure pointer-to-members with a null value are represented as -1, but your compiler tries to trick you:
#include <cstdio>
#include <iostream>

class A
{
public:
	int a;
};

int main()
{
	union
	{
		int A::*union_ptr;
		int union_value;
	}u;
	int A::*a_ptr = &A::a;
	int A::*A_null_ptr = nullptr;
	u.union_ptr = nullptr;
	std::cout<<u.union_value<<std::endl;
	std::cout<<A_null_ptr<<std::endl;
	std::cout<<a_ptr<<std::endl;
	std::getchar();
	return 0;
}

My output:
a_ptr = 1.
A_null_ptr = 0.
union_value = -1.

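Both union_ptr and A_null_ptr hold nullptr, but only union_value shows the stored representation, because reading it through the union forces it to print as an int. Streaming a pointer to member directly converts it to bool, which is why a_ptr prints 1 and A_null_ptr prints 0.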

Pointers-to-members are trippy.
Aug 10, 2012 at 6:05pm
trying to cram as much obscure knowledge as I can
That's plain stupid.

Edit: Learn COBOL then.
Last edited on Aug 10, 2012 at 6:13pm
Aug 10, 2012 at 6:10pm
Still haven't seen a great article describing the rules of pointer-to-member, but it seems like they aren't implicitly castable to long, so I would have to assume the lines

std::printf( "%ld\n", points_to_member_at_offset_0 ) ;
std::printf( "%ld\n", does_not_point ) ;


are going to give misleading results. If the first gives 0 though, I would expect the next to give -1 (as BlackSheep said), but it seems this would be entirely compiler dependent. It could theoretically be implemented as '5'; the compiler would know that while values '3', and '4', represent the 4th and 5th data members, values 6, 7, etc would represent the 6th, 7th, etc data members. Or for your 2nd question, 'null' could be '0' when viewed as an integer, with 1, 2, etc representing the 1st, 2nd, etc member variables.

The logic in foo() seems like it will work as expected regardless of the compiler. Whatever the conversion of nullptr to type int A::* results in during the execution of main, the result will be the same when evaluating the line if( pm != nullptr ) (which I believe could be simply written as if( pm ), given that conversion to bool does seem to be defined, and evaluates as 'true' even if the integral value is '0'). Likewise, regardless of the internal implementation of pointer-to-member, a.*pm += v ; seems like it would work in any case, as this is the proper syntax for this type of construct.
Aug 10, 2012 at 7:57pm
#include <cstdio>
#include <cstddef>

struct A
{
    int i, j, k, l ;
};

int main()
{
    // just being safe
    static_assert( sizeof( int A::* ) <= sizeof(long),
                   "casting pointers to member variables to a long loses information" ) ;

    std::printf( "%zu %zu\n", offsetof(A,i), offsetof(A,j) ) ; // 0 4 (say); offsetof yields a std::size_t

    int A::*pi = &A::i ;
    int A::*pj = &A::j ;
    int A::*p0 = 0 ;

    std::printf( "%ld %ld %ld\n", pi, pj, p0 ) ;

    // possible implementation 1:   0 4 -1  (assuming the above offsets)
    // --------------------------
    // current versions of many compilers (eg. GNU, Microsoft)

    // a pointer to member variable holds the offset of the member,
    // and a null pointer holds an invalid offset (typically -1)

    // dereference: add the numeric value of the address of the object
    // to the numeric value of the pointer to member
    // to get the numeric value of the address of the bound member variable



    // possible implementation 2:   1 5 0 (assuming the above offsets)
    // --------------------------
    // Stroustrup/Lippman's original cfront, older versions of many compilers

    // the pointer holds the offset of the member plus a constant (typically 1),
    // and a null pointer holds a zero

    // dereference: add the numeric value of the address of the object
    // to the numeric value of the pointer to member minus the constant
    // to get the numeric value of the address of the bound member variable

    // The cfront implementation is described in Lippman's 'Inside the C++ Object Model'
}



> It could theoretically be implemented as '5'; the compiler would know that while values '3', and '4', represent the 4th and 5th data members ...

No. Because this is well defined:
struct B ; // declared, not defined
int B::*ptr_int_member = 0 ; // How many non-static member variables does B have? 



Far more interesting question: How could pointers to non-static member functions be implemented?

Hint: If the pointer points to a virtual function, it behaves as expected - polymorphically. The same pointer may be null, or may point to either virtual or non-virtual functions.

#include <iostream>

struct A
{
    virtual ~A() {}
    virtual void foo() const = 0 ;
    void bar() const { std::cout << "A::bar\n" ; }
};

void baz( const A& a, void (A::*pfn)() const ) { (a.*pfn)() ; }

int main()
{
    struct B : A
    {
        virtual void foo() const override { std::cout << "*** main::B::foo\n" ; }
        void bar() const { std::cout << "*** main::B::bar\n" ; }
    };

    void (A::*pfn)() const = 0 ;

    B b ;

    pfn = &A::foo ;
    baz( b, pfn ) ; // *** main::B::foo

    pfn = &A::bar ;
    baz( b, pfn ) ; // A::bar
}
Last edited on Aug 10, 2012 at 8:23pm
Aug 10, 2012 at 9:40pm
Far more interesting question: How could pointers to non-static member functions be implemented?


Would posting a link to the Fast Delegates webpage qualify as a spoiler?
Aug 10, 2012 at 10:06pm
Cubbi, I've read that in the past, definitely an interesting article, one I think I will review :)

JLBorges, int A::*p0 = 0 ; is defined, but doesn't seem to set p0 to "0", it sets it to -1 (same as nullptr). If an implementation stored the nullptr (0) value as '5' in p0, it would be no different than storing it as -1, except that the dereference logic would have to be unnecessarily complicated/inefficient, such as

if (mvp == 5)
{
    return 0;
}
else if (mvp > 5)
{
    return obj + mvp - 1;
}
else // mvp < 5
{
    return obj + mvp;
}


Wouldn't this still be a (silly) standards-compliant implementation, or am I not understanding correctly?

Without using the fast delegates article, my recollection is something along the lines that for non-virtual functions, the address (or offset from A) is stored in the fp. For virtual functions, the index to lookup in the vtable, along with the offset required for the 'this' pointer to make it look like the appropriate type of object is stored. IIRC, this can be a single union. Some implementations (in the past?) had generated a thunk function to offset the this pointer and then call the appropriate function. I believe this would result in member function pointers being the same size as non-member function pointers, but you'd potentially have a lot of auto-defined functions.
Last edited on Aug 10, 2012 at 10:09pm
Aug 11, 2012 at 2:32am
> If an implementation stored the nullptr (0) value as '5' in p0, it would be no different than storing it as -1,
> except that the dereference logic would have to be unnecessarily complicated/inefficient, such as ---
> Wouldn't this still be a (silly) standards-compliant implementation

Yes it would be. I stand corrected. Thanks.

if (mvp == 5)
{
    // return 0;
    return -1 ; // the key point is that the null pointer must be
                // distinguishable from pointer to member at an offset of zero
  
}
else ...



> For virtual functions, the index to lookup in the vtable, along with ...

The key point again being that the null pointer must be distinguishable from the pointer to member function with an offset of zero in the vtable.

Aug 31, 2012 at 9:45pm
As a follow up, the studying paid off in the end, so finally gainfully employed again :) Thanks all! I did have one amusing situation though, where an interviewer asked me:

"If 2 binaries both link to the same shared library, are there 2 copies of the shared library in memory or just one used by both binaries?"

I fairly confidently answered "2 copies, one for each binary". He explained that I was wrong and that, in fact, only 1 copy ever exists in memory, and proceeded to use the rest of his interview time building on and asking questions about behavior related to this shared-in-memory library. I even started to doubt myself by the end... suffice it to say, I didn't take that job!
Sep 1, 2012 at 5:12am
> "If 2 binaries both link to the same shared library,
> are there 2 copies of the shared library in memory or just one used by both binaries?"

>> I fairly confidently answered "2 copies, one for each binary".

I think the two of you used two different meanings of 'in memory' - in physical memory (pages in RAM/swap) or in virtual memory (virtual pages in a process address space).

Typically there is only one object in physical memory for each non-writable section (code,read only data).

If the implementation supports COW, initially there is only one object in physical memory for each writable section; but shadow objects are created on a per-process basis as writes into memory take place.


Sep 1, 2012 at 9:48pm
Hmm, maybe I should have done more investigation. The Stack Overflow questions I saw seemed to indicate that each process received its own copy - I suppose I was wrong in this case after all!

http://www.linuxquestions.org/linux/articles/Technical/Understanding_memory_usage_on_Linux