Struct pointer conversion, possible undefined behavior?

Aug 6, 2020 at 6:55pm
I found a place where some weird casting was happening, and I wanted to ask whether anyone knows if it is legal.

// Example program
#include <iostream>

struct Small {
    int data;
};

struct Big {
    Small small;
    int more_data;
};

int main()
{
    Big big { {42}, 100 };
    Small* small = (Small*)(&big);
    
    small->data = 43;
    
    std::cout << small << '\n'
              << &big.small << '\n'
              << small->data << '\n'
              << big.small.data << '\n';
}

0x7d0b336ecec0
0x7d0b336ecec0
43
43


Is the above example undefined behavior (Specifically, casting the Big* to a Small*)?
My guess is that this is actually legal, because of "reinterpret_cast" shenanigans, but I'm not sure.

The second example is the dual of this:

// Example program
#include <iostream>

struct Small {
    int data;
};

struct Big {
    Small small;
    int more_data;
};

int main()
{
    Small small { 42 };
    Big* big = (Big*)(&small);
    
    big->small.data = 43;
    
    std::cout << &small << '\n'
              << &big->small << '\n'
              << small.data << '\n'
              << big->small.data << '\n';
}

0x7bc5f88d45c0
0x7bc5f88d45c0
43
43


Is this legal, so long as big->more_data is not accessed?
Again, my guess is that it is legal, and this is why the address of the first member of a struct is the same as the address of the struct.

And does anyone know what this construct or method is called, regardless of its legality (if it has a "name")? Would you just call it "object slicing", despite not using inheritance?

PS: Obviously this is horrible code that is hard to read. I'm trying to make existing code better.
Last edited on Aug 6, 2020 at 6:58pm
Aug 6, 2020 at 7:17pm
Additionally, I believe part of cppreference might be wrong, but I don't know enough to say how it should be fixed, because it invalidates what the whole example is saying. Maybe a cppreference editor will see this.
(edit: it appears cppreference is not wrong, the comment maybe could be a bit clearer)

On https://en.cppreference.com/w/cpp/language/reinterpret_cast, in the second code example under "Notes" there is a line of code that says:

S s = {};
auto p = reinterpret_cast<T*>(&s); // value of p is "pointer to s" 

Why would p not be a T*? Why is it saying that p is an S*?

This does not match what I get:
// Example program
#include <iostream>
#include <typeinfo>

struct S { int x; };
struct T { int x; int f(); };

int main()
{
    S s = {};
    auto p = reinterpret_cast<T*>(&s); // value of p is "pointer to s" [Edit: s is not S, see doug4]
    
    std::cout << typeid(p).name() << '\n';
    std::cout << typeid(&s).name() << '\n';
}

P1T
P1S

(The strings generated are implementation-defined, but the point is that one is "pointer to t" and the other one is "pointer to s".)
Last edited on Aug 6, 2020 at 8:16pm
Aug 6, 2020 at 7:20pm
Aug 6, 2020 at 7:26pm
Careful. "Pointer to s" and "Pointer to S" mean different things in this context.

"Pointer to s" means a pointer to the object named 's'.

"Pointer to S" is a type which can point to an object of type S.

So, cppreference, while slightly confusing, is correct. p is pointing to s. It is a "pointer to type T" that is pointing to the object s, which happens to actually be of type S.
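
The distinction between the pointer's type and the pointer's value can be checked directly. The following is a minimal sketch, reusing S and T from the cppreference example; the function name points_at_same_address is made up for illustration.

```cpp
#include <type_traits>

struct S { int x; };
struct T { int x; int f(); };

// Hypothetical helper: returns true when the converted pointer
// still holds the address of the object s.
bool points_at_same_address() {
    S s = {};
    auto p = reinterpret_cast<T*>(&s);

    // The *type* of p is "pointer to T" ...
    static_assert(std::is_same<decltype(p), T*>::value, "p is declared as T*");

    // ... but its *value* is still the address of the object named s.
    // Only the addresses are compared; p is never dereferenced.
    return static_cast<void*>(p) == static_cast<void*>(&s);
}
```

The static_assert covers the "pointer to T" half at compile time, and the address comparison covers the "pointer to s" half at run time.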

Aug 6, 2020 at 7:31pm
zapshe, it's hard to say whether those examples are applicable, because one is talking about conversions, and the other is about inheritance.

doug4, okay, yes I see that now, thanks.

But that's even more interesting, because it suggests that my OP is in fact undefined behavior, because it says
auto i = p->x; // class member access expression is undefined behavior; s is not a T object
Last edited on Aug 6, 2020 at 7:33pm
Aug 6, 2020 at 7:38pm
It's not senseless; it gives you a form of very simple polymorphism, and it can be done in C, which is probably what happened here (looks like something a C coder would do when pulled in to work in c++ without knowing it deeply).

I'll give you that C++ has a better way to do this. But what is being accomplished has uses; it's just the syntax that is questionable.
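
One reading of "a better way" is plain inheritance, although the post does not say which feature was meant, so treat this as an assumption. In this sketch Big derives from Small instead of holding it as its first member; set_data and demo_upcast are hypothetical names.

```cpp
struct Small {
    int data;
};

// Big derives from Small instead of storing a Small as its first member.
struct Big : Small {
    int more_data;
};

// Works on any Small, including the Small base subobject of a Big.
void set_data(Small* s, int value) {
    s->data = value;
}

// Hypothetical demo: writes through the base-class pointer and
// reads the result back through the derived object.
int demo_upcast() {
    Big big{};
    big.data = 42;
    big.more_data = 100;
    set_data(&big, 43);  // implicit Big* -> Small* conversion, no cast needed
    return big.data;
}
```

The compiler performs the derived-to-base conversion itself, so no C-style cast or reinterpret_cast appears at the call site.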
Aug 6, 2020 at 7:42pm
If one struct STARTS with the identical fields as contained by a shorter struct, it is acceptable to cast a pointer to an object of the longer struct to a pointer to the shorter struct. It is similar to inheritance, from a C perspective.

Casting the other direction is undefined behavior.

Rather than having Small be a member of Big (as in your example), consider common fields.

struct Small {int a; int b;};
struct Big { int a; int b; int c;};

Big b;
Small* s_ptr = (Small*)(&b);  // legal

Small s;
Big* b_ptr = (Big*)(&s);   // technically legal because you can cast anything you
                           // want with unsafe C-style casts
b_ptr->c = 6;  // Undefined.  Possibly not technically undefined, but certainly unsafe.
               // You will be overwriting data you don't own.

Aug 6, 2020 at 7:52pm
Ganado wrote:
Is the above example undefined behavior (Specifically, casting the Big* to a Small*)?

Casting on its own doesn't do much; the more interesting question is whether the subsequent member access expression small->data is well-defined. In your case it is, because Big is standard-layout, so a Big object is pointer-interconvertible with its first member, small.

On cppreference, this is covered under https://en.cppreference.com/w/cpp/language/static_cast: "if the original pointer value points to an object a, and there is an object b of the target type (ignoring cv-qualification) that is pointer-interconvertible (as defined below) with a, the result is a pointer to b." It is also covered on the https://en.cppreference.com/w/cpp/language/reinterpret_cast page you already linked.

(and yes, as jonnin correctly notes, this rule has its roots in supporting C programming idioms)
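
Because the first-member rule depends on Big being standard-layout, that precondition can be checked at compile time. This is a minimal sketch using Small and Big from the original post. Poly and write_through_first_member are hypothetical names; Poly is there only to show a class that fails the check.

```cpp
#include <type_traits>

struct Small { int data; };
struct Big { Small small; int more_data; };

// The first-member rule applies only to standard-layout classes,
// so verify that before relying on the cast.
static_assert(std::is_standard_layout<Big>::value, "Big must be standard-layout");

// Hypothetical counter-example: a virtual function makes a class
// non-standard-layout, so the first-member rule would not apply to it.
struct Poly {
    virtual ~Poly() = default;
    Small small;
};
static_assert(!std::is_standard_layout<Poly>::value, "Poly is not standard-layout");

// Writes through the converted pointer and reads the value back
// through the original object.
int write_through_first_member() {
    Big big{ {42}, 100 };
    Small* small = reinterpret_cast<Small*>(&big);  // points to big.small
    small->data = 43;
    return big.small.data;
}
```

If someone later adds a virtual function or a base class with data members to Big, the static_assert fails to compile instead of the cast silently becoming invalid.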
Last edited on Aug 6, 2020 at 7:55pm
Aug 6, 2020 at 8:34pm
Written here:

http://www.cplusplus.com/doc/oldtutorial/typecasting/


The only guarantee is that a pointer cast to an integer type large enough to fully contain it, is granted to be able to be cast back to a valid pointer...

The conversions that can be performed by reinterpret_cast but not by static_cast are low-level operations, whose interpretation results in code which is generally system-specific, and thus non-portable. For example:

class A {};
class B {};
A * a = new A;
B * b = reinterpret_cast<B*>(a);


This is valid C++ code, although it does not make much sense, since now we have a pointer that points to an object of an incompatible class, and thus dereferencing it is unsafe.



So while still valid C++, it will also bring undefined behavior apparently.
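
The round-trip guarantee quoted at the top of that passage can be sketched as follows. This assumes the platform provides std::uintptr_t, which is an optional type; round_trips_through_integer is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: converts a pointer to an integer wide enough to
// hold it, converts it back, and checks the result is the same pointer.
bool round_trips_through_integer() {
    int value = 42;
    int* original = &value;

    // std::uintptr_t is wide enough to hold an object pointer converted this way.
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(original);

    // Converting back to the original pointer type restores the original value.
    int* restored = reinterpret_cast<int*>(bits);

    return restored == original && *restored == 42;
}
```

The restored pointer has the same type as the original one, which is different from the A*/B* case in the quote, where the pointer ends up with an unrelated type.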


I'd still say senseless, as there seems to be no real reason to do this. Polymorphism in C++ is cleaner, and it wouldn't even make sense to try and implement it with C.
Aug 6, 2020 at 8:44pm
Cubbi: Okay, that's starting to make more sense. I didn't realize the pointer-interconvertible note in the reinterpret_cast article applied here, but now I see that it does. Kind of confusing that clicking on the "pointer-interconvertible" link takes you to the static_cast page, but it is what it is.

I wasn't aware of the pointer-interconvertibility and "standard layout" rules, so I'll read up on them more. For now, things seem clear, thanks, so I'll mark this as solved unless something else comes up.

zapshe: Actually, after reading Cubbi's post/links, I think that code excerpt is fine (b is safe to dereference), but useless because class A and B don't have any members.
Two objects a and b are pointer-interconvertible if:
 • they are the same object, or
 • one is a union object and the other is a non-static data member of that object, or
 • one is a standard-layout class object and the other is the first non-static data member of that object, or, if the
   object has no non-static data members, any base class subobject of that object, or
 • there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
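
The third bullet works in both directions, which bears on the second example in the original post. A Small* can be converted back to a Big*, but only when that Small really is the first member of a Big object. The sketch below shows the valid case; enclosing_big and read_more_data are hypothetical names.

```cpp
struct Small { int data; };
struct Big { Small small; int more_data; };

// Hypothetical helper. Precondition (assumed, not checked): `first`
// points to the .small member of an existing Big object.
Big* enclosing_big(Small* first) {
    return reinterpret_cast<Big*>(first);
}

// Valid case: the Small lives inside a real Big, so the two objects are
// pointer-interconvertible and the result points to that Big.
int read_more_data() {
    Big big{ {42}, 100 };
    Small* first = &big.small;
    return enclosing_big(first)->more_data;
}

// A standalone `Small s;` is not inside any Big, so there is no Big
// object for the converted pointer to refer to. That case is
// deliberately not exercised here.
```

The cast expression is the same as in the second example of the original post. What differs is whether a Big object actually exists at that address.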


jonnin wrote:
(looks like something a C coder would do when pulled in to work in c++ without knowing it deeply).
Yes, that's most likely exactly what it was.
Last edited on Aug 6, 2020 at 11:25pm