All of the introductory material on inheritance I've come across teaches that if "class A inherits from class B", then "class A is-a class B", which seems to suggest that they're one and the same class and that when you instantiate B, you don't actually instantiate A -- but that's just me inferring what's happening under the hood. What actually happens?
But after watching [1], I had to refine my understanding of inheritance to understand vtables/vptrs and dynamic casting. Specifically, we have to separate the derived object from the base object and recognize that the derived object actually "contains" the base object (the base object is known as the "subobject"). Whether "contains" means "has a pointer/reference to the base object", I don't really know.
that the derived object actually holds a reference to the base object
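No, that's not the case. The this pointer points to the whole object, which includes the base class subobject(s) and, in typical implementations, a pointer to the virtual function table (the table itself is not stored in the object). A derived class may override virtual functions; its overrides take the base versions' places in the derived class's table, and the object's table pointer is set to that table during construction (before the body of the derived constructor runs).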
All of the introductory material on inheritance I've come across teaches that if "A inherits from B", then "A is-a B", which sort of implies that they're one and the same object.
A and B are not objects. They are types (classes are types).
Humans have inherited many traits from our mammal ancestors.
Humans are mammals.
Not all mammals are humans.
That said, in programming, inheritance is mainly a tool to define a common interface (in a base class) that other (derived) classes can implement in different ways. Then you can write code that only uses this common interface, and it will be able to handle objects of all these classes without having to change the code. At least that's the idea.
ElusiveTau wrote:
But after watching [1], I had to refine my understanding of inheritance to understand vtables/vptrs and dynamic casting.
Virtual tables/pointers are an implementation detail. That's how it's implemented in practice, but it doesn't have to be. This is usually not something you need to be aware of.
Dynamic casting is sometimes useful but it kind of takes the beauty out of it because you do special handling of certain classes and no longer rely on just the common interface.
ElusiveTau wrote:
Specifically, we have to separate the derived object from the base object and that the derived object actually holds a reference to the base object (the base object is colloquially known as the "subobject").
The "derived object" contains the "base object". There is normally no need to store a reference/pointer to it inside the object.
"Subobject" is a standard term. Member variables are also subobjects.
Array elements are subobjects of the array.
a simple example of the most basic difference between them:
class A has method foo.
class B inherits from A.
class B is both a B and an A. a B object can call b.foo() the same way an A object can call a.foo().
consider class C, which has an A inside it:
c.foo() //... no such thing!
c.a.foo(); //ok
ok, so in practice both b and c can get to an a.foo, sure. but this is only one of many examples where b and c are not exactly the same thing, and when you get into more advanced concepts like polymorphism, you will see complete breakdown of any thoughts that these two are virtually identical. At the beginning, when you just get into classes and objects, they will be more similar than different, that is true.
the mammal example is great:
I am a mammal.
A dog is a mammal.
I can have a dog.
but I am not a dog.
it is critical that you understand they are not the same -- even if the differences are unclear at first -- because c++ supports BOTH.
The "derived object" contains the "base object". There is normally no need to store a reference/pointer to it inside the object.
But what does it precisely mean for a derived object to "contain" a base object?
As I understand it, the derived object uses pointers and offset calculations to point to functions (and data members) that should be called/accessed and the collection of function implementations or data members that were declared-in/provided-by the base class, effectively constitutes the base class "subobject".
Peter87 wrote:
Virtual tables/pointers are an implementation detail. That's how it's implemented in practice, but it doesn't have to be. This is usually not something you need to be aware of.
It's something that comes up in interviews and the low-level implementation detail helps me understand what's being displayed in an IDE (Visual Studio, for instance, displays the base-class subobject and its __vfptr in the Local variables).
I also recently delved into the topics of polymorphism and object slicing and found them confusing without the notion of subobjects.
it is critical that you understand they are not the same -- even if the differences are unclear at first -- because c++ supports BOTH.
I think I may have given you the wrong impression. When I wrote in my original post (before Edit #1) that the derived object has a pointer/reference to the base class subobject, I didn't mean to describe a "composition" relationship (in the UML lingo) where an object of the base class was constructed and the derived object has a pointer to it.
The ambiguity beginners (like myself) face is understanding what is precisely meant when someone (i.e., the guy in the video) says the derived object "contains" the base class subobject.
I see. In a lot of tutorials, esp. at the beginner level, the authors are a little careless with words like this because they are trying to explain things simply as a starting point. The things you are asking about, honestly, are at least intermediate if not advanced topics. That is a good thing -- you are coming in strong, but it may make some beginner material seem weird at times.
But what does it precisely mean for a derived object to "contain" a base object?
It means that the "base object" is stored inside the "derived object". It's not stored separately.
Note that an object, the way the C++ standard defines it, occupies one "region of storage", not multiple. This is not necessarily how we always think about it. For example, we would probably think of a std::string object as containing the characters that we store in it, but technically the std::string object and the array where the characters are stored would be two separate objects (ignoring the small-string optimization).
In the above example the length, capacity and data pointer are stored inside the std::string object (they affect the value returned by sizeof) but the char array is a separate object that is not stored inside the std::string object (it doesn't affect the value returned by sizeof).
ElusiveTau wrote:
As I understand it, the derived object uses pointers and offset calculations to point to functions (and data members) that should be called/accessed and the collection of function implementations or data members that were declared-in/provided-by the base class, effectively constitutes the base class "subobject".
The functions are not stored inside the objects. Virtual functions are normally looked up using a vtable. The vtables are stored per class. Each object just needs a pointer to its vtable (vptr).
I think the offsets needed to access members/subobjects are just static type information (or possibly information that can be looked up in the vtable) and don't need to be stored inside each object.
Thanks for the input, folks. I try to minimize the number of open-ended questions but felt this one needed to be asked since there is very little written on it. I'll leave this topic unsolved in case I have more thoughts on it in the future, when I have more experience.