My impression is that type information such as int, float, and int* is not included in binaries. Type information is only present in C++ to let the compiler know which operations are allowed for a particular piece of data.
On the other hand, I see that binary analysis tools are actually able to recover (or at least guess?) type information from binaries.
(A concrete example: OllyDbg and Cheat Engine are seemingly capable of doing data structure analysis given any random starting address.)
Could someone help me reconcile these observations and/or correct my misconceptions?
During development, type information is often included so that debuggers have enough information to be useful. When the program is finished and the binary is intended for an end user, we often compile without debug information, but some type information might still be required for things like dynamic_cast, exceptions, and typeid to work correctly.
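To make the dynamic_cast/typeid point concrete, here is a minimal sketch. The names Base, Derived, is_derived, and has_exact_type_derived are made up for illustration. Both helpers can only answer their question at run time, so the compiler has to leave some description of the polymorphic classes in the binary, even in a build without debug information.

```cpp
#include <typeinfo>

// Hypothetical classes, for illustration only.
struct Base {
    virtual ~Base() = default;  // one virtual function makes the class polymorphic
};
struct Derived : Base {};

// True if `b` refers to an object that is a Derived (or a subclass of it).
// dynamic_cast consults run-time type information stored in the binary.
bool is_derived(const Base& b) {
    return dynamic_cast<const Derived*>(&b) != nullptr;
}

// True if the dynamic type of `b` is exactly Derived.
// typeid on a polymorphic reference is also evaluated at run time.
bool has_exact_type_derived(const Base& b) {
    return typeid(b) == typeid(Derived);
}
```

Calling is_derived on a Derived object passed as a Base& should give true, and on a plain Base object false. Nothing comparable exists for a plain int or double: they have no per-object type tag, which is why tools have to infer their types from usage.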
On some occasions, the type information of a particular piece of data is explicitly included in the binary (e.g., when the data is an instance of a polymorphic class).
However, when the type information is altogether missing (for example, for plain old data types such as int, double), analysis tools are nevertheless able to easily? recover the type information based on the data's usage.
> when the type information is altogether missing (for example, for plain old data types such as int, double),
> analysis tools are nevertheless able to easily? recover the type information based on the data's usage.
Yes.
For instance, x64:
movl some_variable(%rip), %eax      # 32-bit load: some_variable ought to be a 32-bit value
movq another_variable(%rip), %rcx   # 64-bit load: another_variable ought to be a 64-bit value
movl (%rcx), %eax                   # rcx is dereferenced, so another_variable holds a pointer
                                    # the value loaded through it into eax is 32 bits wide
                                    # ergo, another_variable is a pointer to a 32-bit integer
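For comparison, here is a sketch of C++ source that an unoptimizing compiler could plausibly turn into loads like the ones above. This is an assumption for illustration, not the actual source behind that listing, and the function names are made up. It also shows the limit of the inference: the instructions reveal the width and the pointer-ness, but the same 32 bits could just as well be read as a float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical globals matching the names used in the assembly.
std::int32_t some_variable = 7;                    // 4 bytes: read with a 32-bit load
std::int32_t* another_variable = &some_variable;   // 8 bytes on x86-64: read with a 64-bit load

std::int32_t read_through_pointer() {
    std::int32_t value = some_variable;  // 32-bit load from a global
    value = *another_variable;           // 64-bit load of the pointer, then a 32-bit load through it
    return value;
}

// The same 32 bits, reinterpreted as a float. Nothing in the bytes themselves
// says which reading is right; only the instructions that use them hint at it.
float reinterpret_as_float(std::int32_t bits) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

On a platform with IEEE 754 floats, reinterpret_as_float(0x3F800000) should give 1.0f. An analysis tool would lean toward "integer" in the listing above because the value goes through a general-purpose register; a float would more typically be loaded into an SSE register.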