Casting unsigned int to pointer

I have a class attribute that saves a pointer to an unknown object. Instead of saving it as a void *, it is saved as an unsigned int. Suppose I reach an operation that knows what type is being referenced, say, myClass. What is the preferred method for casting the unsigned int to a myClass*?

I tried using a reinterpret_cast to do this (reinterpret_cast<myClass *>(myAddress);), and it works, but it is taking a very long time to run.

I used to save an offset in the unsigned int instead of an address. I then added the offset to a buffer address and casted with a static_cast, and that was much faster... like this:

myClass *myObject;
unsigned int myOffset;
myObject = static_cast<myClass*>(buffer) + myOffset;


The reason I changed is that I'm looking for performance improvements, and I figured getting rid of the pointer math would help; a prototype suggested it would. However, the reinterpret_cast appears to be taking 5 times as long as the addition plus the static_cast. Do reinterpret casts really have that much overhead? I have googled casting unsigned ints to pointers, and I cannot find guidance on this anywhere.
I'm confused.
Why are you casting a void * to an unsigned int? There's no guarantee that int is big enough to hold a pointer. Why don't you just save it as a void * and then use a traditional cast? (T *)pointer
That's why I'm using an unsigned int, and not an int.

The reason I'm doing this is because I'm in the process of a design transition, where the value was an offset into an array, and the array was being saved as an unsigned int (not an int per se). I'm replacing the offset with an actual address, and for my first go around I still have some instances that use the object as a container for offsets, not void *s. In any case, it turns out my performance problem was a data alignment issue. That said, I would still like to know what the proper practice is for casting an unsigned int to a pointer... assuming it is proper practice, as I cannot find any examples of it on the internet (so far).
int and unsigned int have the same size. Switching to unsigned only buys you one extra bit (e.g. 2^31 fits in a 32-bit unsigned int but not in a signed one); if a pointer doesn't fit in an int, it won't fit in an unsigned int either.
Pointers may be of a different size than int. For example, Borland C++ uses (last time I checked) 16-bit ints. If you were to use such an implementation, the program wouldn't work.

You say you were previously storing a reference to an element as a pointer to the start of the array, cast to an int (signed, unsigned, whatever), plus an offset into that array. That doesn't explain why you cast the pointer in the first place. Unless you really like performing casts, I don't really see the point.

And no, casting pointers to integers and vice versa makes you score 8 on the Bean Scale of Bad Programming.
Yup, sizeof( void* ) == sizeof( unsigned ) is guaranteed I think by the standard, but not the general pointer case.

And since reinterpret_cast<>() generates no code whatsoever, I don't think it is the source of your slowness.

But unsigned is not necessarily the same as unsigned int. Is it? I'm not really sure.
Yes, unsigned and unsigned int are the same.
The sizes of pointers are not guaranteed to be anything in general. A char* could be a different size than an int*. A void* is guaranteed to be able to represent any object pointer (AFAICR), so it follows that a void* must be at least as large as the largest object pointer type.

The size of int has nothing to do with the size of a pointer, and it is not guaranteed (AFAICR!) to be the same size as a void*.

The preponderance of casts between void* and int in existing code is due to hysterical raisons on Intel 80x86 hardware and the programming environments available on DOS and Windows to the hoi polloi, and examples by the lazy and "ought to know better".

If you plan to stay on Windows and Intel hardware, then have fun. Otherwise, tread softly. (Or, don't be upset when complaints come in from people on other hardware.)

Hope this helps.
Aren't ints (unsigned or not) and pointers (of any type) guaranteed to be the same size, which is the native CPU's register size (usually 4 bytes these days)?

I've written a few programs on both Windows and Mac OS X (which has a Unix kernel) with that assumption, and have had no reason to believe otherwise...

You should just be able to typecast an integer directly into a pointer...

myClass * thePointer = (myClass*)uintMemoryValue;

I'm not sure what your myOffset variable is for, there's probably a simpler way. But if you absolutely insisted on keeping an offset that holds the distance in memory between two objects, you could do something like this:

myClass * thePointer = (myClass*)(uintMemoryValue + myOffset);

Though this code should at least have a comment above it explaining what the heck myOffset has to do with anything...

Pointer arithmetic and type-casting like this is pretty fast when compiled in C/C++; I'm pretty sure that "type" information is transparent at the assembly level, everything is just offsets into memory in some form or other. Type-safety is a compiler thing, not a compiled thing. There is probably another thing your code is doing that's causing the slowdown.
When in doubt consult the oracle (not me! I mean the c++ standard.)
It's tough read, but so far I have found:
A pointer can be explicitly converted to any integral type large enough to hold it. The mapping function is
implementation-defined [Note: it is intended to be unsurprising to those who know the addressing structure
of the underlying machine. ]
A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted
to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type
will have its original value; mappings between pointers and integers are otherwise implementation-defined.

@cheif: read Duoas' post; he knows what he is talking about. You are a C programmer, aren't you? C-style casts as you use are very dangerous in C++ because contrary to what you state, some casts actually do generate code (in particular when inheritance comes into play).

I'm curious. Why would a char * have a different size than an int *?
The language just maps onto the underlying (virtual) machine.

If your machine uses 16bit pointers for code and 32bit pointers for data, or cannot mix pointers to signed and unsigned quantities or uses different size pointers for different kinds of objects, C++ will support those architectures.

It just so happens that WIN32 and i386 POSIX are based on quite an orthogonal machine and none of these things really matter, but they might on some other architecture.
In addition to being different sizes, they can have different bit patterns. That is why you should not use memset() (or the like) to set pointer values. Use an explicit assignment to zero (which the compiler transforms to the actual value, zero or not).

Example
typedef struct node_tag
  {
  int a, b;
  struct node_tag* next;
  }
  node_t;

node_t* zero_node( node_t* node )
  {
  /* zero all the usual bits */
  memset( node, 0, sizeof( node_t ) );

  /* now set the internal pointer to NULL *
   * (which might be different than zero) */
  node->next = 0;

  return node;
  }

Again, Intel and Motorola hardware typically make this a moot point, since NULL pointers are all zeroes on said hardware. But elsewhere the same does not hold true.

Hope this helps.
@above: is memset a built-in function?

If not, which library is it in?
In the C standard library ( <cstring> )
You are a C programmer, aren't you?


Indeed ;). And also a C++, BREW, Java, and J2ME programmer.

The mapping function is implementation-defined [Note: it is intended to be unsurprising to those who know the addressing structure of the underlying machine. ]


I would not be surprised if some very small devices (like, smaller than cell phones) do mean things like try to save bytes when allocating pointers in heap vs stack... I've written virtual machines that do the same, and seen enough grievous atrocities against accepted standards in commercial architectures to know that stuff can change on you... fun, fun times.

But since trster didn't mention a small device...

unsurprising to those who know the addressing structure

If the program will probably use between 32KB and 2GB of ram, and the register size is 4 bytes...

C-style casts as you use are very dangerous in C++ because contrary to what you state, some casts actually do generate code (in particular when inheritance comes into play).


Correct, a C++ programmer can overload typecasting operations, and hide code and heavy weight processes behind simple syntax for the sake of code candy... that's one of the reasons I prefer C-style architecture over C++, because you can actually understand exactly what it's doing when you read it...

That could be what is going on to cause the slowdown you mention: overloaded type-casting.
You're right about the casting operators; I hadn't even thought about that.

The case I was thinking of is inheritance where casts can generate code even if no typecasting operators are defined.

Eg,

class B1 {
public:
   B1() : x( 3 ) {}
   virtual ~B1() {}
   int x;
};

class B2 {
public:
   B2() : y( 4 ) {}
   virtual ~B2() {}
   int y;
};

class D : public B1, public B2 {
public:
    D() : z( 2 ) {}
    virtual ~D() {}
    int z;
};

int main() {
   D* d = new D;
   B2* b = static_cast<B2*>( d );  // note: b != d
   delete d;
}


The static_cast in and of itself generates at least an add instruction
because the B2 portion of D is located after the B1 portion, hence b != d.
The add instruction "fixes up" the "this" pointer.

If the above cast were done with a reinterpret_cast, the compiler would not generate the add instruction and b would not point to the B2 portion but rather to the start of the D object. (A C-style cast between related classes behaves like a static_cast here, so it would still perform the adjustment.)

[NB: I hope what I said here is correct. I do not have a compiler installed on this machine to test.]

You are right indeed, I had forgotten about that one :).

Typecasting an object that has multiple inheritance can add a small instruction to offset the address... Another fine reason to avoid multiple inheritance ;).

Although if you end up encountering that kind of problem, you will less likely see a slowdown and more likely get a segmentation fault that crashes the program. (Pro-tip: multiple inheritance can be really, really bad if someone tries to implement a memory management system to work alongside it. Use composition instead! Composition produces roughly the same compiled code and memory structure with just a little more syntax in the source files, and no memory-offset bugs.)
Topic archived. No new replies allowed.