const_cast issue

Aug 26, 2009 at 10:21pm

This program, compiled in Xcode (gcc 4.0):

#include <iostream>
using namespace std;

int main()
{
    const int i = 10;
    int *iP;
    iP = const_cast<int *>(&i);
    *iP = 20;
    cout << "address " << iP << " contains value: " << *iP << endl;
    cout << "address " << &i << " contains value: " << i << endl;
    return 0;
}

produces this output:

address 0xbffff568 contains value: 20
address 0xbffff568 contains value: 10

However, when I debug it, I see i go from 10 to 20. But it prints 10, as you can see. The same address containing two different values?! I don't think so!

So is this a compiler optimization at play here, where the compiler is "hard compiling" the i to be 10 even though I'm using const_cast to cast away constness on the pointer to i to "sneak around" and change i? It seems to me that this should fail at compile/run time or i should change. Something isn't right in Denmark.

Is this a compiler bug? Am I misusing const_cast?

Aug 26, 2009 at 10:33pm

helios (17607)

It seems i is an alias for the literal 10, while at the same time being a variable in memory. Referencing i always gives 10 because its interpretation as an alias is done before its interpretation as a variable. Accessing the value by other means gives the real value. Try *(int *)&i.

Last edited on Aug 26, 2009 at 10:34pm

Aug 26, 2009 at 10:49pm

Jacko (12)

Helios,

Using:
cout << "address " << &i << " contains value: " << *(int *)&i << endl;

in place of:
cout << "address " << &i << " contains value: " << i << endl;

indeed worked. Why?

It says cast the address of i to an int* and then dereference the resulting int* to get i, right? I have no idea why that works.

Aug 26, 2009 at 11:10pm

Jacko (12)

This also works, using a C++-style cast in place of the C-style cast:

cout << "address " << &i << " contains value: " << *(const_cast<int *>(&i)) << endl;

Still not sure why.

And it seems to me the original problem I submitted represents a bug, i.e., the compiler should either (1) allow changing the const via the pointer or (2) report an error at compile-time or throw an exception at run-time.

Am I wrong? Or is this just another weirdness of C++, a design decision?

Aug 26, 2009 at 11:21pm

Jacko (12)

But maybe, Helios, it's like you said: it's a matter of how i is "interpreted", and there are rules in the C++ specification about this, I'm guessing. If you are accessing const int i in the raw, it's an alias. But once you start asking for the address of i, the interpretation is that i is a variable.

Though...these didn't work:
cout << "address " << &i << " contains value: " << *(&i) << endl;
cout << "address " << &i << " contains value: " << *(static_cast<const int *>(&i)) << endl;

In summary:

cout << "address " << iP << " contains value: " << *iP << endl;
cout << "address " << &i << " contains value: " << i << endl;
cout << "address " << &i << " contains value: " << *(&i) << endl;
cout << "address " << &i << " contains value: " << *(int *)&i << endl;
cout << "address " << &i << " contains value: " << *(const_cast<int *>(&i)) << endl;
cout << "address " << &i << " contains value: " << *(static_cast<const int *>(&i)) << endl;

yielded:

address 0xbffff568 contains value: 20
address 0xbffff568 contains value: 10
address 0xbffff568 contains value: 10
address 0xbffff568 contains value: 20
address 0xbffff568 contains value: 20
address 0xbffff568 contains value: 10

Aug 26, 2009 at 11:26pm

helios (17607)

Like I said, i is the int alias. The compiler will replace its appearances at compile time with the integral literal 10. However, the compiler still needs to allocate the variable on the stack in case someone wants to access it with a pointer or something. This variable isn't constant (obviously), and there's nothing that can prevent you from writing to it at run time, since it's in a writable part of memory (the stack).

So, i is always replaced at compile time with 10. &i gives the address of the variable. (int *) casts the pointer to a non-const 'int *'. * dereferences that pointer, therefore giving back the real value of the variable created on the stack.
The cast to (int *) is necessary, otherwise the compiler can figure out that *&i is the same as i, and again replace it with 10.
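
The explanation above can be pictured with a small model. This is only a sketch: the original program has undefined behavior, so the code below does not reproduce it. It mimics, with well-defined code, what gcc 4.0 appeared to do. The names `folded_value`, `stack_slot`, and `model_of_original_program` are made up for illustration.

```cpp
#include <utility>

// Model only: imitates the observed behavior of the original program
// using well-defined code. It is not the compiler's real output.
// Returns {value printed for "i", value printed for "*iP"}.
std::pair<int, int> model_of_original_program()
{
    const int folded_value = 10;  // what every plain use of i was replaced with
    int stack_slot = 10;          // the storage allocated because &i was taken

    int *iP = &stack_slot;        // stands in for const_cast<int *>(&i)
    *iP = 20;                     // the write lands in the storage only

    return std::make_pair(folded_value, *iP);
}
```

In this model the two "views" of i are simply two different things, which is why the same address seemed to hold both 10 and 20.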

Aug 26, 2009 at 11:35pm

Jacko (12)

So it's all about what optimizations the compiler can generate? If the compiler can figure out that something *shouldn't* change, it will proceed to generate machine code accordingly, assuming it doesn't change. So you have to trick the compiler? Couldn't a compiler figure out the (int *) cast trick too? Where does this end, playing chess with the compiler developers? When do you know for sure your code will work, and will continue to work when a new compiler comes out? God I hope there's no code like this in our missile silos!

And...isn't the fact that the address of i, and the contents of i don't match up...a B-U-G? <the crowd hushes>

In one line of code (the original cout line for i), &i and the contents of i don't jibe!!

Aug 26, 2009 at 11:51pm

helios (17607)

Couldn't a compiler figure out the (int *) cast trick too?

No. Once you tell it a pointer points to a non-const, that's it. It can no longer be sure what the "real" type of the pointer is.

You should keep in mind that it's never a good idea to cast a const T * to a T *. The compiler can only enforce write permissions through the type system when the constness of the pointer target remains always the same.

And...isn't the fact that the address of i, and the contents of i don't match up...a B-U-G?

No. And shut up. You're the one who cast a const T * to a T *.
The compiler is still doing what it should. You told it that i will remain with a constant value of 10. If you then go around its back and forcefully change the value, that's a different thing.
For example, did you know that it's possible to make a reference point to something else? It's even more hackish than this, but it's possible. Does that mean you should do it? Of course not.

Here's a relevant excerpt from The C++ Programming Language:

Depending on how smart it is, a compiler can take advantage of an object being a constant in several ways. For example, the initializer for a constant is often (but not always) a constant expression (§C.5); if it is, it can be evaluated at compile time. Further, if the compiler knows every use of the const, it need not allocate space to hold it. For example:

const int c1 = 1;
const int c2 = 2;
const int c3 = my_f(3); // don't know the value of c3 at compile time
extern const int c4; // don't know the value of c4 at compile time
const int *p = &c2; // need to allocate space for c2

Given this, the compiler knows the values of c1 and c2 so that they can be used in constant expressions. Because the values of c3 and c4 are not known at compile time (using only the information available in this compilation unit; see §9.1), storage must be allocated for c3 and c4. Because the address of c2 is taken (and presumably used somewhere), storage must be allocated for c2. The simple and common case is the one in which the value of the constant is known at compile time and no storage needs to be allocated; c1 is an example of that. The keyword extern indicates that c4 is defined elsewhere (§9.2).

It is typically necessary to allocate store for an array of constants because the compiler cannot, in general, figure out which elements of the array are referred to in expressions. On many machines, however, efficiency improvements can be achieved even in this case by placing arrays of constants in read-only storage.
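
A minimal sketch of the "can be used in constant expressions" point from the excerpt. The body of `my_f` is invented here (the book does not define it), and `table` and `table_length` are hypothetical names.

```cpp
#include <cstddef>

int my_f(int x) { return x + 1; }   // hypothetical stand-in for the book's my_f

const int c1 = 1;
const int c2 = 2;
const int c3 = my_f(3);             // initialized by a function call, so not a constant expression

// c1 and c2 are constant expressions, so they can size an array.
int table[c1 + c2];

// int bad[c3];                     // not valid standard C++: c3 is not a constant expression

// Number of elements in table, computed from its size.
std::size_t table_length()
{
    return sizeof(table) / sizeof(table[0]);
}
```

Because the array bound has to be known at compile time, this shows that the compiler really does treat c1 and c2 as the literals 1 and 2.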

Aug 27, 2009 at 3:11am

Jacko (12)

Hey now! First, I hope the "shut up" was meant in the funny way teenagers say it! ;-)

I also know it's not a good idea to do what I'm doing - I'm creating a const int and almost immediately de-const'ing it and changing it for crying out loud! I'm just trying to learn more about const_cast and what it can do, and I immediately came across this phenomenon of seeing that something and *(&something) are not always the same!

I'm also aware a const doesn't necessarily need storage, as in something simple like:

const int x = 77;
int y = x * 2; // only other place x is used, say

Anyway, this is where I always have trouble with C++: it's not just that you have to think like a compiler, but that you have to know which inconsistencies are allowed and which are not, which rule trumps which other rule(s). (In this example, the inconsistency between *(&i) and i being allowed in the final analysis, vs. the compiler allowing a change to a const through a pointer.) Everything you're saying is making sense, but I don't always know how to get there on my own. (There goes one Mensa membership down the drain!)

So, the compiler has access to all my code, and it's all in the same scope, main(). Couldn't a really smart compiler see that I'm accessing both i and &i and decide that that consistency must be maintained, and provide an error or warning at compile-time, or just outright disallow the optimization and make the cout statement actually look up i instead of trusting it's 10? Or that I'm const_cast'ing a pointer to it to a non-const int* and it is therefore open to being changed? Why does const reign supreme - that the compiler can optimize this as a literal, even though it can see, in the same scope, that it is being changed? (I think fully "getting" constness is something required to enter The C Plus Plus Club.) Is that just a spec'd behavior of C++?

I don't always know how to think like a C++ compiler. Sometimes it seems it would be easier to program straight in assembler, as, as hard as it is, there's no guesswork about which rule is going to trump whatever other rule. You're just moving things from registers and memory locations and so forth. Very mechanical. But I'm being facetious (sort of!). The abstraction that C++ provides is tremendously helpful, of course. But sometimes it's like driving a really high-performance sports car that requires you to climb on the hood and change the spark plugs while driving. Or like a tool that requires three hands to hold it. The tool can be cumbersome. And sometimes it can be harder working the tool itself than the problem the tool is being used to solve. (Java is great from that point of view: no pointers, all objects live on the heap and die automatically and everything but primitives are references. So you can concentrate on the problem you're solving, albeit with lesser performance than C++, no doubt. Good to know both C++ and Java.)

Let me let this sink in overnight. Maybe it will gel. Thanks.

By the way, how do you change what a reference references? I'd like to know it! (Not to use it, but just to know it.)

Aug 27, 2009 at 3:31am

jsmith (5804)

Well, by declaring a variable "const" you are explicitly allowing the compiler to make optimizations under the assumption that the variable will not change values. In this particular case, the compiler saw in the cout that you were outputting a const, so rather than read a memory location for the value, it "knew" what the value should have been.

You have to be very careful with const_cast for these reasons. (Usually the only times I need it are when I'm interfacing to a const incorrect function/library.)
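
As a rough illustration of the const-incorrect library case: the "legacy" function below is invented for this sketch, as is the wrapper name `count_spaces`. The cast is only safe under the stated assumption that the legacy function never writes through its argument.

```cpp
#include <cstddef>
#include <string>

// Hypothetical legacy function: it only reads its argument, but its
// signature forgot the const.
std::size_t legacy_count_spaces(char *text)
{
    std::size_t count = 0;
    for (; *text != '\0'; ++text)
        if (*text == ' ')
            ++count;
    return count;
}

// Const-correct wrapper. The cast is acceptable only because we know
// the legacy function never writes through the pointer.
std::size_t count_spaces(const std::string &s)
{
    return legacy_count_spaces(const_cast<char *>(s.c_str()));
}
```

Here const_cast only removes constness to satisfy a signature; nothing that was declared const is ever modified.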

EDIT: 2 additional things.

First, normally, const int x = 77; would allow the compiler to just use the constant 77 without the need to even allocate memory for the "variable". However, by taking the address of x, it forces the compiler into allocating memory for x, and since the compiler does not know if you will dereference it at any point, it must put 77 there.

So this explains the behavior you saw.

Second, when you are wondering what rule trumps what rule, you should remember that compilers are dumb. The answer to your "really smart compiler...." question is no. When compiling a line of code inside a function, the compiler does not remember what you've done previously in the function, nor does it know what you will do on subsequent lines of code: it only knows what you are doing on the line of code it is compiling.

Last edited on Aug 27, 2009 at 3:47am

Aug 27, 2009 at 3:54am

helios (17607)

First, I hope the "shut up" was meant in the funny way

It was meant in a funny way.

Couldn't a really smart compiler see that I'm accessing both i and &i and decide that that consistency must be maintained, and provide an error or warning at compile-time

Getting a pointer to a const is too common. The warning would come up too often and be ignored.
What could produce a warning could be a const T * being cast to a T *, be that implicit or explicit. Of course, C++ is designed to let you shoot yourself in the foot if you really want to. If you're explicitly doing the cast, it's because you know what you're doing.

or just outright disallow the optimization and make the cout statement actually look up i instead of trusting it's 10?

That could be done, but again, you told the compiler that the value of i wouldn't be changing. You're the one who broke the contract. The compiler is free to optimize this if it wants.

even though it can see, in the same scope, that it is being changed?

It can't see it. Once you get the pointer to the variable/const, the compiler can no longer tell what's going on. Pointers can be used in so many different ways, the compiler doesn't even bother.

By the way, how do you change what a reference references? I'd like to know it! (Not to use it, but just to know it.)

Don't stare at it for too long or you might get a headache: http://www.cplusplus.com/forum/beginner/3958/page2.html#msg18630
It only works when compiled by VC++. I haven't tested it in different versions, so it may even only work when compiled by VC++ 9.0 (2008).
Also read the thread from the beginning. The discussion is pretty interesting. My opinion on references has improved a bit since then. I don't hate them anymore. I don't love them, but I don't hate them.

Last edited on Aug 27, 2009 at 3:56am

Aug 27, 2009 at 4:01am

Jacko (12)

Couldn't a given compiler not do this optimization in version, say, 3.0 (i.e., look-up i in memory rather than "hard-compile" it as "10"), and then institute this optimization in version 3.5, making the code act differently than the developer intended or expected? If the answer is yes, I guess you'd say that was the developer's fault? If the answer is no, is it that it is indeed spec'd that a const NOT be evaluated by inspecting memory, but by hard-compiling it as 10?

I wonder what other scenarios allow for different results in the behavior in a compiled program for the same source code using different compilers. Sounds dangerous. And if these ambiguities exist, I wonder if there's a way to code that is the most conservative, to avoid anomalies from one platform to another. In the case of the simple program I wrote above, if Compiler A hard-compiled i as 10, and Compiler B looked up the value of i in the cout statement in memory, despite its const-ness, (provided those two interpretations of C++ compiler specification are equally valid) I gotta think about how one would code it in the most conservative way to avoid different behaviors when compiled with A and B. Not sure...

By the way, having Compilers A and B is not unusual, right? I've worked in environments where the same source code was used in Xcode, Visual Studio, and various compilers on Unix and Linux. I guess there are always preprocessor directives to account for differences. But that assumes you're catching all the issues before your customer does.

Aug 27, 2009 at 4:25am

helios (17607)

It could, but if it happens and your code breaks, it's because you're not following the standard as you should and it's your own fault. If you are following the standard and it still breaks then yes, it's a compiler bug.

I wonder what other scenarios allow for different results in the behavior in a compiled program for the same source code using different compilers.

The source of most of them is between the screen and the chair.
Some of them aren't (entirely) the programmer's fault:

int a=10;
std::cout <<*(char *)&a<<std::endl;

The result of this can't be known without knowing the platform's endianness. At the same time, the result of this can be used to figure it out.
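
A minimal sketch of using that idea to detect endianness. The function name `is_little_endian` is made up, and the sketch assumes an unsigned int is wider than one byte.

```cpp
#include <cstring>

// Returns true when the least significant byte of an unsigned int is
// stored at the lowest address (little-endian).
bool is_little_endian()
{
    const unsigned int probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);   // read the byte at the lowest address
    return first_byte == 1;
}
```

In the two-line snippet above, a little-endian machine finds the byte 10 at the lowest address, so it prints the character with code 10, which is a newline.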

And if these ambiguities exist, I wonder if there's a way to code that is the most conservative

Again, by following the standard.
Apart from that, Mozilla wrote a list of things to avoid when writing C++. Most of them are related to incomplete implementations of the standard, including, for example, the keyword export.

By the way, having Compilers A and B is not unusual, right? I've worked in environments where the same source code was used in Xcode, Visual Studio, and various compilers on Unix and Linux. I guess there are always preprocessor directives to account for differences. But that assumes you're catching all the issues before your customer does.

Conditional compilation is done mainly for these reasons:
1. The code will be compiled for different OSs and you need to interface with different APIs.
2. You need to account for low-level differences, such as endianness.
3. You're testing different versions of a piece of code to, for example, find the fastest.
4. You want to enable/disable features at compile time based on switches.
5. (Rare) You need to account for compiler bugs or lack of compliance. For example, MinGW has a non-compliant swprintf().
6. The source may be compiled or included by a C or a C++ compiler.
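
A few of the reasons in the list above might look like this in practice. This is only a sketch: `platform_name`, `logging_enabled`, `shared_add`, and the `ENABLE_LOGGING` macro are hypothetical names, while `_WIN32`, `__APPLE__`, `__linux__`, and `__cplusplus` are macros commonly predefined by compilers.

```cpp
#include <string>

// Reason 1: choose code per operating system.
std::string platform_name()
{
#if defined(_WIN32)
    return "windows";
#elif defined(__APPLE__)
    return "apple";
#elif defined(__linux__)
    return "linux";
#else
    return "unknown";
#endif
}

// Reason 4: a feature switched on at build time, for example by
// passing -DENABLE_LOGGING to the compiler (hypothetical macro).
bool logging_enabled()
{
#ifdef ENABLE_LOGGING
    return true;
#else
    return false;
#endif
}

// Reason 6: a declaration that both a C and a C++ compiler can read.
#ifdef __cplusplus
extern "C" {
#endif
int shared_add(int a, int b);
#ifdef __cplusplus
}
#endif

// Definition; it keeps the C linkage given by the declaration above.
int shared_add(int a, int b) { return a + b; }
```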

Last edited on Aug 27, 2009 at 4:26am

Aug 27, 2009 at 4:47am

Jacko (12)

OK, so this is something I don't understand: suppose you're coding a program like I wrote above and you're expecting that the compiler may optimize it, and that's what you actually want. So you write it, compile it, and test it and it turns out it indeed does optimize it and it prints out 10. Then, it's compiled using another compiler that doesn't optimize and it breaks, printing out 20. Whose fault is that? The coder tested it, expecting the optimization. And the compiler doesn't have to optimize it, right?

Also, jsmith, you say compilers are dumb, doing things line by line, with each line inspected in isolation. But at least type, existence of variables, etc. is carried from line to line, right? A compiler expects a return statement in a non-void function. Members of structs and classes need to be checked against their definitions, line by line. So there is some level of inspection beyond lines in isolation. I'd like to better understand this. Any resource you suggest?

Aug 27, 2009 at 12:38pm

jsmith (5804)

To your first question: the programmer is at fault, because using const_cast and then modifying an object that was declared const falls into the realm of "undefined behavior", where any result is permitted. The standard does, in places, also allow for implementation-defined behavior. The real answer is that a programmer should never rely on the compiler to make or not make a particular optimization unless said optimization can be controlled via command-line option.
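
To make the boundary concrete, here is a small sketch of when writing through a const_cast pointer is well-defined and when it is not. The helper names are hypothetical, and the undefined case appears only as a comment so that it is never executed.

```cpp
// Hypothetical helper: writes through a pointer-to-const by casting
// the constness away. Only valid if the pointed-to object is not
// itself declared const.
void set_through_const_pointer(const int *p, int value)
{
    *const_cast<int *>(p) = value;
}

int well_defined_case()
{
    int n = 10;                         // n is NOT const
    set_through_const_pointer(&n, 20);  // fine: the object is modifiable
    return n;                           // 20
}

// Undefined behavior, shown only as a comment:
//     const int i = 10;
//     set_through_const_pointer(&i, 20);  // i itself is const
```

The cast is the same in both cases; what differs is whether the object being written was declared const in the first place.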

To your second question: you are right. Compilers are great at syntax checking and some minimal semantic checking. But in general, as was mentioned, the compiler cannot follow semantics through pointers. I suggest reading up on compiler construction.
