I need a math wiz to help solve this ^^

> That test misses some very important points

Such as?

> and throws in a red-herring by using the optimizer on a common case.

Optimization is not a red herring in any discussion of the run-time performance of a C++ construct. Ignoring it is naive.

Tip: Cast the address of an int to an unsigned char pointer, like:

int CompileInt(int a, int b, int c, int d) {
int result = 0;
unsigned char * cp = (unsigned char *)&result;
cp[0] = a;
cp[1] = b...

And do the inverse to decompile. You will get the same values from bitshifting, but I think this is easier.
EDIT: Corrected.


No you won't. You'll get the same values as the union, which is a mistake.

@JLBorges
I have no interest in wasting time with you.

> I have no interest in wasting time with you.

A sagacious decision.

Trying to establish that compiler optimisations are irrelevant in assessing run-time performance would be a waste of time.

I never said any such thing, jerk.
I expect you to continue to twist words, so have at it now.

@Duoas, yes, my fault, sorry. Don't you get the same value with a bitshift anyway? And I don't think my idea is a mistake, as long as you make sure you don't go out of bounds. (An int's size is 4 times a char's size; an unsigned char is the same size as a char, which is 8 bits; and a typical int is 32 bits. You can run into problems with endianness, though.)


It depends on your machine's endianness, which might actually be something really weird.

Moschops (7244)

I don't fully understand all this "bit shifting" and "undefined behavior". Could someone explain this to me as if I were a 5 year old?

I'm a little late, but here's some text on undefined behaviour. It's taken from Van der Linden's "Expert C Programming".

undefined— The behavior for something incorrect, on which the standard does not impose any requirements. Anything is allowed to happen, from nothing, to a warning message to program termination, to CPU meltdown, to launching nuclear missiles (assuming you have the correct hardware option installed).

with the following rather neat example:

The original IBM PC monitor operated at a horizontal scan rate provided by the video controller chip. The flyback transformer (the gadget that produces the high voltage needed to accelerate the electrons to light up the phosphors on the monitor) relied on this being a reasonable frequency. However, it was possible, in software, to set the video chip scan rate to zero, thus feeding a constant voltage into the primary side of the transformer. It then acted as a resistor, and dissipated its power as heat rather than transforming it up onto the screen. This burned the monitor out in seconds. Voilà: undefined software behavior causes system meltdown!

Gonna be my new signatures on forums. LOL

closed account (zvRX92yv)

Win quote!

Anyway, say you have a number, 5230.
If you move the digits to the right, once, then it becomes 523, which is 10 times less than before.

That's because decimal is a base-10 number system.

Whereas binary is a base-2 number system, so, when you move it to the right or left, it halves or doubles.

But how the bytes end up in memory depends on the machine's endianness: little-endian stores the least significant byte first (common?) and big-endian stores the most significant byte first.

Easy. xD


> and throws in a red-herring by using the optimizer
> I never said any such thing, jerk.

Ah! So 'red herring' actually meant 'relevant to the issue'.

Glad that could be sorted out, even if it resulted in the introduction of personal abuse into a technical discussion.

@viliml > it isn't undefined behaviour at all!
> "union" means that all of the elements in it occupy the same memory!
> since an int is 4 times bigger than a char array of 4,
> the behaviour is completely defined, as long as you initialize at least one member.

It isn't undefined behaviour because the type involved is a char and the standard explicitly makes a special case:

If a program attempts to access the stored value of an object through a glvalue
of other than one of the following types the behavior is undefined:

... <elided>

--- a char or unsigned char type.
...

This results in well defined behaviour:

#include <fstream>

struct A // 'standard layout' struct
{
    int i ;
    char c ;
    double g ;
};

struct B // 'standard layout' struct
{
    int j ;
    char d ;
    long long h ;
};

union U
{
    A a ;
    B b ;
    char data[ sizeof(A) ] ;
};

void foo( A& aa, U& uu, std::fstream& stm )
{
    uu.a = aa ;

    // accessing the stored value of an object through an lvalue of type
    // char or unsigned char is well defined.
    stm.seekp(0) ;
    stm.write( reinterpret_cast<char*>( &aa ), sizeof(aa) ) ; // well defined
    stm.seekp(0) ;
    stm.write( uu.data, sizeof(A) ) ; // well defined
    stm.seekg(0) ;
    stm.read( uu.data, sizeof(A) ) ; // well defined because A is a 'standard layout' type

    // well defined because A and B are 'standard layout' types
    // with a common initial sequence
    B& bb = uu.b ;
    int i = bb.j  ; // well defined
    char c = bb.d ; // well defined
}


OK, I'll waste a little fun. Thanks for being so honest.
http://www.youtube.com/watch?v=HAe3FpLGvBY

cire (8284)

I can see where you might interpret it that way, JLBorges, and if the standard made an explicit exception for the legality of accessing a non-active member, as it does in 9.5 for POD-structs that contain a common initial sequence, I would subscribe to that interpretation. But it doesn't make accessing a non-active member legal if it's of type char, unsigned char, or an aggregate of char.

Your quote is describing part of the definition and behavior of lvalues and rvalues. There is no mention of an exception to the way union members may be accessed.


> Your quote is describing part of the definition and behavior of lvalues and rvalues.

My quote is from the part that specifies what kinds of aliasing are permitted under the standard:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
... <elided>
The intent of this list is to specify those circumstances in which an object may or may not be aliased.

> There is no mention of an exception to the way union members may be accessed.

Contrary to what has been stated in this thread, the standard does not have any words to this effect: "You can't write to one member of a union and read from another." It does say:

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.

How that data member can be accessed is subject to the aliasing rules that were already specified earlier in the standard. These aliasing rules didn't cover one special case, which is specially stated:

If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.

If standard layout types are involved,

union U // standard layout union
{  
    A a ; // A is a standard layout type
    B b ; // B is a standard layout type
};

and the union currently contains an object of type A, it is permissible to inspect the member b provided it is permissible that an object of type A may be aliased through a glvalue of type B. In particular, the object representation of any object of a standard layout type can be treated as an array of char:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [ Example:

#define N sizeof(T)
char buf[N];
T obj;
// obj initialized to its original value
std::memcpy(buf, &obj, N); // between these two calls to std::memcpy, obj might be modified
std::memcpy(&obj, buf, N); // at this point, each subobject of obj of scalar type holds its original value

— end example ]

This program is well-formed:

int main()
{
    // the dynamic type of j is the same as the dynamic type of i
    { union U { int i ; int j ; } ; U u ; u.i = 7 ; int v = u.j ; }

    // the dynamic type of j is a cv-qualified version of the dynamic type of i
    { union U { int i ; volatile int j ; } ; U u ; u.i = 7 ; int v = u.j ; }

    // the dynamic type of j is an unsigned type corresponding to the dynamic type of i
    { union U { int i ; unsigned int j ; } ; U u ; u.i = 7 ; unsigned int v = u.j ; }

    // the dynamic type of j is a cv-qualified unsigned type corresponding to the dynamic type of i
    { union U { int i ; volatile unsigned int j ; } ; U u ; u.i = 7 ; unsigned int v = u.j ; }

    // the dynamic type of j[0] is char
    { union U { int i ; char j[sizeof(int)] ; } ; U u ; u.i = 7 ; int v = u.j[0] ; }

    // the dynamic type of j is similar to the dynamic type of i
    // (as defined by permissible qualification conversions)
    { union U { int* i ; const int* j ; } ; int v = 7 ; U u ; u.i = &v ; int v2 = *u.j ; }
}

Note: These permissible aliasing rules are popularly called 'strict' aliasing. From the gcc manual:

-fstrict-aliasing
Allow the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type.


cire (8284)

If this is accurate, it would seem to be legal in C++11 (but not before). I've been using the final working draft (N3337), which differs:

If a standard-layout union contains several standard-layout structs that share a common initial sequence, and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members.


viliml (791)

So, is it defined or undefined behaviour? I don't see any reason why writing into one member of a union and then reading another would be undefined behaviour if they occupy the same memory, or if the one you write into is bigger (in memory size) than the one you read from!