Clarification on Unions

I was looking for a bit of guidance on when to use unions. The way I've understood it from reading my book is that you should just use unions to store variables to save space (as they share the same address). To be honest, this doesn't sound right, hence why I'm asking.
A union is used where a record can have different kinds of content in different circumstances, or where different interpretations of the same underlying data are used.

For example, imagine you wrote a calculator that had variables that could either be a string or a number. You could use a union to hold the data, but interpret it differently depending on what kind it actually is.
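
A minimal sketch of that calculator idea, assuming a hypothetical `Variable` type and helper functions (none of these names come from a real library). A tag records which union member is live, and readers check the tag before touching the union:

```cpp
#include <cassert>
#include <string>

// Hypothetical calculator variable: either a number or a piece of text.
struct Variable
{
    enum Kind { NUMBER, TEXT };
    Kind kind;              // records which union member is currently valid

    union
    {
        double      number; // valid when kind == NUMBER
        const char* text;   // valid when kind == TEXT (pointer is not owned)
    };
};

Variable make_number(double d)
{
    Variable v;
    v.kind = Variable::NUMBER;
    v.number = d;
    return v;
}

Variable make_text(const char* s)
{
    Variable v;
    v.kind = Variable::TEXT;
    v.text = s;
    return v;
}

// Describe a variable, reading only the member that the tag names.
std::string describe(const Variable& v)
{
    if (v.kind == Variable::NUMBER)
        return "number: " + std::to_string(v.number);
    assert(v.kind == Variable::TEXT);
    return std::string("text: ") + v.text;
}
```

The union only saves space here; it is the tag that makes the code safe, because every read goes through a check of `kind` first.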

One actual instance of where a union is used is in the storage of an IPv4 address, in_addr. The union allows you to set the bytes individually or the entire thing.

typedef struct in_addr
{
    union
    {
        struct { u_char s_b1,s_b2,s_b3,s_b4; } S_un_b;
        struct { u_short s_w1,s_w2; } S_un_w;
        u_long S_addr;
    } S_un;
} IN_ADDR;
FWIW... C++ and unions don't mix well; you should avoid using them if possible.
    union
    {
        struct { u_char s_b1,s_b2,s_b3,s_b4; } S_un_b;
        struct {  u_short s_w1,s_w2; } S_un_w;
        u_long S_addr;
    }


You should never do this. People think this is a great idea but it's bad because there's no guarantee that memory is laid out the way you expect -- the compiler is free to insert padding between members.

You should only ever access the union member which was last written. So if you modify S_addr -- then S_addr is the only member you can read. The others can be considered corrupt and unreliable.
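
If the goal is just to look at the individual bytes of an address without reading an inactive union member, one option is to copy the bytes out with std::memcpy. A rough sketch, using the made-up names `Octets`, `split_address` and `join_address`; which octet lands first still depends on the machine's byte order:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper type: the four bytes of a 32-bit address, in memory order.
struct Octets
{
    std::uint8_t b[4];
};

// Copy the bytes of `addr` into an array instead of reading them through
// a different union member. The bytes come out in whatever order the
// machine stores them, so this is not endian-independent.
Octets split_address(std::uint32_t addr)
{
    static_assert(sizeof(Octets) == sizeof(std::uint32_t), "size mismatch");
    Octets out;
    std::memcpy(out.b, &addr, sizeof addr);
    return out;
}

// Inverse of split_address: rebuild the 32-bit value from its bytes.
std::uint32_t join_address(const Octets& o)
{
    std::uint32_t addr;
    std::memcpy(&addr, o.b, sizeof addr);
    return addr;
}
```

Splitting and then joining always gives back the original value, whatever the byte order is.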

jsmith is right... unions don't have much of a place in C++. They are an alternative way to mimic a crude inheritance structure:

struct Parent
{
  enum type_t
  {
    TYPE_A,
    TYPE_B
  };
  type_t type;   // type to determine which union member to use

  struct ChildA { /*...*/ };
  struct ChildB { /*...*/ };

  union
  {
    ChildA a;   // valid when type == TYPE_A
    ChildB b;   // valid when type == TYPE_B
  };
};


Of course with C++ inheritance and downcasting abilities, such a design is unnecessary.
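
For comparison, a rough sketch of the inheritance version. The members of `ChildA` and `ChildB` are placeholders (the example above leaves them elided), and `is_child_a` is a made-up helper. The type tag is replaced by the object's dynamic type, and dynamic_cast does the checked downcast:

```cpp
struct Parent
{
    virtual ~Parent() {}   // a virtual function makes the type polymorphic
};

struct ChildA : Parent
{
    int a_value;           // placeholder member, not from the original post
    explicit ChildA(int v) : a_value(v) {}
};

struct ChildB : Parent
{
    double b_value;        // placeholder member, not from the original post
    explicit ChildB(double v) : b_value(v) {}
};

// Returns true if `p` really is a ChildA, using a checked downcast
// instead of a hand-maintained type tag.
bool is_child_a(const Parent& p)
{
    return dynamic_cast<const ChildA*>(&p) != nullptr;
}
```
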
It is non-portable, yes. But I don't agree these facilities should never be used. And it should be said that this was about explaining the feature.

C++ is a toolbox. All its features are there for a good reason born of experience. Admittedly, this one was inherited from C, but one might imagine it was introduced into C for a reason.

Not all C++ environments are object-oriented. Not all C++ environments are large enough to accommodate prodigious use of the heap. Remember, C++ is for tiny environments too.

[EDIT]
And if you want to ban union, you should also ban reinterpret_cast as the criticisms are the same.
And it should be said that this was about explaining the feature.

C++ is a toolbox, all present features


No no... you misunderstand. It's not that it's not portable... it's that it's not standard. What you explained was not a language feature, it just so happens that it tends to work with compilers more often than not. There's no guarantee that a 100% standards compliant compiler won't explode with that code.

admittedly, this one was inherited from C, but one might imagine it was introduced into C for a reason.


I don't even think unions can safely be used as you described in C. I'm not 100% on that though.

AFAIK, their intended purpose was more for something like I described. Who really knows, though.
Unions were added to the language to optimize memory usage.

You can safely do

union {
    uint32_t ip;
    uint8_t   octets[ 4 ];
};


since all "members" will be positioned in memory at the same starting address.

(I specifically used the above types to avoid size of datatype issues).

The real problem with unions that I find (and will be going away in C++0x IIRC)
is that you cannot put any types inside unions that have constructors. That
leaves out std::string, all STL containers, and pretty much any useful class/structs
that the user declares unless the user does not provide any constructors AND the
type does not contain any members which have constructors.

That goes directly against the RAII "rule".
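
For what it's worth, here is roughly what this looks like once the restriction is lifted (C++11 did end up allowing union members with constructors). This is a sketch with a hypothetical `Value` type, not a recommended design. The union does not manage the member's lifetime for you, so the enclosing struct has to construct and destroy the std::string by hand:

```cpp
#include <new>
#include <string>

// Hypothetical tagged value holding either a double or a std::string.
// Requires C++11 or later.
struct Value
{
    enum Kind { NUMBER, TEXT };
    Kind kind;

    union
    {
        double      number;
        std::string text;   // non-trivial member: lifetime is managed manually
    };

    explicit Value(double d) : kind(NUMBER), number(d) {}

    explicit Value(const std::string& s) : kind(TEXT)
    {
        new (&text) std::string(s);   // placement new starts the string's lifetime
    }

    // Copying would need to inspect `kind`; disabled to keep the sketch short.
    Value(const Value&) = delete;
    Value& operator=(const Value&) = delete;

    ~Value()
    {
        if (kind == TEXT)
            text.~basic_string();     // explicit destructor call ends it
    }
};
```

So the feature becomes usable, but the RAII bookkeeping moves into the wrapper: forget the destructor call and the string leaks.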

You can safely do
[snip]
since all "members" will be positioned in memory at the same starting address.


Not according to the e-books I've read. They say specifically that you should never read any union member other than the one last written to.

Can anyone with a copy of the language standard chime in with a quote/reference with something on point so we can have a definitive word?
I agree with you both on both points, RAII and portability across environments and possibly compilers. User defined types aside, unions are the same in C++ as they are in C. Plus there's a packing issue.

But I disagree that they shouldn't be used.

reinterpret_cast is no nicer, no more portable and simply subverts the type system albeit in a controlled manner.

There are bit-fields too, non-portable and so on.

union does not stand alone as an anathema.
@Disch:

Yes, in the general case that is definitely true. Given

union {
    float f;
    int    x;
};


on a 32-bit machine, if you assign a value to x, you should not trust the value of f; it could even be NaN at that point.
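
To see how an integer bit pattern can come out as a NaN without reading the inactive union member, the bits can be copied with std::memcpy instead. A sketch assuming a 32-bit IEEE 754 float (both assumptions are checked with static_assert); `bits_to_float` is a made-up helper name:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 floats");

// Reinterpret a 32-bit pattern as a float by copying its bytes.
float bits_to_float(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under those assumptions, the pattern 0x3F800000 is 1.0f, while 0x7FC00000 is a quiet NaN, so an int that was stored for unrelated reasons can easily be meaningless as a float.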
