Size of Union and Struct

Hello,

I want to work out the size of a struct and of a union. Is the reasoning below correct?
For the union, sizeof reports 20, but I expected 24 (20 bytes of data plus 4 bytes of padding).

struct date
{
    unsigned int day; //4
    unsigned int month;//4
    unsigned int year;//4
};
struct  student
{
    int id; //4 + 4 padding
    char* name; //8
    char* family; //8
    date birth; //12  + 4 padding
    // sum is 40
};


union s
{
    int id;//8
    char f[20]; //20 + 4 padding
    char f2[20];//20+ 4 padding
    char f3[20];//20+ 4 padding
};
It looks correct for the structs.

Your mistake with the union was to assume alignof(int) == 8. It is 4 here, and 20 is already a multiple of 4, so the union needs no padding.
In case you missed it: struct padding is not consistent across compilers. You can usually set a compiler flag or pragma to choose the packing, or you can make the padding explicit by adding a dummy byte-array member to the struct (often seen in older code).
#include <iostream>
// #pragma pack(1)

struct test1
{
    short s; // 2 bytes
             // 2 padding bytes
    int   i; // 4 bytes
    long  d; // 4 bytes
    char  c; // 1 byte
             // 3 padding bytes
};

struct test2
{
    int   i; // 4 bytes
    char  c; // 1 byte
             // 3 padding bytes
    long  d; // 4 bytes            
    short s; // 2 bytes
             // 2 padding bytes
};

struct test3
{
    long  d; // 4 bytes
    int   i; // 4 bytes
    short s; // 2 bytes
    char  c; // 1 byte
             // 1 padding byte
};

int main()
{
    const int size1 = sizeof(struct test1);
    const int size2 = sizeof(struct test2);
    const int size3 = sizeof(struct test3);

    std::cout << size1 << "\n" << size2 << "\n" << size3 << std::endl;
    return 0;
}


Hello. Take a look at the code above. The three structures hold the same members, yet their sizes differ. Why? Because the compiler inserts padding to satisfy alignment constraints (data structure alignment). You can minimize the size of a structure by ordering its members from largest alignment to smallest, as in the third structure. However, each compiler may choose to align data differently. There is a good explanation at the link below:

https://en.wikipedia.org/wiki/Data_structure_alignment

On Visual Studio 2022 (x64) I get this output:

16
16
12


Using an online compiler (https://www.onlinegdb.com/online_c++_compiler):

24
24
16


If you are on VS, you can add #pragma pack(1), which removes all padding:

11
11
11
For VS 2022 X64 I get


16
16
12


with default Struct Member Alignment
Geckoo wrote:
Using an online compiler (https://www.onlinegdb.com/online_c++_compiler):

24
24
16

Note that sizeof(long) == 8 with that compiler.
For the OP:

#include <iostream>

struct date
{
	unsigned int day; //4
	unsigned int month;//4
	unsigned int year;//4
	//sum is 12
};

struct  student
{
	int id; //4 + 4 padding = 8
	char* name; //8
	char* family; //8
	date birth; //12 + 4 padding = 16
	// sum is 40
};

union s
{
	int id; //4
	char f[20]; //20
	char f2[20]; //20
	char f3[20]; //20
	// size is 20: the size of the largest member
};

int main() {
	const int size1 = sizeof(struct date);
	const int size2 = sizeof(struct student);
	const int size3 = sizeof(union s);

	std::cout << size1 << "\n" << size2 << "\n" << size3 << std::endl;
	return 0;
}


with VS 2022 as x64:


12
40
20


For info on VS struct padding, see:
https://learn.microsoft.com/en-us/cpp/cpp/alignment-cpp-declarations?view=msvc-170
> Note that sizeof(long) == 8 with that compiler.


That's why depending upon a struct to have a certain size/layout across different compilers is fraught with issues.
If the members of student are reordered to:

struct  student
{
	char* name; //8
	char* family; //8
	date birth; //12
	int id; //4
	// sum is 32
};


then sizeof(student) reduces to 32 with no padding! (for VS2022 x64).
The other 'interesting' aspect of the union: int is 4 bytes, char[20] is 20 bytes, and both share the same storage, so which bytes of char[20] does the int overlap? Consider:

#include <iostream>

union U
{
	unsigned id; //4
	char f[20] {}; //20
	// size is 20: the size of the largest member
};

int main() {
	U u;

	u.id = 0xffffffff;

	for (size_t i {}; i < 20; ++i)
		std::cout << std::hex << unsigned(u.f[i]) << ' ';

	std::cout << '\n';
}


which for VS2022 x64 displays:


ffffffff ffffffff ffffffff ffffffff 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


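which shows that the int 'maps' to bytes 0 - 3 of char[20]. Each byte prints as ffffffff rather than ff because char is signed on this platform, so the value is sign-extended when converted to unsigned.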
> which part of char[20] does it match to?

The standard requires that it must map to bytes 0 - sizeof(int)-1 of the array of bytes.

Each non-static data member is allocated as if it were the sole member of a non-union class.
[Note 2: A union object and its non-static data members are pointer-interconvertible. As a consequence, all non-static data members of a union object have the same address. — end note]
http://eel.is/c++draft/class.union#general-3
https://cplusplus.com/forum/beginner/285212/#msg1238132

Sorry seeplus. I made a mistake writing output. We have the same. Fixed ++
The #pragma pack(x) directive forces a particular packing of structure members, where x is the maximum alignment in bytes. That is why #pragma pack(1) leaves no padding: every member is aligned on a single byte. Exact control over layout is useful when you have a memory-mapped interface to a piece of hardware and need to control precisely where each structure member sits. However, removing padding is not a speed optimization, since most machines are much faster at dealing with aligned data, and, as said previously, compilers handle alignment differently. Briefly:

Why would you use it? To reduce the memory footprint of the structure.
Why should you not use it? It may carry a performance penalty, because some systems work better on aligned data, and some machines fail outright when reading unaligned data. The code is also less portable.

struct __declspec(align(4)) test4 {
    long  d; // 4 bytes
    int   i; // 4 bytes
    short s; // 2 bytes |
    char  c; // 1 byte  |
             // padding |
};


I would like to add some explanation of __declspec(align(x)), which helped me understand how structure alignment works. In the example above the structure has a 4-byte alignment. The members occupy 11 bytes (4 for the long, 4 for the int, 2 for the short and 1 for the char), so the compiler adds one byte of padding at the end and sizeof() gives 12. Add another short (2 bytes) after the char and sizeof() becomes 16: one byte of padding is needed before the new short to align it, and two more at the end to keep the size a multiple of 4. That makes three bytes of padding scattered through the structure.

struct __declspec(align(4)) test4 {
    long  d; // 4 bytes
    int   i; // 4 bytes
    short s; // 2 bytes |
    char  c; // 1 byte  |
             // padding | one byte
    short o; // 2 bytes #
             // padding # two bytes
};


By placing both shorts before the char, all the padding ends up at the end. The size does not change in this case, but the layout is cleaner :)

struct __declspec(align(4)) test4 {
    long  d; // 4 bytes
    int   i; // 4 bytes
    short s; // 2 bytes #
    short o; // 2 bytes #
    char  c; // 1 byte  |
             // padding | 3 bytes
};
Regarding the union, the correspondence between the bytes of the integer and the elements of the char array is also interesting. Consider:

#include <iostream>

union U
{
	unsigned id; //4
	char f[20] {}; //20
	// size is 20: the size of the largest member
};

int main() {
	U u;

	u.id = 0x01020304;

	for (size_t i {}; i < 20; ++i)
		std::cout << std::hex << unsigned(u.f[i]) << ' ';

	std::cout << '\n';
}


which on a little-endian machine (e.g. Intel) displays:


4 3 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0


so the least-significant byte is f[0].
Worth noting that standard C++ has alignas for controlling alignment
https://en.cppreference.com/w/cpp/language/alignas
 
struct alignas(4) test4 { /*...*/ };

Since C++17, new respects alignment requirements imposed with alignas, including over-aligned types.