Bit manipulation

The size of this class is supposed to be 4 bytes:

#include <iostream>

struct PPN { // R6000 Physical Page Number
    unsigned int PFN : 22; // Page Frame Number
    int : 3; // unused
    unsigned int CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    system("pause");
    return 0;
}


But I get 8 as the output. Why, please?
There is no guarantee that an int is 4 bytes.
If you want to be certain about the size of a data type, use the fixed-width types from <cstdint>.
#include <iostream>
#include <cstdint>

struct PPN { // R6000 Physical Page Number
    std::uint32_t PFN : 22; // Page Frame Number
    std::int32_t : 3; // unused
    std::uint32_t CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    system("pause");
    return 0;
}

Output 4
Which OS/Compiler?
Though I'm guessing Windows since you have a system(pause) in there.

https://coliru.stacked-crooked.com/a/4443c399418fd5af gives 4
http://cpp.sh/6cek2 gives 4
My "g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0" gives 4

Also, note that bit-fields are not portable.
1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX1

On some systems, your 'global' bit might map to bit 0 and on others it might map to bit 31.

If you're manipulating bits in some hardware register, then & | >> << are the only way to go if you want portable code.
Thank you both.
Yes, those two compilers show 4 while my VS 2022 on Windows shows 8!
This gives a size of 4 with VS2022:

struct PPN { // R6000 Physical Page Number
	uint32_t PFN : 22; // Page Frame Number
	uint32_t : 3; // unused
	uint32_t CCA : 3; // Cache Coherency Algorithm
	uint32_t nonreachable : 1;
	uint32_t dirty : 1;
	uint32_t valid : 1;
	uint32_t global : 1;
};


Look up how to use this with Visual Studio:

#pragma pack(push, 1)
That produces a size of 5...
Thank you.
The other question is how to print the size of each data member in the construct, please.

For instance, something like:
PPN pn;
std::cout << sizeof pn.CCA << '\n';

which doesn't work and says: error: invalid application of 'sizeof' to a bit-field
Do you mean how many bits have been specified for each field? I don't believe you can get that directly.
You can work around it: make a temporary object, set the field to -1, and take a rounded log2 of the result, or something similar.
But you already know the widths, since you defined them. Store them somewhere if you need them back.

sizeof reports bytes. It is not defined for bit-fields because a size in bytes makes no sense for a 3-bit value.

Would a std::bitset be of any use to you?
I manipulated the code and removed some bits, for some purpose, but the online compiler still shows 4 as output!
https://coliru.stacked-crooked.com/a/8ae4286d2a6bdeff
#include <cstdint>
#include <iostream>

struct PPN { // R6000 Physical Page Number
	uint8_t CCA : 3; // Cache Coherency Algorithm
	uint8_t nonreachable : 1;
	uint8_t dirty : 1;
	uint8_t valid : 1;
	uint8_t global : 1;
};

int main() {
	PPN pn;
	std::cout << sizeof pn << '\n';
}


gives 1 for VS2022.

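When the type changes (i.e. int to bool (1 byte)) there is a new byte/word etc. started. At least, that is what VS2022 does; bit-field layout is implementation-defined, which is why the other compilers above report 4 for the mixed-type version. That's why having the same type for all of the bit-fields gives the expected result: uint32_t gives 4 and uint8_t gives 1.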

See https://coliru.stacked-crooked.com/a/796bbabadba85630

If you want a result of 32 bits then use uint32_t for all. If you want 16 bits then uint16_t for all and for 8 bits use uint8_t for all.
When the type changes (ie int to bool (1 byte)) there is a new byte/word etc started.
So here we should've had 5 bytes (int: 4, bool: 1)!
#include <iostream>

struct PPN { // R6000 Physical Page Number
    unsigned int PFN : 22; // Page Frame Number
    int : 3; // unused
    unsigned int CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    system("pause");
    return 0;
}
Poke-a-hole, pahole(1), is a tool that reports the layout of structures as they appear in a binary. See its manual page:
https://linux.die.net/man/1/pahole
A possible output is:

struct PPN {
	unsigned int               PFN:22;               /*     0: 0  4 */

	/* XXX 3 bits hole, try to pack */
	unsigned int               :3;

	unsigned int               CCA:3;                /*     0:25  4 */

	/* Bitfield combined with next fields */

	bool                       nonreachable:1;       /*     3: 4  1 */
	bool                       dirty:1;              /*     3: 5  1 */
	bool                       valid:1;              /*     3: 6  1 */
	bool                       global:1;             /*     3: 7  1 */

	/* size: 4, cachelines: 1, members: 6 */
	/* sum bitfield members: 29 bits, bit holes: 1, sum bit holes: 3 bits */
	/* last cacheline: 4 bytes */
};
Why is it so important to know the size of the struct? I would have thought that if one were going to write it to a binary file, say, then it is up to the coder to know the sizes of the individual items and write/read them accordingly. It seems to me that relying on the size of a struct or its layout is a rabbit hole.
So here we should've had 5 bytes (int: 4, bool: 1)!


You do with VS2022 if you use the #pragma pack from above:

#include <iostream>

#pragma pack(push, 1)

struct PPN { // R6000 Physical Page Number
	unsigned int PFN : 22; // Page Frame Number
	int : 3; // unused
	unsigned int CCA : 3; // Cache Coherency Algorithm
	bool nonreachable : 1;
	bool dirty : 1;
	bool valid : 1;
	bool global : 1;
};

#pragma pack(pop) // restore the previous packing so later declarations are unaffected

int main() {
	PPN pn;
	std::cout << sizeof pn << '\n';

	system("pause");
	return 0;
}



5


The pack value is the maximum alignment applied to members (the Microsoft documentation gives the default as 8 for x86 and 16 for x64). With the default, the first 3 fields are packed into an int (4 bytes on that system), the 4 bools start a new 1-byte unit, and the struct is then padded up to a multiple of its 4-byte alignment, i.e. 8 bytes. With pack 1, the first 3 are still packed into 4 bytes, but the struct's alignment drops to 1, so the byte holding the bools is not followed by padding, hence 5 bytes.

TheIdeasMan wrote:
Why is it so important to know the size of the struct?
I, of course, don't know the reason why it is important to frek, but one possible application is for code that is optimized for cache performance, which can make significant improvements if done correctly. If you guarantee that your struct is 4 bytes large, then that means you can pack an array of 16 of these objects into one cache line (many cache lines are 64 bytes wide).
@Ganado

Cool. Thanks for that :+)

ThingsLearnt++;