Bits manipulation

Feb 12, 2022 at 9:26am
The size of this class is supposed to be 4 bytes:

#include <cstdlib>  // std::system
#include <iostream>

struct PPN { // R6000 Physical Page Number
    unsigned int PFN : 22; // Page Frame Number
    int : 3; // unused
    unsigned int CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    std::system("pause"); // Windows only: keeps the console window open
    return 0;
}


But I get 8 as the output. Why, please?
Last edited on Feb 12, 2022 at 9:26am
Feb 12, 2022 at 9:57am
There is no guarantee that an int is 4 bytes.
If you want to be certain about the size of a data type, use the fixed-width types from <cstdint>.
#include <cstdint>
#include <cstdlib>  // std::system
#include <iostream>

struct PPN { // R6000 Physical Page Number
    std::uint32_t PFN : 22; // Page Frame Number
    std::int32_t : 3; // unused
    std::uint32_t CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    std::system("pause"); // Windows only: keeps the console window open
    return 0;
}

Output 4
Feb 12, 2022 at 9:59am
Which OS/Compiler?
Though I'm guessing Windows since you have a system("pause") in there.

https://coliru.stacked-crooked.com/a/4443c399418fd5af gives 4
http://cpp.sh/6cek2 gives 4
My "g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0" gives 4

Also, note that bit-fields are not portable.
1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX1

On some systems, your 'global' bit might map to bit 0 and on others it might map to bit 31.

If you're manipulating bits in some hardware register, then & | >> << are the only way to go if you want portable code.
Feb 12, 2022 at 11:07am
Thank you both.
Yes, those two compilers show 4 while my VS 2022 on Windows shows 8!
Feb 12, 2022 at 12:02pm
This gives a size of 4 with VS2022:

#include <cstdint>

struct PPN { // R6000 Physical Page Number
	std::uint32_t PFN : 22; // Page Frame Number
	std::uint32_t : 3; // unused
	std::uint32_t CCA : 3; // Cache Coherency Algorithm
	std::uint32_t nonreachable : 1;
	std::uint32_t dirty : 1;
	std::uint32_t valid : 1;
	std::uint32_t global : 1;
};


Feb 12, 2022 at 1:20pm
Google how to use this for Visual Studio:

#pragma pack(push, 1)
Feb 12, 2022 at 1:43pm
That produces a size of 5...
Feb 12, 2022 at 4:20pm
Thank you.
The other question is how to print the size of each data member in the construct, please.

For instance, something like:
    PPN pn;
    std::cout << sizeof pn.CCA << '\n';

which doesn't work and says: error: invalid application of 'sizeof' to a bit-field
Last edited on Feb 12, 2022 at 4:20pm
Feb 12, 2022 at 4:25pm
You mean how many bits have been specified for each? I don't believe you can.
Feb 12, 2022 at 4:45pm
You can go a roundabout way: make a temporary object, set the field to -1 and take a rounded log2 of the result, or something like that.
But you know this already. You defined it. Stow it somewhere if you need it back?

sizeof is in bytes. It's not defined for bit-fields because that doesn't make sense for a 3-bit value.

Is a bitset of any value to you?
Last edited on Feb 12, 2022 at 4:51pm
Feb 12, 2022 at 4:58pm
I manipulated the code and removed some bits, for some purpose, but the online compiler still shows 4 as output!
https://coliru.stacked-crooked.com/a/8ae4286d2a6bdeff
Feb 12, 2022 at 5:26pm
#include <cstdint>
#include <iostream>

struct PPN { // R6000 Physical Page Number
	std::uint8_t CCA : 3; // Cache Coherency Algorithm
	std::uint8_t nonreachable : 1;
	std::uint8_t dirty : 1;
	std::uint8_t valid : 1;
	std::uint8_t global : 1;
};

int main() {
	PPN pn;
	std::cout << sizeof pn << '\n';
}


gives 1 for VS2022.

When the type changes (i.e. int to bool (1 byte)) a new byte/word etc. is started, at least with VS2022. That's why having the same type for all of the bit-fields gives the expected result: uint32_t gives 4 and uint8_t gives 1.

See https://coliru.stacked-crooked.com/a/796bbabadba85630

If you want a result of 32 bits then use uint32_t for all. If you want 16 bits then uint16_t for all and for 8 bits use uint8_t for all.
Last edited on Feb 12, 2022 at 5:28pm
Feb 12, 2022 at 5:59pm
When the type changes (ie int to bool (1 byte)) there is a new byte/word etc started.
So here we should've had 5 bytes (int: 4, bool: 1)!
#include <cstdlib>  // std::system
#include <iostream>

struct PPN { // R6000 Physical Page Number
    unsigned int PFN : 22; // Page Frame Number
    int : 3; // unused
    unsigned int CCA : 3; // Cache Coherency Algorithm
    bool nonreachable : 1;
    bool dirty : 1;
    bool valid : 1;
    bool global : 1;
};

int main()
{
    PPN pn;
    std::cout << sizeof pn << '\n';

    std::system("pause"); // Windows only: keeps the console window open
    return 0;
}
Feb 12, 2022 at 8:52pm
Poke-a-hole, pahole(1), is a tool that reports the layout of structures as they appear in a binary. See its manual page:
https://linux.die.net/man/1/pahole
A possible output is:

struct PPN {
	unsigned int               PFN:22;               /*     0: 0  4 */

	/* XXX 3 bits hole, try to pack */
	unsigned int               :3;

	unsigned int               CCA:3;                /*     0:25  4 */

	/* Bitfield combined with next fields */

	bool                       nonreachable:1;       /*     3: 4  1 */
	bool                       dirty:1;              /*     3: 5  1 */
	bool                       valid:1;              /*     3: 6  1 */
	bool                       global:1;             /*     3: 7  1 */

	/* size: 4, cachelines: 1, members: 6 */
	/* sum bitfield members: 29 bits, bit holes: 1, sum bit holes: 3 bits */
	/* last cacheline: 4 bytes */
};
Last edited on Feb 12, 2022 at 8:56pm
Feb 13, 2022 at 5:59am
Why is it so important to know the size of the struct? I would have thought that if one was going to write it to a binary file, say, then it is up to the coder to know the sizes of the individual items and write/read them accordingly. It seems to me that relying on the size of a struct or its layout is a rabbit hole.
Feb 13, 2022 at 11:19am
So here we should've had 5 bytes (int: 4, bool: 1)!


You do with VS2022 if you use the #pragma pack from above:

#include <cstdlib>  // std::system
#include <iostream>

#pragma pack(push, 1)

struct PPN { // R6000 Physical Page Number
	unsigned int PFN : 22; // Page Frame Number
	int : 3; // unused
	unsigned int CCA : 3; // Cache Coherency Algorithm
	bool nonreachable : 1;
	bool dirty : 1;
	bool valid : 1;
	bool global : 1;
};

#pragma pack(pop) // restore the previous packing

int main() {
	PPN pn;
	std::cout << sizeof pn << '\n';

	std::system("pause"); // Windows only: keeps the console window open
	return 0;
}



5


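The pack size is, I think, the maximum alignment applied to members, and the VS default is larger than 4 (8 for x86 and 16 for x64), so by default each type keeps its natural alignment. The first 3 bit-fields are packed into an int (4 bytes on that system) and the 4 bools are packed into 1 byte of their own. The struct then has the alignment of int, so its size is rounded up to a multiple of 4 with 3 bytes of padding at the end, i.e. 8 bytes. With pack 1 the first 3 are still packed into 4 bytes and the bools into 1 byte, but there is no padding at the end, hence 5 bytes.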

Feb 13, 2022 at 4:50pm
TheIdeasMan wrote:
Why is it so important to know the size of the struct?
I, of course, don't know the reason why it is important to frek, but one possible application is for code that is optimized for cache performance, which can make significant improvements if done correctly. If you guarantee that your struct is 4 bytes large, then that means you can pack an array of 16 of these objects into one cache line (many cache lines are 64 bytes wide).
Last edited on Feb 13, 2022 at 4:50pm
Feb 13, 2022 at 5:23pm
@Ganado

Cool. Thanks for that :+)

ThingsLearnt++;