C++ pointers

Hi all,

Running this code:

#include <iostream>

int main() {

	int* pn1 = new int{ 5 };
	int* pn2 = new int{ 25 };
	double* pd = new double{ 9.5 };

	std::cout << pn1 << ' ' << pn2 << ' ' << pd << '\n';

	system("pause");
	return 0;
}

I get this output:
006E5168 006E5198 006EF408
Press any key to continue . . .


What does that output mean precisely?
It is the hexadecimal representation of the pointer value.
So between, say,
006E5168
and
006E5169
there's a difference of 16 bits, right?
It depends on how many bits a single memory cell can hold. Usually it's 8 bits.
So usually:
1) Each pointer occupies 4 bytes (two Hex units)
2) The capacity space of the first pointer starts from 006E5168 and ends in 006E5170 (two Hex units, four bytes) and for the second pointer, we have 006E5198 and 006E51A0 (two Hex units, four bytes) as start and end points.

Right?

So the two pointers (pn1, pn2) are not contiguous.
Honestly, I'm slowly starting to wonder whether you're trolling...

There is no such thing as a 'hex unit', so two of them do not denote 4 bytes. For what hex actually is, see:

https://en.wikipedia.org/wiki/Hexadecimal

Each pointer occupies 4 bytes
On your system it does; on others, such as a 64-bit system, it would be 8 bytes. There are also (older) systems where it takes 2 bytes.

The capacity space of the first pointer starts from 006E5168 and ends in 006E5170 (two Hex units, four bytes) and for the second pointer, we have 006E5198 and 006E51A0 (two Hex units, four bytes) as start and end points.
That does not make sense. The pointer is a memory address where you can store at least as much data as you allocated (with new; for the first pointer, that is a single int initialized to 5).
The variable (like pn1) that holds the pointer of course has a memory address too, and can thus be pointed to.

#include <iostream>

int main() {

	int* pn1 = new int{ 5 };

	std::cout << "pn1 " << pn1 << '\n';
	std::cout << "pn1 size " << sizeof(pn1) << '\n';
	std::cout << "pn1 address " << &pn1 << '\n';
	std::cout << "pn1 address size " << sizeof(&pn1) << '\n';

	return 0;
}
pn1 0x438a6e0
pn1 size 8
pn1 address 0x723e58ce7a48
pn1 address size 8
1) Ehh... What is a 'hex unit'? The number of bytes a pointer uses is obtained with sizeof(pointer). No assumptions about it should be made in the code.

2) There is no requirement for adjacent allocations to be given contiguous memory locations (although the memory within any single allocation will be contiguous). The allocator will place memory as it sees fit, based upon available free space, the size of the requested memory, any required data-type alignment, etc.


What you are calling a hex unit is just 1, which is the same in any base.
Your math is wrong: you are thinking in base 10, not base 16, when you assume that 68 + 2 is 70. It is NOT, in base 16 (0x68 + 2 is 0x6A).
--------------------------------------------
The capacity space of the first pointer starts from 006E5168 and ends in 006E5170 (two Hex units, four bytes)

lets count!
006E5168
006E5169 1
006E516A 2
006E516B 3
006E516C 4
006E516D 5
006E516E 6
006E516F 7
006E5170 8

A byte is 2 hex digits; 4 bits is 1 hex digit. This is unrelated to counting addresses, though.
0x6E5170 - 0x6E5168 = 8 so the capacity space of the pointer is 8 bytes (assuming 8 bits per byte), 64 bits.

4 bits can be represented as 1 hex digit - so 8 bits (1 byte) can be represented as 2 hex digits. But working in hex digits (eg 4 bits) is not usual!
Most modern computer architectures use 1 byte = 8 bits, but that can't be assumed to always be true. C/C++ has a way to check the number of bits in a byte: <climits> has the CHAR_BIT macro constant for the number of bits in a byte.
https://en.cppreference.com/w/cpp/types/climits

So we can now determine the number of bytes (and bits) consumed with each type using the sizeof operator. sizeof returns the number of bytes for a type:
#include <iostream>
#include <climits>

int main()
{
   std::cout << "CHAR_BIT (bits in a byte): " << CHAR_BIT << "\n\n";

   std::cout << "Computing the size in bytes of some C and C++ built-in variable types\n\n";

   std::cout << "Size of bool:               " << sizeof(bool) << '\n';
   std::cout << "Size of char:               " << sizeof(char) << '\n';
   std::cout << "Size of wchar_t:            " << sizeof(wchar_t) << '\n';
   std::cout << "Size of unsigned short int: " << sizeof(unsigned short) << '\n';
   std::cout << "Size of short:              " << sizeof(short) << '\n';
   std::cout << "Size of unsigned long int:  " << sizeof(unsigned long) << '\n';
   std::cout << "Size of long:               " << sizeof(long) << '\n';
   std::cout << "Size of int:                " << sizeof(int) << '\n';
   std::cout << "Size of unsigned int:       " << sizeof(unsigned int) << '\n';
   std::cout << "Size of float:              " << sizeof(float) << '\n';
   std::cout << "Size of double:             " << sizeof(double) << "\n\n";

   std::cout << "The output can change with compiler, processor type and OS\n\n";

   std::cout << "C++11 added several new fundamental variable types.\n";
   std::cout << "The values can change whether compiled as 32 or 64 bit.\n\n";

   std::cout << "Size of char16_t:           " << sizeof(char16_t) << '\n';
   std::cout << "Size of char32_t:           " << sizeof(char32_t) << '\n';
   std::cout << "Size of unsigned long long: " << sizeof(unsigned long long) << '\n';
   std::cout << "Size of long long:          " << sizeof(long long) << '\n';
   std::cout << "Size of long double:        " << sizeof(long double) << '\n';
   std::cout << "Size of nullptr:            " << sizeof(nullptr) << '\n';
}
(Compiled as 32-bit)
CHAR_BIT (bits in a byte): 8

Computing the size in bytes of some C and C++ built-in variable types

Size of bool:               1
Size of char:               1
Size of wchar_t:            2
Size of unsigned short int: 2
Size of short:              2
Size of unsigned long int:  4
Size of long:               4
Size of int:                4
Size of unsigned int:       4
Size of float:              4
Size of double:             8

The output can change with compiler, processor type and OS

C++11 added several new fundamental variable types.
The values can change whether compiled as 32 or 64 bit.

Size of char16_t:           2
Size of char32_t:           4
Size of unsigned long long: 8
Size of long long:          8
Size of long double:        8
Size of nullptr:            4

With Visual Studio the only value that changes between 32 and 64 bit is nullptr. 64-bit nullptr is 8 bytes.

TDM-GCC/MinGW has another type (in addition to nullptr) that can vary in size, long double. 32-bit is 12 bytes, 64-bit is 16 bytes.

To make matters worse, with pointers the exact memory layout can vary between compilers and between 32- and 64-bit builds.
#include <iostream>

int main()
{
   int  x    = 10;
   int* xPtr = &x;

   std::cout << sizeof(x)    << ", " << x    << ",\n" << &x    << "\n\n";
   std::cout << sizeof(xPtr) << ", " << xPtr << ",\n" << &xPtr << '\n';
}
VS 32-bit
4, 10,
00AFF798

4, 00AFF798,
00AFF794
M'ok, the layout is 4 bytes apart, as expected. Memory is allocated from "the bottom up."

VS 64-bit:
4, 10,
0000001B40FAFAE8

8, 0000001B40FAFAE8,
0000001B40FAFAE0
8-byte difference, as expected. Now let's try MinGW-64 32-bit:
4, 10,
0x6dfeb8

4, 0x6dfeb8,
0x6dfebc
4-byte diff, and the layout is reversed from how VS does its memory layout. MinGW-64 64-bit:
4, 10,
0x72fe14

8, 0x72fe14,
0x72fe18
Despite the memory locations being different the layout is still "kosher."

Now let's try TDM-GCC 9.2. 32-bit:
4, 10,
0x77febc

4, 0x77febc,
0x77feb8
4 bytes, TDM-GCC does "bottom up" memory layouts, similar to VS. Curiouser and curiouser. TDM-GCC 64-bit:
4, 10,
0x78fe1c

8, 0x78fe1c,
0x78fe10
Again, "bottom up" layout, but wait. The difference in 64-bit is 12(!) bytes. Yow!

So without doing any pointer memory tests, making assumptions about the number of bits in a byte, the number of bytes in a pointer, the bitness as compiled, and the memory layout of pointers is gonna bite one in the butt. Big time.

Another possible thing to get one's arse chewed on: the pointer memory locations with MinGW/TDM-GCC are fixed at compile time, so each time the app is run the pointers are at the same memory locations. With VS the memory locations vary with each run.
Thank you all particularly Jonnin.

And the memory addresses allocated by two (or more) contiguous 'new's in the code (as in mine) are not guaranteed to be contiguous in memory either. The allocator just seeks out free memory and assigns it to the first new, then does the same across the whole memory for the second new, whether the latter new comes right after the first one or elsewhere in the code. Right?
> And the memory addresses allocated by two (or more) contiguous 'new's in the code
> are not guaranteed to be contiguous, too, in the memory. Right?

Right.
"The order, contiguity, and initial value of storage allocated by successive calls to an allocation function are unspecified." https://eel.is/c++draft/basic.stc.dynamic.allocation#2
Thank you JLBorges.
If you want blocks of memory but the flexibility of a graph/tree/list-style pointer chain, you can allocate the memory in a block and carve it up. Vector will do a lot for you here, or you can do it hands-on (a tiny bit better performance, but usually not enough to justify the extra work).

Typically a really simple vector-based memory manager only does a few things:
- its main vectorness handles all the memory
- your manager tracks deleted nodes by vector index in a container; if you have any, you pop them off the container and reuse them when you need a 'new' one
- if you don't have any deleted, the next available index is the 'new' one (keep track of this)
- if you run out of memory, push_back some more
- provide free-for-all access so you can iterate your container vector-wise instead of following the pointer chain for a touch-all operation (e.g. linked-list search, write to a file, whatever else you need here)
- you may want to ensure your 'node' object has an 'am I deleted?' boolean, and the 'pointer' is now just an integer array index into the vector (you can still do it pointer-wise, but that is less efficient due to the unneeded layer of indirection)

I avoid traditional list/tree/graph structures, where the memory is scattered, when dealing with large amounts of data. 'Large' grows with computer generations, though; today a million items stored in anything is uninteresting as long as you use threads and generally optimize whatever you did reasonably well. Back with single-core machines the number was much smaller, and today's baseline PCs have 20 or more 'CPUs' (not really, but as far as your software is concerned, yes), so that million split 10+ ways across threads is only 100k per CPU, which is trivial at modern computer speeds... and the number goes up yearly. If you throw your wallet at the problem, even a few billion items is trivial today on a server/industrial-grade machine. It takes a large problem for this stuff to make much difference -- your bottlenecks have become networks more than CPUs or page faults.
> Vector will do a lot for you here ...
> if you run out of memory, push_back some more

This may (if reallocation is required) invalidate pointers to previously allocated blocks.


> or you can do it hands-on

Unless it is a pure learning exercise, strongly favour using the facilities provided by the standard library. It has:

std::pmr::synchronized_pool_resource
https://en.cppreference.com/w/cpp/memory/synchronized_pool_resource

std::pmr::unsynchronized_pool_resource
https://en.cppreference.com/w/cpp/memory/unsynchronized_pool_resource

std::pmr::monotonic_buffer_resource
https://en.cppreference.com/w/cpp/memory/monotonic_buffer_resource
Consider looking at the C++ smart pointers, they do all the memory management for you on the heap, so no need for new/delete. There are several varieties available, along with a bunch of "helper classes".
https://en.cppreference.com/w/cpp/memory

There are also "allocators," templated classes that encapsulate memory models that are used by the C++ library containers.
This may (if reallocation is required) invalidate pointers to previously allocated blocks.

Good point. When doing it this way, I use the vector index instead of a pointer, and I forgot to mention that.

Smart pointers don't solve the problem: they still give fragmented memory for pointer-chain containers. I suppose allocators may, if you used the vector one, but I would have to look at it a bit.

The PMR stuff would certainly work efficiently. I would need to look at the pros and cons of that against a reshaped vector for a home-made tree class. May depend on the problem?
Topic archived. No new replies allowed.