|Correct me if I am wrong: even though the typical memory address needs only 4 bytes of memory to store ...|
Yeah, that's wrong. There's no such thing as a typical address.
In the beginning, there was just a computer word. That was the size of a memory location, of the registers, and of what the ALU would work on at a time.
The word size varied, but in the Unix world, a crisis happened when a computer came along (the PDP-11) that had 16bit words but 8 bit, byte-addressed memory. That spawned a whole rethink of the systems programming language, because the old language could not address individual bytes, only words (every other byte). So the memory model was revised and a new systems programming language was born to deal with it. The new language was C (replacing the old one, B), and the new types were:
char = byte
int = word
pointer = word
a signed/unsigned qualifier
float / double
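As a rough illustration of how these types still vary from machine to machine, here is a minimal C sketch. The function name print_type_sizes is made up for this example; the only thing the language guarantees is that sizeof(char) is 1 and that a char has at least 8 bits.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: report what this particular platform chose.
   Only sizeof(char) == 1 and CHAR_BIT >= 8 are guaranteed by C;
   the int and pointer sizes depend on the compiler and hardware. */
void print_type_sizes(void)
{
    printf("bits per char  = %d\n", CHAR_BIT);
    printf("sizeof(char)   = %zu\n", sizeof(char));
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
}
```

Calling print_type_sizes() on a 16bit, 32bit and 64bit target would typically give three different pointer sizes, which is the point: there is no typical address.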
There was a similar sort of thing again with Intel's first 16bit processors. The 8086 came first, with a 16bit data bus, but 8 bit memory and support chips were what was cheap and available, so Intel followed it with the 8088: the same processor with an 8 bit external data bus. The 8088 is the one that went into the original IBM PC.
Those 16bit processors had an innovative addressing scheme to get more than 2^16 addresses by specifying an address with 2 registers. It was a bit too early in time to go full 32bit, but they had this segment/offset scheme, where a 64k segment could start every 16 bytes, so the segments overlapped each other heavily. Why 64k? Because the offset was a 16bit register, a carry-over from 8 bit tech, which had 16bit addresses. Later on, spare space in the address map was used for Expanded memory cards that would swap additional memory in and out through 4 pages of 16k. Bonkers.
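The segment/offset arithmetic is easier to see written out. This is a minimal C sketch of the real-mode calculation on the 8086/8088; the function name physical_address is made up for the example, and the 20 bit mask models the wrap-around at 1 MB on those chips.

```c
#include <stdint.h>

/* Real-mode address: the segment register is shifted left 4 bits
   (multiplied by 16) and the offset register is added to it.
   The mask keeps 20 bits, matching the 20 address lines of the
   8086/8088, so the sum wraps around at 1 MB. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

For example, physical_address(0x1234, 0x0010) and physical_address(0x1235, 0x0000) both give 0x12350: two different register pairs naming the same byte, because consecutive segments start only 16 bytes apart.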
When Windows moved from 16 to 32bit, they continued to support 16bit apps in a Windows-on-Windows layer (WoW), and did pretty much the same again when they moved from 32 to 64bit. Windows ran the 32bit processor in segment/offset mode, where the segment base was always zero, giving a "flat" 32 bit address space. I don't know what they did for 64bit as they took forever to start using 64bit apps.
The original idea behind C++'s allocator was to encapsulate these different addressing schemes in a portable way. I have no idea what allocators are used for now other than custom heaps.
The main points are:
1. addressing is invented, it can be pretty much anything
2. the software must support whatever the underlying hardware does, or you can't use it
3. it would be nice if 32bit meant 2^32 addresses, and so on, but it's pretty much arbitrary, and hardware isn't always nice