std::deque<> allocates memory in chunks:
The amount of memory fragmentation (and the overhead of the additional bookkeeping) depends on the chunk size. This is an implementation detail that the standard does not specify. In general, for deques that can grow to large sizes, larger chunk sizes result in less fragmentation and better run-time performance.
Here's a quick test of the (default) mainstream implementations:
#include <iostream>
#include <deque>
#include <array>
#include <iomanip>
#include <memory> // std::addressof

template < std::size_t ELEMENT_SIZE > std::size_t deque_chunksz()
{
    std::deque< std::array< char, ELEMENT_SIZE > > deq(10000) ;

    // walk forward from the front until the addresses stop being contiguous;
    // deq[i] is then the first element of the next chunk
    auto ptr = std::addressof( deq.front() ) ;
    std::size_t i = 1 ;
    while( std::addressof( deq[i] ) == ptr + i ) ++i ;

    // measure the length of that (complete) chunk
    ptr = std::addressof( deq[i] ) ;
    std::size_t j = 1 ;
    while( std::addressof( deq[i+j] ) == ptr + j ) ++j ;

    return j * sizeof(deq[0]) ;
}

template < std::size_t N >
struct print_chunk_size_till : print_chunk_size_till<N/2>
{
    print_chunk_size_till()
    { std::cout << std::setw(8) << N << std::setw(15) << deque_chunksz<N>() << '\n' ; }
};

template <> struct print_chunk_size_till<0> {} ;

int main()
{
    std::cout << "element size   chunk size (bytes)\n\n" ;
    print_chunk_size_till<512>{} ;
    std::cout << std::setw(8) << 700 << std::setw(15) << deque_chunksz<700>() << '\n'
              << std::setw(8) << 1024 << std::setw(15) << deque_chunksz<1024>() << '\n' ;
}
GNU libstdc++:
element size chunk size (bytes)
1 512
2 512
4 512
8 512
16 512
32 512
64 512
128 512
256 512
512 512
700 700
1024 1024
http://coliru.stacked-crooked.com/a/2d51e90237197ca5
LLVM libc++:
element size chunk size (bytes)
1 4096
2 4096
4 4096
8 4096
16 4096
32 4096
64 4096
128 4096
256 4096
512 8192
700 11200
1024 16384
Microsoft (Dinkumware):
element size chunk size (bytes)
1 16
2 16
4 16
8 16
16 16
32 32
64 64
128 128
256 256
512 512
700 700
1024 1024
With LLVM libc++, fragmentation wouldn't be much; it uses a chunk size of 4096 bytes
or 16 * sizeof(value_type), whichever is larger.
With the GNU and Microsoft implementations, the situation is not so good.
GNU uses a chunk size of 512 bytes for small element sizes,
and one element per chunk once the element size reaches 512 bytes.
Microsoft is quite dreadful; it uses a chunk size of 16 bytes for very small element sizes,
and one element per chunk once the element size reaches 16 bytes.
My guess is that these implementations are relics of code written in prehistoric times, when memory was much more scarce than it is today; and no one has bothered to actually measure/optimise deque performance for large sequences (probably because it is not a widely used standard container).