> Each product reference can have unlimited item codes (by group of 3 char - no string please!)
Why not? Yes, std::string would use more memory, but it would also make programming simpler. There is no good reason to avoid std::string unless memory is a real constraint.
> serial numbers (unsigned int)
I would recommend representing the serial numbers, too, as std::string. It is not entirely unlikely that at some point in the future the manufacturer may introduce a serial number like '238294-A41' - why should we paint ourselves into a corner?
The memory footprint of this
```cpp
// inherit from std::pair<> to get convenient utility functions for free
struct item_code_and_number : std::pair<std::string,std::string>
{
    using std::pair<std::string,std::string>::pair ; // inherited constructors
    std::string item_code ;
    std::string serial_number ;
};
```
would be much, much larger than that of
```cpp
struct item_code_and_number
{
    char item_code[3] ;
    unsigned int serial_number ;
};
```
However, we need to worry about that only if we have memory constraints that would be violated by the simpler, more malleable, future-proof representation.
> I want to be able to enter the product reference # on my computer and immediately get
> the corresponding list of item codes & serial numbers for that product.
> What is the simplest way?
The simplest, and also the recommended, way is to use a standard associative container:
std::map<> http://en.cppreference.com/w/cpp/container/map
or
std::unordered_map<> http://en.cppreference.com/w/cpp/container/unordered_map
where the key is the product reference and the mapped value is a sequence of item_code/serial_number pairs.
The simplest and most efficient standard sequence container with a dynamic size is
std::vector<>
```cpp
// our lookup table is an associative container with key: product_reference,
// mapped value: sequence of item_code_and_number pairs
using lookup_table = std::map< unsigned long long, std::vector<item_code_and_number> > ;
```
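If ordered traversal of the product references is not needed, the std::unordered_map alternative mentioned above is a near drop-in replacement (same mapped type, average O(1) lookup instead of O(log n)); a sketch:

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// same pair-derived struct as above
struct item_code_and_number : std::pair<std::string,std::string>
{
    using std::pair<std::string,std::string>::pair ; // inherited constructors
    std::string item_code ;
    std::string serial_number ;
};

// hash-based lookup table: keys are unordered, find() is O(1) on average
using lookup_table = std::unordered_map< unsigned long long, std::vector<item_code_and_number> > ;
```

The add_entry and look_up functions below work unchanged with either alias.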
Adding an entry is simplicity itself:
```cpp
void add_entry( lookup_table& table, unsigned long long product_ref,
                std::string item_code, std::string serial_number )
{ table[product_ref].emplace_back( std::move(item_code), std::move(serial_number) ) ; }
```
Lookup is also simple:
```cpp
std::vector<item_code_and_number> look_up( const lookup_table& table, unsigned long long product_ref )
{
    const auto iter = table.find(product_ref) ;
    return iter != table.end() ? iter->second : std::vector<item_code_and_number>{} ;
}
```
Now, let us make an estimate of memory usage:
```cpp
#include <iostream>
#include <string>
#include <utility>
#include <map>
#include <vector>
#include <malloc.h>

// inherit from std::pair<> to get convenient utility functions for free
struct item_code_and_number : std::pair<std::string,std::string>
{
    using std::pair<std::string,std::string>::pair ; // inherited constructors
    std::string item_code ;
    std::string serial_number ;
};

// our lookup table is an associative container with key: product_reference,
// mapped value: sequence of item_code_and_number pairs
using lookup_table = std::map< unsigned long long, std::vector<item_code_and_number> > ;

void add_entry( lookup_table& table, unsigned long long product_ref,
                std::string item_code, std::string serial_number )
{ table[product_ref].emplace_back( std::move(item_code), std::move(serial_number) ) ; }

std::vector<item_code_and_number> look_up( const lookup_table& table, unsigned long long product_ref )
{
    const auto iter = table.find(product_ref) ;
    return iter != table.end() ? iter->second : std::vector<item_code_and_number>{} ;
}

int main()
{
    ::malloc_stats() ;

    // test memory usage with:
    const std::size_t NPRODUCTS = 10'000 ; // ten thousand products
    const std::size_t SAMPLE_SZ = 10 ;
    const std::size_t PAIRS_PER_ITEM[SAMPLE_SZ] = { 1, 5, 10, 20, 30, 34, 100, 200, 250, 350 } ; // one hundred entries per product on average
    // these make for one million pairs of strings (10'000 * 100)

    std::size_t total = 0 ;
    for( std::size_t n : PAIRS_PER_ITEM ) total += n ;
    const std::size_t average_entries_per_product = total / SAMPLE_SZ ;
    std::cerr << "-----------------\nsizeof(item_code_and_number): " << sizeof(item_code_and_number) << '\n'
              << "average_entries_per_product: " << average_entries_per_product << "\n------------------\n" ;

    const std::string dummy_item_code = "abc" ;
    const std::string dummy_serial_number = "123456789" ;

    lookup_table table ;
    int n_pairs_added = 0 ;
    for( unsigned long long product_ref = 0 ; product_ref < NPRODUCTS ; ++product_ref )
    {
        for( std::size_t i = 0 ; i < PAIRS_PER_ITEM[product_ref%SAMPLE_SZ] ; ++i )
        {
            add_entry( table, product_ref, dummy_item_code, dummy_serial_number ) ;
            ++n_pairs_added ;
        }
    }

    std::cerr << n_pairs_added << " item_code/serial_number pairs were added\n" ;
    ::malloc_stats() ;
}
```
...
sizeof(item_code_and_number): 96
average_entries_per_product: 100
------------------
1000000 item_code/serial_number pairs were added
Total (incl. mmap):
system bytes = 127242240
in use bytes = 126244768
...
http://coliru.stacked-crooked.com/a/a2f65958d371fb3e
For one million entries, on a 64-bit architecture with LLVM libc++, the memory used is about 128 MB.
We can quite easily make a back-of-the-envelope estimate of the memory usage:
sizeof(item_code_and_number) == 96
memory for one million such objects == 96 MB
if the vector doubles its capacity when resized (a typical implementation strategy),
adding an amortised overhead of 50% would give us 144 MB.
Add a few MB for the map overhead, and we would get a ball-park figure of 150 MB or so.
It is easy to see that this scales linearly; with six million entries, the memory usage would still be under one GB.