We are making a hash table to read in names and phone numbers and we need to break down the values read in to be as small as possible, I got it down to 20, 15 if I add the first number of tel to the hash. What else can I do to break this down further? I have tried summing the values of the phone numbers, but that increased the number of collisions. We have just learned hash tables this week so I only know the basics. We have to get it down to lower than 7.
I have written everything and it all works. Just trying to figure out how to lower the number.
ex. of input
Taylor Marie 939-1512
James Wilis Thomas 261-8342
Hewy Lewy Dewy 623-9921
We have to read in 54 entries and the table has to be 61 spaces.
We have to read in 54 entries and the table has to be 61 spaces.
This is your trouble. A hash table follows the "law?" of the time space tradeoff: to gain time, you waste space, and to save space, you waste time.
A hash table should be significantly larger than the # of inputs unless you know the data very well and can tweak it from the data side, which is tricky and time consuming. This makes what you have been asked to do rather difficult.
You have also discovered that summing things is not a good approach. Sums tend toward an average looking thing -- the sum of 10 uniformly random digits is going to bell curve around 45 or so, do you see why?
try different stuff to see what happens.
to get you started, how about this (its off the top of my head and it may do awful).
unsigned long long x = first letter;
for each letter after first:
{
x ^= next;
x << 8;
x+= (next*next);
}
x%= 61;
one that does pretty well is just ye olde rand.
do something simple (but not sum of digits !!) to convert your text to any old number. srand(number); //every time, not once as you normally do!!! this is critical!! rand()%61 //try it.
something like the above (do not do the final %61 operation here) used to generate the seed will work if the thing you use to make seed is not giving repeat values. <random> is a little better than rand if you want to fool with that. Then the final %61 moves to push it into range, its the last thing you do.
because we know how rand works, any linear congruential generator with the SEED SET FROM THE DATA will kinda work for small problems like this one, usually giving a more or less uniform spread.
These are all quick answers. If it were a real problem I would just tell you to use 200 table slots and let <random> do the dirty work, or md5 or sha1 or something like that (these are complex but you can just drop the code in, they have been coded up for decades).