Would've been nice if you posted the exact specifications with examples instead of just paraphrasing ;D
Count the number of Duplicates
Write a function that will return the count of distinct case-insensitive alphabetic characters and numeric digits that occur more than once in the input string. The input string can be assumed to contain only alphabets (both uppercase and lowercase) and numeric digits.
Example
"abcde" -> 0 # no characters repeats more than once
"aabbcde" -> 2 # 'a' and 'b'
"aabBcde" -> 2 # 'a' occurs twice and 'b' twice ('b' and 'B')
"indivisibility" -> 1 # 'i' occurs six times
"Indivisibilities" -> 2 # 'i' occurs seven times and 's' occurs twice
"aA11" -> 2 # 'a' and '1'
"ABBA" -> 2 # 'A' and 'B' each occur twice
Create what I like to call a "frequency hash", which maps some token to a count of how often it was seen.
- for every character seen (folded to either upper or lowercase, up to you), increment that character's count in the structure by 1
- analyze the map for anything that occurred twice or more
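The two steps above could be sketched roughly like this in C++. It is only a minimal sketch: the function name count_duplicates_map is made up for illustration, and folding to lowercase (rather than uppercase) is an arbitrary choice.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper: number of distinct characters (case-insensitive)
// that occur more than once in `text`.
int count_duplicates_map(const std::string& text) {
    std::unordered_map<char, int> frequency;  // the "frequency hash"
    for (unsigned char ch : text) {
        // Fold to lowercase so 'B' and 'b' share one entry.
        ++frequency[static_cast<char>(std::tolower(ch))];
    }

    int duplicates = 0;
    for (const auto& entry : frequency) {
        if (entry.second >= 2) {  // seen twice or more
            ++duplicates;
        }
    }
    return duplicates;
}
```

Swapping std::unordered_map for std::map should work the same way here; the only difference that matters for this task is iteration order, which the count does not depend on.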
Sticking with sets (though I definitely like your use of maps) I came up with the following.
(You'd be amazed by how long it took me to type "indivisibilities" properly!)
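For illustration, a set-based approach might look something like the sketch below. This is a guess at the general idea, not necessarily the code this post referred to, and the name count_duplicates_set is hypothetical.

```cpp
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: counts distinct case-insensitive characters
// that appear at least twice, using two sets instead of a map.
int count_duplicates_set(const std::string& text) {
    std::set<char> seen;      // characters met at least once
    std::set<char> repeated;  // characters met at least twice
    for (unsigned char ch : text) {
        const char lower = static_cast<char>(std::tolower(ch));
        // insert() returns a pair; .second is false if the element already existed.
        if (!seen.insert(lower).second) {
            repeated.insert(lower);
        }
    }
    return static_cast<int>(repeated.size());
}
```

The answer is simply the size of the second set, so no separate counting pass is needed.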
Yep, this is true of course with a little math for the index offsets, though I like the readability and extensibility of map. You can also use plain arrays and go straight into the count, discarding the generated data structure:
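A rough sketch of the plain-array idea with index offsets, assuming an ASCII-style encoding where 'a'..'z' and '0'..'9' are contiguous. The 36-slot layout and the name count_duplicates_array are assumptions made for this example.

```cpp
#include <array>
#include <cctype>
#include <string>

// Hypothetical helper: 26 letter slots followed by 10 digit slots.
int count_duplicates_array(const std::string& text) {
    std::array<int, 36> counts{};  // zero-initialised
    for (unsigned char ch : text) {
        if (ch >= '0' && ch <= '9') {
            ++counts[26 + (ch - '0')];    // digits map to slots 26..35
        } else {
            const int lower = std::tolower(ch);
            if (lower >= 'a' && lower <= 'z') {
                ++counts[lower - 'a'];    // letters map to slots 0..25
            }
        }
    }

    int duplicates = 0;
    for (int n : counts) {
        if (n >= 2) {
            ++duplicates;
        }
    }
    return duplicates;
}
```

Anything outside letters and digits is silently ignored, which is fine under the stated input guarantee.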
for duplicates only you don't need any magic in ascii.
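char count[256] = {0};  // one counter per possible byte value
for (size_t index = 0; index < string.size(); index++)  // string: the input, assumed to be a std::string
    count[(unsigned char)string[index]]++;  // cast so a signed char can't produce a negative index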
and then count tells you what was duplicated.
it wastes a little space, but it skips the magic offsets etc.
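the type of count may be more efficient on some machines as the default word size (usually int), or it may be faster as char (or unsigned char). note that a char counter can overflow if one character repeats a few hundred times, so cap it or use int for long inputs. You can poke around with it to see if your hardware favors one or the other.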
this is an adaptation of the 'bucket sort' algorithm (with one bucket per value it is usually called counting sort), which can be used to sort your data or count frequency / duplicates or similar tasks for very limited types of data (chars being a good use case).
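Putting the 256-entry table together with case folding, a complete function might look like the sketch below. The name count_duplicates_table is made up, and stopping each counter at 2 is one possible way to keep a char-sized counter from overflowing, not the only one.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: one counter per byte value, no index offsets needed.
int count_duplicates_table(const std::string& text) {
    unsigned char count[256] = {0};
    int duplicates = 0;
    for (unsigned char ch : text) {
        // Fold case so 'A' and 'a' land in the same slot.
        const unsigned char slot = static_cast<unsigned char>(std::tolower(ch));
        // Stop counting at 2: that is all "duplicate or not" needs,
        // and it keeps the small counter from wrapping around.
        if (count[slot] < 2 && ++count[slot] == 2) {
            ++duplicates;  // counted once, on the second occurrence
        }
    }
    return duplicates;
}
```

Because the duplicate is tallied at the moment a counter reaches 2, there is no second pass over the table.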
is this a fastest wins contest, or cleanest code, or just 'do it' contest?