Unless you are getting into Chinese, I think there is only a small, finite number of special cases to handle if you go through the text one character at a time. Don't try to handle each word individually.
If you are using C++-style strings, you could instead do a find-and-replace for each particular case. find() is useful here:
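#include <string>
using namespace std;

wstring SimplifyWSTRING(wstring input)
{
    wstring::size_type pos;
    // find() returns npos once no occurrence is left, which ends each loop
    while ((pos = input.find(L'â')) != wstring::npos) input[pos] = L'a';
    while ((pos = input.find(L'ă')) != wstring::npos) input[pos] = L'a';
    return input; // one loop per special case; add more as needed
}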
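If the list of special cases grows, a lookup table that is consulted once per character may be easier to maintain than one loop per case. Here is a minimal sketch of that idea; the function name StripAccents and the handful of table entries are my own illustrative assumptions, not a complete mapping, and the accented letters are written as universal character names so the source file's encoding doesn't matter:

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: replaces a few accented letters with plain ones.
// The table is illustrative only; extend it for the languages you need.
std::wstring StripAccents(std::wstring input)
{
    static const std::unordered_map<wchar_t, wchar_t> table = {
        {L'\u00E2', L'a'}, // â
        {L'\u0103', L'a'}, // ă
        {L'\u00EE', L'i'}, // î
        {L'\u0219', L's'}, // ș
        {L'\u021B', L't'}, // ț
    };

    // Look at each character once; swap it if the table has an entry for it.
    for (wchar_t& ch : input) {
        auto it = table.find(ch);
        if (it != table.end()) {
            ch = it->second;
        }
    }
    return input;
}
```

This makes a single pass over the string no matter how many cases the table holds, and characters that aren't in the table are left untouched.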
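If you want to be general, use a library such as ICU ( http://icu-project.org ). I believe its Transliterator class can be built from a rule along the lines of "NFD; [:Nonspacing Mark:] Remove; NFC", which decomposes each accented letter, drops the combining marks, and recomposes what is left, so it would strip accents from all accented letters rather than just the ones you list by hand.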