Ignoring a character's diacritical marks

With wide characters, text may contain letters such as â, ă and ã. How can I convert them all to a plain "a" without handling each one as a separate case?
bump.
I don't think there is a built-in function for this. You may need to write your own function to do it.

wchar_t SimplifyWCHAR(wchar_t input)
{
  if (input == L'â') return L'a';
  if (input == L'ă') return L'a';
  if (input == L'ã') return L'a';

  //any other substitutions you want

  return input;
}
There are a lot of words that could be hard to handle in a search function that ignores the marks.

Well then, I should write separate cases. Lots of work, lol.
Unless you are getting into Chinese, I think there is a fairly small, finite number of special cases to handle if you look at one character at a time. Don't try to look at each word individually.
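
As a rough sketch of the one-character-at-a-time idea, the special cases can live in a lookup table instead of a chain of if statements. The names StripMark and StripMarks are made up for this example, and the table only lists a handful of letters; you would extend it with whatever characters your text actually uses. The accented letters are written as \u escapes so the code does not depend on the source file's encoding.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: returns the base letter for an accented character,
// or the character itself when it is not in the table.
wchar_t StripMark(wchar_t c)
{
    // Assumed, incomplete table: add the substitutions you need.
    static const std::unordered_map<wchar_t, wchar_t> table = {
        {L'\u00E2', L'a'}, // â
        {L'\u0103', L'a'}, // ă
        {L'\u00E3', L'a'}, // ã
        {L'\u00EA', L'e'}, // ê
        {L'\u00F4', L'o'}, // ô
    };
    auto it = table.find(c);
    return it == table.end() ? c : it->second;
}

// Hypothetical helper: applies StripMark to every character of a string.
std::wstring StripMarks(std::wstring text)
{
    for (wchar_t& c : text)
        c = StripMark(c);
    return text;
}
```

Adding a new letter is then a one-line change to the table, and characters that are not listed pass through unchanged.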

If you are using C++-style strings, you could do a find and replace for each particular case instead. find() is useful in this case:

1
2
3
4
5
6
7
wstring SimplifyWSTRING(wstring input)
{
    while( input.replace( input.find(L'â') , L'a' ) ); // Something like this
    while( input.replace( input.find(L'ă'( , L'a' ) ); // But there are problems

    return input;
}


Now you can process a whole word or phrase at once, I think.
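
For reference, here is one way the find-and-replace idea could be written so that it compiles. It is only a sketch: ReplaceAll is a made-up helper name, and the three substitutions are just the ones from this thread. The two things it has to handle are that find() returns std::wstring::npos when nothing is found, and that replace() takes a position and a length rather than just a position.

```cpp
#include <string>

// Hypothetical helper: replaces every occurrence of `from` in `text` with `to`.
std::wstring ReplaceAll(std::wstring text, const std::wstring& from, const std::wstring& to)
{
    if (from.empty())
        return text; // nothing to search for

    std::wstring::size_type pos = 0;
    // find() returns npos once there are no more matches.
    while ((pos = text.find(from, pos)) != std::wstring::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length(); // resume searching after the inserted text
    }
    return text;
}

// Sketch of the same function as above, built on ReplaceAll.
std::wstring SimplifyWSTRING(std::wstring input)
{
    input = ReplaceAll(input, L"\u00E2", L"a"); // â
    input = ReplaceAll(input, L"\u0103", L"a"); // ă
    input = ReplaceAll(input, L"\u00E3", L"a"); // ã
    return input;
}
```

Because ReplaceAll works on strings rather than single characters, the same helper could also replace a whole word or phrase.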
By "But there are problems", do you mean this?
try{while(/*code*/){}}catch(...){}
Or are you referring to the ( instead of a ) on line 4?
If you want to be general, use a library such as ICU ( http://icu-project.org ) - I bet it has a way to construct a transliterator filter that would strip accents from all accented letters.