Here's the problem: I am learning C++, and as an exercise I wrote a program that reverses text. It worked nicely with English characters. Then I started to wonder: what if I put Chinese characters into the program? The output is pretty strange. Anyway, here's the code:
#include <iostream>
#include <cstring>
#include <windows.h>
using namespace std;
int main()
{
    char p[]="Hello. 您好。";
    cout << "Text: " << p << endl;
    cout << "Reversed text: ";
    for(int i=strlen(p)-1;i>=0;i--)
    {
        if ((int)p[i]<0) //to see is that a Chinese character
        {
            cout << p[i] << p[i+1];
            i--;
        }
        else {cout << p[i];}
        Sleep( 100 );
    }
    cout << endl;
    system("pause >nul");
    return 0;
}
The strange thing is that after reversing there is always an extra character at the front that should not be there.
It just doesn't seem to make sense. I tried to prevent it by changing for(int i=strlen(p)-1;i>=0;i--) into for(int i=strlen(p)-2;i>=0;i--), but then the last character disappears if the text ends with an English character. Can anyone help fix this?
You are scanning the string backwards and determining if you have found a Chinese character by seeing if the byte value is >127 (which you aren't doing quite correctly since you are assuming that char is signed). But by then, you have already gone past, and printed, the other bytes that make up the character, and you've printed them in the wrong order. You want to reverse the characters, not the individual bytes that make up a multi-byte character.
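To illustrate the point about signedness, here is a minimal sketch of a byte test that does not depend on whether plain char is signed. The name is_non_ascii_byte is made up for this example, and it only tells you that a byte is not plain ASCII, not where a character starts or ends.

```cpp
// Hypothetical helper: true if the byte has its high bit set (value >= 0x80),
// meaning it is not a plain ASCII character.
// Casting to unsigned char first gives the same answer whether the
// compiler treats plain char as signed or unsigned.
bool is_non_ascii_byte(char c)
{
    return static_cast<unsigned char>(c) >= 0x80;
}
```

With this, the check in the loop could read if (is_non_ascii_byte(p[i])) instead of comparing (int)p[i] with 0.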
So to handle the Chinese characters (or UTF8 generally) you need to scan the string forwards. One possibility is to make an array of the start indices of the characters and then go through that in reverse to print the characters from the string. (see code below)
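A rough sketch of the start-index idea, assuming the string really is valid UTF-8 (in UTF-8 every continuation byte has the bit pattern 10xxxxxx, so any other byte starts a character). The function names utf8_char_starts and reverse_utf8 are made up for this example. If your compiler stores the literal in a legacy double-byte encoding such as GBK, this byte test does not apply.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: index of the first byte of every character.
// Assumes the input is valid UTF-8.
std::vector<std::size_t> utf8_char_starts(const std::string& s)
{
    std::vector<std::size_t> starts;
    for (std::size_t i = 0; i < s.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(s[i]);
        // Continuation bytes look like 10xxxxxx; anything else starts a character.
        if ((byte & 0xC0) != 0x80) {
            starts.push_back(i);
        }
    }
    return starts;
}

// Hypothetical helper: reverses the characters, not the individual bytes.
std::string reverse_utf8(const std::string& s)
{
    const std::vector<std::size_t> starts = utf8_char_starts(s);
    std::string out;
    out.reserve(s.size());
    std::size_t end = s.size();  // one past the last byte of the current character
    // Walk the start indices from last to first, copying each character whole.
    for (std::size_t k = starts.size(); k-- > 0;) {
        out.append(s, starts[k], end - starts[k]);
        end = starts[k];
    }
    return out;
}
```

You would then print reverse_utf8(p) instead of looping over the bytes yourself. Whether the console shows the result correctly still depends on the console's code page.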
Alternatively, you could scan the string backwards but always keep three bytes in a buffer queue. That way, if you come across a multi-byte-indicating byte code you haven't already printed out the bytes (in the wrong order).
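A sketch of the backward-scanning alternative, again assuming valid UTF-8. This is a variation on the buffer idea rather than a literal three-byte queue: bytes are held back in a small buffer until the lead byte of the character is reached, and only then written out in their original order. The name reverse_utf8_backward is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: reverses a valid UTF-8 string by scanning from the end.
std::string reverse_utf8_backward(const std::string& s)
{
    std::string out;
    out.reserve(s.size());
    std::string pending;  // bytes of the current character, collected last-to-first
    for (std::size_t i = s.size(); i-- > 0;) {
        pending.push_back(s[i]);
        unsigned char byte = static_cast<unsigned char>(s[i]);
        // Anything that is not a continuation byte (10xxxxxx) is either
        // an ASCII character or the lead byte of a multi-byte character.
        if ((byte & 0xC0) != 0x80) {
            // Flush the buffered bytes in their original (forward) order.
            out.append(pending.rbegin(), pending.rend());
            pending.clear();
        }
    }
    return out;
}
```

The buffer never holds more than one character's worth of bytes, so nothing is printed before the code knows which bytes belong together.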
It works for me. You must have done something wrong (in Dev-C++, at least). You probably ran your old code somehow.
[output]
$ ./reverse
Text: Hello. 您好。
。好您 .olleH
[/output]
I know, but I created a completely new source file to test the code, so I could not have run my old code. (T.T)
Never mind, I'll check the settings.