I have a .txt file.
I want to read the Chinese characters in the file and then do some processing on them.
After the program runs, I want to write an output file that contains Chinese characters.
But I don't know how to input and output Chinese characters.
Please help me.
Do I need to convert the file from Big5 to Unicode before I read the characters from it?
I tried to use Firefox to convert the file content to UTF-8/16/32, but it outputs some unknown characters.
That's not how Firefox works. Firefox will read the file using whatever encoding it is in (or whatever encoding you told it it is in) and will convert it to an internal representation. You can't tell it to specifically convert to some encoding.
What you probably did is tell Firefox that the file was in BIG5.
There's no simple answer to that question. It depends on where you're reading the input from, what the encoding is, and what type of string it is.
The simplest answer I can give you that remains generic enough to be practical is to save the characters in a file encoded as UTF-8 (by hand, not programmatically) and then load that file. Depending on which type of string you're writing to, either store the bytes in the string as they are, or decode the UTF-8 to an array of wchar_t or whatever other type you prefer and store that in the string.
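The decoding step mentioned above could look something like the following. This is only a minimal sketch: the function name decode_utf8 is made up for illustration, it uses std::u32string rather than wchar_t so that one element always holds one code point, and it does not check for overlong forms or surrogate values.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into Unicode code points.
// Rejects bad lead bytes, truncated sequences and bad continuation bytes.
// Does NOT reject overlong encodings or surrogate code points.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (extra > bytes.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the 6 payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

With this, decode_utf8("\xE6\x88\x91") gives a string holding the single code point U+6211, which is the character 我.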
The simplest possible answer is to hardcode the character. For example, std::string utf8string="\xEF\xBB\xBF"; //Unicode character U+FEFF
I said that was the simplest practical answer. If you already have the characters in some encoding, it may, and I emphasize "may", be simpler to leave the encoding as it is and do the conversion from code. This depends on how Big5 works and on whether you have access to encoding conversion libraries, such as iconv or ICU. But the pipeline is always the same for any value of A:
input as bit stream in encoding A -> conversion routine -> internal structure in Unicode -> output
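As an illustration of the last arrow in that pipeline, here is a rough sketch of the output step. It assumes the internal structure is a std::u32string of code points and that the output should be UTF-8; both are my assumptions, and encode_utf8 is a made-up name. The input conversion from Big5 is not shown, since that needs a mapping table or a library such as iconv or ICU.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: encode Unicode code points as a UTF-8 byte string.
// Does not reject surrogate code points (U+D800..U+DFFF).
std::string encode_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp <= 0x10FFFF) {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            throw std::runtime_error("code point out of Unicode range");
        }
    }
    return out;
}
```

The resulting std::string can be written to a std::ofstream opened with std::ios::binary, so the bytes reach the file unchanged.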
If you just want to read the file and copy its contents, then neither the encoding nor the kind of data itself makes any difference. You can just read it as a binary file and copy the contents to another binary file.
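A plain binary copy could be sketched like this. The function name copy_file_binary and the buffer size are arbitrary choices of mine. Nothing here looks at the encoding, so it works the same for Big5, UTF-8 or anything else.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: copy a file byte for byte, ignoring its encoding.
// Returns false if a file cannot be opened or a write fails.
bool copy_file_binary(const std::string& from, const std::string& to) {
    std::ifstream in(from, std::ios::binary);
    std::ofstream out(to, std::ios::binary);
    if (!in || !out) return false;

    char buffer[4096];
    // read() fails on the last partial block, so also check gcount().
    while (in.read(buffer, sizeof buffer) || in.gcount() > 0) {
        out.write(buffer, in.gcount());
    }
    out.flush();
    return static_cast<bool>(out);
}
```

Opening both streams with std::ios::binary matters on platforms that would otherwise translate line endings.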
If you want to do just about anything more complex than that, such as displaying it on the screen, then you're probably going to need Unicode, and none of the standard C or C++ functions is going to do the conversion for you.
In ASCII, each number represents a character.
Is Unicode similar to ASCII, in that each number in Unicode represents a character?
Below is my understanding. (Please kindly correct me if I got something wrong.)
If I want to read a file that contains Chinese characters, I first need to convert it to Unicode.
For example: 我的名字是小明。
I need to convert it into something like XXXXX YYYYY ZZZZZ (Unicode) in the file.
Then I read the Unicode into a string.
After doing some processing, I output the Unicode and save it into an output file.
Then I use software to convert it back to Chinese characters.
Could you suggest some software that can convert between Unicode and characters?