Hi, I've been struggling with this problem.
I searched Google for a long time but still can't figure this out.
I have a UTF-8 text file with Japanese phrases in it.
I tried to use wifstream to read the file into a wstring, but the string holds garbage instead of Japanese. Does anyone know how to do this?
Note that UTF-8 is made of 8-bit units, so the characters aren't wide. Reading the file straight into a wstring isn't going to work (though it might work if the text file is UTF-16). Also, if you're reading a text file, you probably shouldn't be opening it as binary.
Lastly -- even if you get it working, printing the string to the user isn't easy on some platforms (Windows console). wcout is effectively useless there -- you can't just feed it Unicode strings like you might expect.
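To make the point about 8-bit units concrete, here is a minimal sketch of the reading step: pull the file into a plain std::string and treat it as raw UTF-8 bytes, leaving the decoding for later. The function name read_file_bytes is made up for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: reads the whole file as bytes into a std::string.
// No decoding happens here; the result is still UTF-8.
std::string read_file_bytes(const std::string& path) {
    std::ifstream in(path);  // narrow stream, default text mode
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A Japanese character typically takes three bytes in UTF-8, so the string's size() will be larger than the number of characters you see in an editor. If the file can't be opened, this sketch just returns an empty string.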
Thanks for the reply.
Yes I shouldn't read as binary, forgot to take that out.
Anyway, I did notice that output to the console is difficult.
So I'm actually using Win32 API.
I shall update my first post.
I read the UTF-8 stuff but I'm still not sure how to do the decoding in C++.
Are there any examples?
Note it's a little long/complicated, but it also validates the string to make sure it's valid UTF-8, and it accounts for all edge cases I could think of. Nothing should trip it up -- should be very sturdy.
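The code posted with this reply isn't shown in the thread as it stands, so the following is not Disch's original, only a rough sketch of what a validating UTF-8 decoder can look like. It assumes the input is a std::string of UTF-8 bytes and produces UTF-16 code units. The name utf8_to_utf16 and the choice to throw on bad input are assumptions for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into UTF-16 code units.
// Throws std::runtime_error on malformed input (an assumed policy;
// substituting U+FFFD would be another reasonable choice).
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells us how long the sequence is.
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (in.size() - i < len)
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte must look like 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }

        // Reject overlong encodings, surrogates, and values above U+10FFFF.
        static const char32_t min_for_len[5] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_for_len[len] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above U+FFFF become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the result can be copied into a wstring with std::wstring w(u16.begin(), u16.end()) before handing it to the wide Win32 functions. If the file starts with a UTF-8 byte order mark (EF BB BF), it will decode to U+FEFF as the first unit and you may want to skip it. The Win32 function MultiByteToWideChar with CP_UTF8 is another route if you don't need portable code.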
Oh my god! It works!
I don't understand half of that code, but the conversion was perfect.
It even works with Chinese.
Thank you Disch, you saved this guy in distress.
I've been searching the Internet for a whole day and wasn't able to find a solution as awesome as this.