This has become a frequently asked question recently.
On Linux, and other systems that support UTF-8 at the console driver level, just print it:
    #include <iostream>

    int main()
    {
        // the three bytes of one UTF-8 encoded character
        char bytesArray[3] = {'\xE6', '\x88', '\x91'};
        // write the raw bytes; the console decodes them as UTF-8
        std::cout.write(bytesArray, sizeof bytesArray);
    }
demo:
http://ideone.com/BdRpfZ
On stricter systems, you have to set the locale to choose the correct encoding (after all, why default to UTF-8? It could just as well have been GB18030). Locale names are OS-dependent. I'm using the POSIX-style name of the US English UTF-8 locale below, but any UTF-8 locale would work the same way.
    #include <iostream>
    #include <locale>

    int main()
    {
        char bytesArray[3] = {'\xE6', '\x88', '\x91'};
        // make a UTF-8 locale the global one (the name is OS-dependent)
        std::locale::global(std::locale("en_US.utf8"));
        // std::cout was constructed earlier, so it has to be imbued explicitly
        std::cout.imbue(std::locale());
        std::cout.write(bytesArray, sizeof bytesArray);
    }
That is how C++ is supposed to work with Unicode (and C as well, for that matter: printf() and scanf() deal in multibyte sequences).
Now, on systems that did not bother implementing Unicode for their console output, you have to convert from UTF-8 to a wide string, and then output it using the standard wide-character functionality (which also requires a locale to be set).
You can do it the C++11 way:
    #include <iostream>
    #include <locale>
    #include <codecvt>
    #include <string>

    int main()
    {
        char bytesArray[] = {'\xE6', '\x88', '\x91'};
        // convert the UTF-8 bytes to a wide string
        std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
        std::wstring wide = conv.from_bytes(bytesArray,
                                            bytesArray + sizeof bytesArray);
        // wide output also needs a locale to convert back for the terminal
        std::locale::global(std::locale("en_US.utf8"));
        std::wcout.imbue(std::locale());
        std::wcout << wide << '\n';
    }
(tested with clang++ on Linux)
Or the C way:
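    #include <iostream>
    #include <locale>
    #include <cwchar>

    int main()
    {
        // setting the global C++ locale also sets the C locale used by mbrtowc
        std::locale::global(std::locale("en_US.utf8"));
        std::wcout.imbue(std::locale());
        char bytesArray[] = {'\xE6', '\x88', '\x91'};
        std::mbstate_t state = std::mbstate_t();
        const char* end = bytesArray + sizeof bytesArray;
        const char* ptr = bytesArray;
        wchar_t wc;
        while (ptr < end)
        {
            // returns the byte count consumed, 0 for a null character,
            // (size_t)-1 on an invalid or (size_t)-2 on an incomplete sequence
            std::size_t len = std::mbrtowc(&wc, ptr, end - ptr, &state);
            if (len == 0 || len == static_cast<std::size_t>(-1)
                         || len == static_cast<std::size_t>(-2))
                break;
            std::wcout << wc;
            ptr += len;
        }
    }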
(tested with gcc on Linux)
On Windows, you can use either the C++11 way or the C way, but you also have to enable wide-character output on the console using its special non-portable method:
    #include <iostream>
    #include <codecvt>
    #include <string>
    #include <fcntl.h>
    #include <io.h>

    int main()
    {
        char bytesArray[] = {'\xE6', '\x88', '\x91'};
        // convert the UTF-8 bytes to a wide string
        std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> conv;
        std::wstring wide = conv.from_bytes(bytesArray,
                                            bytesArray + sizeof bytesArray);
        // Windows-specific: switch stdout to wide (UTF-16) text mode
        _setmode(_fileno(stdout), _O_WTEXT);
        std::wcout << wide << '\n';
    }
Tested with Visual Studio 2012, but I remember this working with 2010 as well. Note that the default console fonts on most Windows installations do not include those characters. Either install a font that has them, or just print your output to a file, which you can then open with Notepad (in that case you'll need more than just one Chinese character for its autodetection to realize it's dealing with Unicode).