The C and C++ standards intend wchar_t to be wide enough to hold any code point of the largest supported character set (that's 32 bits if you're talking Unicode). Microsoft adopted a 16-bit wchar_t back in 1995-ish, I guess, when Unicode was still a 16-bit encoding, and they never widened it.
Today, C++ has four kinds of strings:
std::string (can store ASCII, ISO 8859-x, UTF-8, GB18030, or any other single-byte or multibyte encoding, as long as the storage format uses bytes)
std::u16string (can store UTF-16, UCS-2, or any other 16-bit encoding)
std::u32string (can store UTF-32/UCS-4, or any other 32-bit encoding if one exists)
std::wstring (was *supposed* to store UCS-4 or any other 32-bit encoding, and does so on Linux/Unix, but actually stores UTF-16 on Windows for backwards-compatibility reasons)
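To make the difference between the four types concrete, here is a small sketch that stores the same character (U+1F600, which lies outside the 16-bit range) in each of them and counts code units. The variable names and the helper code_unit_bits are made up for this illustration; the UTF-8 bytes are spelled out by hand so that the snippet does not depend on how u8 literals are typed in a given language version.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: number of bits in one code unit of a string type.
template <typename String>
constexpr std::size_t code_unit_bits() {
    return sizeof(typename String::value_type) * CHAR_BIT;
}

// These are guaranteed everywhere; the width of wchar_t is not.
static_assert(code_unit_bits<std::u16string>() >= 16, "char16_t holds a UTF-16 code unit");
static_assert(code_unit_bits<std::u32string>() >= 32, "char32_t holds any code point");

// The same character, U+1F600, in each string type.
const std::string    narrow = "\xF0\x9F\x98\x80"; // UTF-8: 4 bytes
const std::u16string utf16  = u"\U0001F600";      // UTF-16: surrogate pair, 2 units
const std::u32string utf32  = U"\U0001F600";      // UTF-32: 1 unit
const std::wstring   wide   = L"\U0001F600";      // 1 or 2 units, platform-dependent

// Expected number of wchar_t units: 2 where wchar_t is 16 bits (Windows),
// 1 where it is 32 bits (typical Linux/Unix).
inline std::size_t expected_wide_units() {
    return code_unit_bits<std::wstring>() == 16 ? 2u : 1u;
}

inline void check_code_unit_counts() {
    assert(narrow.size() == 4);
    assert(utf16.size() == 2);
    assert(utf32.size() == 1);
    assert(wide.size() == expected_wide_units());
}
```

Only the std::wstring count changes between platforms, which is why code that assumes one wchar_t per character tends to break when moved between Windows and Linux.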
I haven't programmed on Windows, but as far as I know,
L"Wide string literal";
is what every Windows API expects in so-called "Unicode" mode.
What's the proper portable way to convert between MS Unicode/Wide character strings, UTF-16, and UTF-8?
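MS wide strings are already UTF-16LE. To go to UTF-8, you can use the Windows API WideCharToMultiByte (with CP_UTF8) or C++11's std::codecvt_utf8_utf16<wchar_t> (very easy to use if wrapped in std::wstring_convert; Visual Studio has supported that since VS 2010). Note that std::codecvt_utf8<wchar_t> treats a 16-bit wchar_t as UCS-2, so it cannot handle surrogate pairs, and that the <codecvt> facets and std::wstring_convert were deprecated in C++17.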
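As a rough sketch of the std::wstring_convert route, the functions below convert between std::u16string and UTF-8 in a std::string. The function names (utf16_to_utf8, utf8_to_utf16, self_check) are hypothetical, and char16_t is used instead of wchar_t so that the same code means the same thing on Linux and Windows. Keep in mind that these facilities are deprecated since C++17, so a compiler may warn about them.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// UTF-16 -> UTF-8. Throws std::range_error on malformed input
// (for example, an unpaired surrogate).
inline std::string utf16_to_utf8(const std::u16string& s) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(s);
}

// UTF-8 -> UTF-16. Throws std::range_error on malformed input.
inline std::u16string utf8_to_utf16(const std::string& s) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.from_bytes(s);
}

// Small self-check: the euro sign (U+20AC) is three bytes in UTF-8, and
// U+1F600 is a surrogate pair in UTF-16 but four bytes in UTF-8.
inline void self_check() {
    assert(utf16_to_utf8(u"\u20AC") == "\xE2\x82\xAC");
    assert(utf16_to_utf8(u"\U0001F600") == "\xF0\x9F\x98\x80");
    assert(utf8_to_utf16("\xF0\x9F\x98\x80") == u"\U0001F600");
}
```

On Windows, where std::wstring already holds UTF-16, the same pattern is commonly written with std::codecvt_utf8_utf16<wchar_t> and std::wstring instead. That substitution is not portable, because on platforms with a 32-bit wchar_t a std::wstring is not expected to contain UTF-16.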