Encode Char to UTF8

Jul 19, 2012 at 9:29am
Hi! My question is simple. I need to encode a char array like "áéí" in UTF8. For example, 'á' must be converted to "á".

Thanks in advance!
Jul 19, 2012 at 10:09am
It's not clear what you mean. You want to change some text in some other encoding to UTF8?
Jul 19, 2012 at 10:50am
If you use C++11, char is by default UTF8
Jul 19, 2012 at 10:57am
@viliml
That's not true.
Jul 19, 2012 at 11:03am
I have a char array with accented characters. I want to encode it as UTF8.

List of characters => their UTF8 bytes (as displayed in a Windows-1252 viewer)

á => á
À => À
ä => ä
é => é
è => è
É => É
ê => ê
æ => æ
í => Ã­ (the second byte, 0xAD, is an invisible soft hyphen)
ó => ó
Ó => Ó
ö => ö
ú => ú
ü => ü
ñ => ñ
Ñ => Ñ
ç => ç
Jul 19, 2012 at 11:30am
I think what viliml meant was that in C++11 you can write UTF8 encoded strings as u8"I'm a UTF-8 string." This is done at compile time, so it will not help you if you want to encode the strings at runtime.

If you want to convert the string to UTF8 (at runtime) you first have to know what encoding the original string is using.
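
As a minimal sketch of such a runtime conversion, assuming the original string is ISO-8859-1 (Latin-1), where every byte value equals its Unicode code point. The function name latin1_to_utf8 is made up for illustration, and the code sticks to C++03 so that older compilers can build it. Windows-1252 differs from Latin-1 in the range 0x80 to 0x9F, which this sketch does not handle.

```cpp
#include <string>

// Convert ISO-8859-1 (Latin-1) bytes to UTF-8.
// Assumption: the input really is Latin-1, so each byte is a code point
// in the range U+0000..U+00FF.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);            // worst case: every byte doubles
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);   // ASCII is unchanged in UTF-8
        } else {
            // Two-byte form: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Called on the single Latin-1 byte 0xE1 ('á'), this returns the two bytes 0xC3 0xA1, which is what a Latin-1 viewer shows as "á".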
Jul 19, 2012 at 11:50am
Wikipedia wrote:
For the purpose of enhancing support for Unicode in C++ compilers, the definition of the type char has been modified to be both at least the size necessary to store an eight-bit coding of UTF-8 and large enough to contain any member of the compiler's basic execution character set. It was previously defined as only the latter.

And:
Wikipedia wrote:
u8"I'm a UTF-8 string."
u"This is a UTF-16 string."
U"This is a UTF-32 string."

The type of the first string is the usual const char[]. The type of the second string is const char16_t[]. The type of the third string is const char32_t[].
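
A small sketch of what the three prefixes quoted above produce for one accented character. The constant names are made up for illustration. The character is written as the universal character name \u00E1 so the result does not depend on how the source file is saved. The quoted description of u8 as const char[] applies to C++11 through C++17; C++20 changed it to const char8_t[], which still has one-byte elements, so the sizes below should hold either way.

```cpp
#include <cstddef>

// U+00E1 (a with acute accent) in each of the three Unicode literal forms.
// sizeof counts the terminating NUL code unit as well.
const std::size_t utf8_bytes  = sizeof(u8"\u00E1");  // 2 code units + NUL, 1 byte each
const std::size_t utf16_bytes = sizeof(u"\u00E1");   // 1 code unit + NUL, char16_t each
const std::size_t utf32_bytes = sizeof(U"\u00E1");   // 1 code unit + NUL, char32_t each

// The two UTF-8 code units that encode U+00E1.
const unsigned char utf8_first  = static_cast<unsigned char>(u8"\u00E1"[0]);
const unsigned char utf8_second = static_cast<unsigned char>(u8"\u00E1"[1]);
```

The two UTF-8 code units are 0xC3 and 0xA1, the same pair that shows up as "á" when viewed as Latin-1.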

As you see, the regular type char has UTF8 encoding
Last edited on Jul 19, 2012 at 11:53am
Jul 19, 2012 at 1:22pm
viliml wrote:
As you see, the regular type char has UTF8 encoding

It doesn't say that. It just says you can use char to store UTF8, but you can use char for other encodings as well.
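
To illustrate the point, here is a rough sketch that stores the same character in a char array under two encodings and dumps the bytes. The helper to_hex and the array names are made up for this example, and the bytes are spelled out with hex escapes so the source file's own encoding does not matter.

```cpp
#include <string>

// Render each byte of a char string as two uppercase hex digits,
// separated by single spaces.
std::string to_hex(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}

// The same character, U+00E1, held in plain char arrays.
const char a_acute_latin1[] = "\xE1";      // ISO-8859-1: one byte
const char a_acute_utf8[]   = "\xC3\xA1";  // UTF-8: two bytes
```

Both arrays have element type char; nothing in the type records which encoding the bytes are in.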
Last edited on Jul 19, 2012 at 1:23pm
Jul 19, 2012 at 1:48pm
And how do you choose which encoding to use?
Jul 19, 2012 at 3:59pm
I need to encode a simple char array in UTF8

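int main() {
    // Size left for the compiler to deduce: in a UTF-8 source file each 'ó'
    // takes 2 bytes, so a fixed size of 10 would be too small.
    char test[] = "Hellóóóó";

    // Here encode test in utf8

    return 0;
}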
Jul 19, 2012 at 5:37pm
char test[] = "Hellóóóó";
//Here encode test in utf8

On many platforms it is already in UTF-8, typically because the source file is saved as UTF-8 and the compiler keeps those bytes unchanged; see ideone.com's Linux for example: http://ideone.com/iSKK2
In other cases, there are plenty of platform-specific means to do that conversion, or, in C++11, standard means as well. But, as already pointed out, adding a u8 before the opening " enforces that on all platforms/environments, if you have C++11 support.

If you don't, then try to give as much detail as possible about your platform and compiler.
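
One way to check whether a string is already UTF-8 on a given platform is to test its bytes for well-formedness. The following is only a sketch, with the function name is_valid_utf8 chosen for illustration. A string that passes is not proven to be UTF-8, but accented text in a single-byte encoding such as Latin-1 will normally fail.

```cpp
#include <cstddef>
#include <string>

// Return true if `s` is well-formed UTF-8: correct lead and continuation
// bytes, no overlong forms, no surrogates, nothing above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;      // continuation bytes expected after the lead
        unsigned long cp;       // code point being assembled
        unsigned long lowest;   // smallest code point allowed for this length
        if (c < 0x80)                { extra = 0; cp = c;        lowest = 0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; lowest = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; lowest = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; lowest = 0x10000; }
        else return false;      // stray continuation byte or invalid lead

        if (n - i - 1 < extra) return false;             // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;       // not 10xxxxxx
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < lowest) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode
        i += extra + 1;
    }
    return true;
}
```

Passing the test array to this function would show whether the compiler stored the accented characters as UTF-8 or in some single-byte code page.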
Last edited on Jul 19, 2012 at 5:40pm
Jul 23, 2012 at 7:50am
I'm using VS 2008 on Win32. Do you need more info?

Thanks in advance