Hexadecimal to UTF-8

Hello,

I have a char with the value C5. I know that C5 in hex corresponds to the value 197 in decimal. This number (197) corresponds to a character in extended ASCII. I need to show (printf) or convert C5 to its corresponding UTF-8 character.

By the way I am doing it in C.

Can anyone help me out?
To use printf in C to show the extended character, you could do it like this:

char star = '\xC5'; // Just to show a char with the C5 value
printf("%c \n\n", star); // prints the symbol
Are you trying to print U+00C5, Å? The proper way to do this is to use wide character I/O.

In C,

#include <wchar.h>
#include <locale.h>
int main()
{
    setlocale(LC_ALL, "");
    wchar_t c = L'\u00c5'; // or = L'\xc5';
    wprintf(L"%lc\n", c);
}


online demo: http://ideone.com/tpM2a

Note that this does not convert it to UTF-8. To convert to UTF-8 in C, use wide-to-multibyte conversion, provided you are using a platform that supports UTF-8 locales (e.g. Linux, but not Windows).

#include <stdlib.h>
#include <locale.h>
#include <stdio.h>
int main()
{
    setlocale(LC_ALL, "en_US.utf8"); // or any other .utf8 locale
    wchar_t c = L'\u00c5'; // or = L'\xc5';
    char mb[MB_CUR_MAX + 1];
    int len = wctomb(mb, c);
    if (len < 0)
        return 1; // c has no representation in the current locale (e.g. setlocale failed)
    mb[len] = '\0';
    printf("UTF-8 char: %s\n", mb);
}


online demo: http://ideone.com/BoENC
Thank you all.

Cubbi: I am sorry for my ignorance but I do not fully understand the need to build wchar_t c and also how to build it for other hex expressions.

Thanks a lot so far.
197 is not a valid value for a char on most systems (since on most systems, CHAR_MAX == 127), while wchar_t is the type capable of holding any character, including yours.
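To make the CHAR_MAX point concrete, here is a minimal sketch. The function names are made up for illustration, and it assumes 8-bit bytes. It shows what the byte 0xC5 looks like through a plain char compared with an unsigned char.

```c
#include <limits.h>

/* Nonzero if plain char is a signed type on this platform. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The byte 0xC5 stored in a plain char, then widened to int.
   Where char is signed this is typically -59, not 197. */
int as_plain_char(void)
{
    char c = '\xC5';
    return c;
}

/* The same byte stored in an unsigned char: always 197. */
int as_unsigned_char(void)
{
    unsigned char c = 0xC5;
    return c;
}
```

So the bit pattern fits in a char either way, but where char is signed the numeric value 197 does not, which is one reason wchar_t is the safer type for a character code.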

Give an example of "other hex expression"
Say for example A1
You could just replace "c5" with "a1" in the examples above: http://ideone.com/7KEOx (printing as-is) and http://ideone.com/rHeAV (converting to UTF-8)
Thanks a lot
If I could get some extra help.
Let's say I have the following string "DDD %C5 ir".
I want to print it, but replace %C5 with the corresponding UTF-8 character. The % is just there to mark where the hex value starts.
Assuming you can do string processing in C (which is somewhat tedious, compared to C++), it shouldn't be any more difficult than the samples above, but I am not entirely sure you're stating your goal with sufficient detail:

Do you want to print that string, showing the character Å (U+00C5) instead of the three-character fragment "%C5", or do you want to create a new string that holds the two-byte UTF-8 representation of that character ("\xc3\x85") instead of the three-character fragment "%C5", and which would, therefore, display Å when printed on a UTF-8 enabled terminal?
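For reference, the two bytes "\xc3\x85" come from the standard UTF-8 bit layout for code points between U+0080 and U+07FF. A rough sketch of that encoding step, without going through the locale machinery, might look like this (utf8_encode_small is a hypothetical helper name, and it does not check that its input is below U+0800):

```c
#include <stddef.h>

/*
 * Encodes a code point below U+0800 as UTF-8 into out, which must have
 * room for at least 3 bytes. Returns the number of bytes written, not
 * counting the terminating '\0'. Larger code points are NOT handled.
 */
size_t utf8_encode_small(unsigned cp, char *out)
{
    size_t n = 0;
    if (cp < 0x80) {
        out[n++] = (char)cp;                    /* 0xxxxxxx: plain ASCII */
    } else {
        out[n++] = (char)(0xC0 | (cp >> 6));    /* 110xxxxx: top 5 bits */
        out[n++] = (char)(0x80 | (cp & 0x3F));  /* 10xxxxxx: low 6 bits */
    }
    out[n] = '\0';
    return n;
}
```

With cp = 0xC5, the top bits give 0xC0 | 0x03 = 0xC3 and the low six bits give 0x80 | 0x05 = 0x85, which matches the two-byte sequence quoted above.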
Hello,
Thank you for coming back.
I want to print that string, showing the character Å (U+00C5) instead of the three-character fragment "%C5".
Can you help, pls
Something like this could work:

#include <wchar.h>
#include <locale.h>
#include <stdlib.h>

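// Prints str, replacing each "%XX" with the character whose code is hex XX.
void show(char* str)
{
    while(str && *str)
        if(*str != '%')
            putwchar(btowc((unsigned char)*str++)); // ordinary byte: widen and print
        else
            putwchar(strtoul(str+1, &str, 16)); // parse the hex digits after '%', advance str past them
}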

int main()
{
    setlocale(LC_ALL, "");
    char str[] = "DDD %C5 ir";
    show(str);
} 

online demo: http://ideone.com/jVGsW
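One caveat with strtoul here: it consumes every hex digit that follows the '%', so an input such as "%C5ab" would be read as 0xC5AB rather than 0xC5 followed by "ab". If the escapes are always exactly two hex digits, a stricter variant could look like the sketch below. The names percent_to_wide and hex_value are hypothetical, and the code assumes that wchar_t values are Unicode code points (as on Linux with glibc) and that the other bytes in the string are plain ASCII.

```c
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>

/* Value of one hex digit; the caller has already checked isxdigit(). */
static int hex_value(unsigned char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return ch - 'A' + 10;
}

/*
 * Copies src into dst as wide characters, replacing each "%XX" (exactly
 * two hex digits) with the single character whose code is 0xXX.
 * A '%' that is not followed by two hex digits is copied unchanged.
 * dst needs room for strlen(src) + 1 wide characters.
 * Returns the number of wide characters written, not counting the L'\0'.
 */
size_t percent_to_wide(const char *src, wchar_t *dst)
{
    size_t n = 0;
    while (*src) {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            dst[n++] = (wchar_t)(hex_value((unsigned char)src[1]) * 16
                               + hex_value((unsigned char)src[2]));
            src += 3;  /* skip '%' and both digits */
        } else {
            dst[n++] = (wchar_t)(unsigned char)*src++;
        }
    }
    dst[n] = L'\0';
    return n;
}
```

The result can then be printed after setlocale(LC_ALL, "") with wprintf(L"%ls\n", dst), in the same way as the wide-character examples earlier in the thread.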
Thank you very much. It did work.
All the best