How to keep string literals "u8"

May 28, 2020 at 2:32am
In ImGui, and probably in general, UTF-8 string literals must be preceded by the prefix 'u8'. For example, consider the string literal u8"こんにちは!テスト %d" found at: https://github.com/ocornut/imgui/wiki/Loading-Font-Example.

Now, I would like to use .po files to implement my multilingual support, and under such a scheme, I would attempt calls like this: ImGui::Text(Translate("Hello! test"), 123), where the translation function would hopefully return u8"こんにちは!テスト %d" instead of just "こんにちは!テスト %d".

I believe I'm constrained by the following:
(1) PO files contain no u8 prefixes anywhere within the file; all strings are simply enclosed by double quotes
(2) I have no access to the translation function, which is provided by some library.

Within these constraints, how do I ensure that the translated string returned by the library function is still u8, and thereby remains compatible with ImGui?

(I'm new to cpp and this is just a tinkering sort of project, so my question might not make very much sense, and I apologize in advance for that. I also tried searching these forums for "string literal", but the search function appears to be offline at the moment.)
May 28, 2020 at 4:58am
You need to open the file in UTF-8 mode, put a BOM at the start of the file, and write the UTF-8 string to it.
To read from the file into a u8 variable, define either a char8_t array or a std::u8string.

In any case, everything needs to be wide (file streams, string streams, I/O streams and so on), whatever "wide" means for your OS.
May 28, 2020 at 8:33am
"u8" has no particular meaning at run time. Basically it ensures that the string literal is compiled as a UTF-8 encoded 8-bit string literal (as opposed to "L", which produces a wide one).

A file does not contain string literals (the compiler has nothing to do with it). Thus it is up to you to ensure that the file contains UTF-8 text.
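To illustrate the point, here is a minimal sketch showing that a u8 literal and the same text assembled from raw bytes (as it would arrive from a UTF-8 file) hold identical bytes. The function name same_bytes_as_u8_literal is made up for this example, and the comparison goes through memcmp so that it should compile both before C++20 (where u8 literals are char arrays) and in C++20 (where they are char8_t arrays).

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper (not from any library): returns true when the bytes of
// a u8 literal equal the same character written out as raw UTF-8 bytes.
bool same_bytes_as_u8_literal() {
    // U+3053 (HIRAGANA LETTER KO) as a universal character name, so the
    // result does not depend on the encoding of this source file.
    const auto& literal = u8"\u3053";  // 3 UTF-8 bytes + terminating zero

    // The same character as the three bytes a UTF-8 file would contain.
    const std::string from_file = "\xE3\x81\x93";
    assert(from_file.size() == 3);

    // memcmp takes const void*, so it accepts char and char8_t arrays alike.
    return sizeof(literal) - 1 == from_file.size() &&
           std::memcmp(literal, from_file.data(), from_file.size()) == 0;
}
```

If this returns true, a char* that points at UTF-8 bytes is as good as a u8 literal for an API that only looks at the bytes.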
May 28, 2020 at 1:28pm
What do you get back from it without doing anything? What is the return type of Translate(...)? Does it have overloads or other versions?
Last edited on May 28, 2020 at 1:29pm
May 28, 2020 at 7:04pm
@malibor, @coder777: I'll make sure that the .po files are in utf-8 and that only wide variant tools/variables are used when dealing with such files.

@jonnin: I've yet to test it (since I'm still working through other parts of the program), but I searched and saw that the return type is "char *" (the actual name of the translation function is GNU "gettext"). I suppose I could wrap a function around this return value from gettext and turn it into whatever type is necessary?

I say this because I noticed that in the original code, while most strings are u8"" literals, some are L"" literals. Consequently, when I use .po files and gettext() for translation, if the result is already compatible with "u8", then I might need to convert such a "u8" result to an "L" string? I've yet to find a way to do so (i.e., convert "u8" to "L").

A quick look at Google search results indicates that I haven't made much progress on understanding UTF-8-related strings. However, I decided to leave a quick reply here in case there are simple answers that will either show how nonsensical my questions are or point to relevant readings/solutions. Finally, thanks to everyone who's already responded.
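The wrapper idea could look roughly like the sketch below. It assumes GNU gettext's libintl.h is available. The names use_utf8_catalog and Translate are hypothetical, as are the domain and directory passed in; bindtextdomain, bind_textdomain_codeset, textdomain and gettext are the real libintl calls. The point of bind_textdomain_codeset is to ask gettext to hand back UTF-8 whatever the locale's own encoding is.

```cpp
#include <cassert>
#include <libintl.h>

// Hypothetical setup helper: selects the message catalog for `domain` under
// `locale_dir` and asks gettext to return its messages encoded as UTF-8.
void use_utf8_catalog(const char* domain, const char* locale_dir) {
    assert(domain != nullptr && locale_dir != nullptr);
    bindtextdomain(domain, locale_dir);        // where the .mo files live
    bind_textdomain_codeset(domain, "UTF-8");  // output encoding of gettext()
    textdomain(domain);                        // default domain for gettext()
}

// Hypothetical wrapper: returns gettext's result as a read-only C string.
// When no translation is found, gettext returns the msgid itself.
const char* Translate(const char* msgid) {
    assert(msgid != nullptr);
    return gettext(msgid);
}
```

With something like this in place, a call such as ImGui::Text(Translate("Hello! test"), 123) would receive plain UTF-8 bytes, provided the catalog has been compiled from a UTF-8 .po file and the program has selected a locale.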
Last edited on May 28, 2020 at 7:12pm
May 28, 2020 at 7:32pm
Most normal C++ compilers treat char as an 8-bit type, so you may already be fine (seems likely).
std::string s = (your char*); // turning the char* into a C++ object may be useful
Or leave it be and use it as a C string. The C tools work in C++, and if your needs are simple enough, the extra copy into a string object may or may not be worthwhile, especially if you turn around and cast it back to char* for another library call right after you get it.
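A small sketch of the copy-into-a-string option, in case it helps. The name copy_translation is made up for this example; the only assumption is that the pointer is a valid, zero-terminated C string such as the one gettext returns.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: copies a C string (e.g. the result of gettext) into a
// std::string so that the program owns its own copy of the bytes.
std::string copy_translation(const char* translated) {
    // Constructing std::string from a null pointer is undefined behaviour.
    assert(translated != nullptr);
    return std::string(translated);
}
```

Calling s.c_str() on the result gives back a const char* for any library call that wants a C string, without any cast.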


Last edited on May 28, 2020 at 7:33pm
Jun 2, 2020 at 7:09am
"if the result is already compatible with "u8", then I might need to convert such a "u8" result to a "L" literal?"
Yes, you need MultiByteToWideChar(CP_UTF8, ...). See:

https://docs.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar
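MultiByteToWideChar is Windows-only. For comparison, here is a portable sketch of the same UTF-8 to wide conversion written by hand. The function name utf8_to_wide is hypothetical. It emits UTF-16 surrogate pairs when wchar_t is 16 bits (as on Windows) and one wchar_t per code point otherwise, and it only does minimal validation (it does not reject overlong encodings, for instance).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 bytes into a std::wstring.
// Throws std::invalid_argument on truncated or malformed input.
std::wstring utf8_to_wide(const std::string& utf8) {
    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t: split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    assert(i == utf8.size());  // every byte was consumed exactly once
    return out;
}
```

On Windows, the API linked above is probably the better choice; this sketch is mainly meant to show what the conversion does.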
Jun 2, 2020 at 11:01am
@coder777

The OP is working with Linux code.
Jun 8, 2020 at 5:57pm
@jonnin, @coder777, @malibor Thanks very much, and sorry for the delayed response.
(@malibor I'm using gettext on Windows; sorry for not being clear previously.)
Last edited on Jun 13, 2020 at 6:59am