Hello,
I recently switched over to Ubuntu because of its support for Unicode. I am trying to create a program that will handle Arabic characters the same way that English ASCII characters are handled: for example, parsing, string length, string manipulation, concatenation, etc.
I am using the Anjuta IDE and C++ to accomplish this.
I am still trying to overcome the primary hurdles such as outputting Arabic characters to the console:
Here is some sample code that I wrote, which declares a string of Arabic and displays it one letter at a time.
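#include <iostream>
#include <string>
using namespace std;

int main()
{
    string verb = "فعل";
    // Print the whole string in one go.
    cout << verb << "\n";
    // Print one element at a time; verb[i] is a single char (one byte).
    for (string::size_type i = 0; i < verb.length(); i++)
        cout << verb[i] << "\n";
    return 0;
}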
The output of this program is:
ل ع ف
�
�
�
----------------------------------------------
Program exited successfully with errcode (0)
Press the Enter key to close this terminal ...
I have a couple of questions:
1. The string is printed out backwards and unconnected from the first cout statement. How can I enable the compiler to read languages that are written from right to left properly?
2. Also when I looped through the elements of the string and printed out the character in position i, the output is a question mark. How can I get the compiler to output the characters as if it was an ASCII string?
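2. Unicode is not ASCII: you need multiple bytes to represent a character. Moreover, in UTF-8 different characters take different numbers of bytes (each of these Arabic letters takes two), so verb[i] gives you a single byte rather than a whole letter, and a lone byte of a multi-byte sequence is not printable on its own.
The current C++ standard doesn't have any tools for handling Unicode.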
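To illustrate the multi-byte point, here is a minimal sketch that walks a UTF-8 std::string one code point at a time instead of one byte at a time. It assumes the source file and the terminal both use UTF-8. The names utf8_sequence_length and split_utf8 are made up for this example and are not part of any library. It also does no real validation, and it splits on code points, so combining marks (such as Arabic vowel marks) would come out as separate pieces.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with lead byte b. Returns 1 for ASCII and also for unexpected bytes,
// so the caller always moves forward.
std::size_t utf8_sequence_length(unsigned char b)
{
    if ((b & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
    if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx: 2-byte sequence (Arabic letters land here)
    if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx: 3-byte sequence
    if ((b & 0xF8) == 0xF0) return 4; // 11110xxx: 4-byte sequence
    return 1;                         // stray continuation byte or invalid lead byte
}

// Hypothetical helper: split UTF-8 text into one std::string per code point.
std::vector<std::string> split_utf8(const std::string& text)
{
    std::vector<std::string> pieces;
    std::size_t i = 0;
    while (i < text.size()) {
        std::size_t len = utf8_sequence_length(static_cast<unsigned char>(text[i]));
        if (i + len > text.size())
            len = text.size() - i;    // truncated sequence at the end of the input
        pieces.push_back(text.substr(i, len));
        i += len;
    }
    return pieces;
}
```

With these in place, the loop in the original program could iterate over split_utf8(verb) and print each element. verb.length() would still report the number of bytes, while split_utf8(verb).size() gives the number of code points.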
Agreed. Outputting them to the console might be tricky. I'm not sure how native programs accomplish it--you might want to snoop around the Ubuntu source.
I would recommend Pango to render text in GUIs (Firefox uses it).