How will the behaviour of multibyte char differ because of different LC_CTYPE locale

I am comparing two multibyte characters on two platforms that have different LC_CTYPE settings, and the comparison returns a different value on each.

One of the variables is sigma, initialised to "\317\203",
and the other is the empty string, i.e. "".

Below is the scenario of the two platforms:

In AIX:
LC_CTYPE="en_GB"

In Linux:
LC_CTYPE="en_US.UTF-8"

So what could cause a function to return different values on these two platforms?

PanGalactic (1658)

Have you looked at the code in a debugger to see what the comparison is doing?

tauqeer (3)

The comparison calculates the length of each character and then subtracts one from the other, i.e.

cmpChar = len(sigma) - len("")

The length is calculated with the mbrlen() function, and that is the call affected by the different locales.

How should I proceed to get the same value on both platforms? Please suggest.
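
For readers following along, the comparison described in this post might look roughly like the sketch below. The names first_char_len and cmp_char are hypothetical and are not taken from the original code; only mbrlen() is named in the thread. It is a minimal reconstruction under those assumptions, not the poster's actual implementation.

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <cwchar>

// Hypothetical helper: byte length of the first multibyte character in s,
// as reported by mbrlen() under the currently active LC_CTYPE locale.
// Returns 0 for the empty string and -1 if mbrlen() reports an invalid
// or incomplete sequence.
int first_char_len(const char* s) {
    assert(s != nullptr);
    std::mbstate_t state{};  // fresh conversion state for each call
    // Pass strlen + 1 so the terminating null byte is visible to mbrlen().
    std::size_t n = std::mbrlen(s, std::strlen(s) + 1, &state);
    if (n == static_cast<std::size_t>(-1) || n == static_cast<std::size_t>(-2)) {
        return -1;
    }
    return static_cast<int>(n);
}

// Hypothetical reconstruction of the comparison: difference of the two lengths.
int cmp_char(const char* a, const char* b) {
    return first_char_len(a) - first_char_len(b);
}
```

A caller would normally run std::setlocale(LC_CTYPE, "") first so that the environment's locale takes effect; without that call a program starts in the "C" locale. With a UTF-8 locale active, cmp_char("\317\203", "") would be expected to give 2, and with a single-byte locale it would be expected to give 1, which matches the kind of difference reported here.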

PanGalactic (1658)

From the man page:

NOTES
The behavior of mbrlen() depends on the LC_CTYPE category of the current locale.

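With LC_CTYPE set to "en_GB" (a non-UTF-8 locale, typically a single-byte encoding such as ISO8859-1), every byte is treated as a character of its own, so mbrlen() reports a length of 1 for the first byte of "\317\203". With "en_US.UTF-8", the same two bytes form a single character (σ) and mbrlen() reports 2. The empty string gives 0 in both cases, which is why the difference comes out as 1 on one platform and 2 on the other.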
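
To get the same value on both platforms, one option is to select the same UTF-8 locale on each with setlocale() before calling mbrlen(); the locale name to use is platform-specific. Another option, if the data is known to be UTF-8, is to avoid the locale entirely and read the length from the lead byte. The sketch below shows the second approach. The name utf8_char_len is hypothetical, and this is a replacement for mbrlen() under the assumption of UTF-8 input, not something suggested in the thread itself.

```cpp
#include <cassert>

// Hypothetical helper: byte length of the UTF-8 sequence starting at s,
// derived from the lead byte alone, so the result does not depend on LC_CTYPE.
// Returns 0 for the empty string and -1 for a byte that cannot start a sequence.
// Assumes the input is UTF-8; continuation bytes are not validated.
int utf8_char_len(const char* s) {
    assert(s != nullptr);
    const unsigned char lead = static_cast<unsigned char>(s[0]);
    if (lead == 0x00) return 0;            // terminating null: empty string
    if (lead < 0x80) return 1;             // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx: two-byte sequence
    if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx: three-byte sequence
    if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx: four-byte sequence
    return -1;                             // continuation byte or invalid lead
}
```

With this helper, utf8_char_len("\317\203") - utf8_char_len("") evaluates to 2 regardless of the LC_CTYPE setting, because 0317 (0xCF) is a two-byte lead. The trade-off is that malformed input is not detected beyond the lead byte.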
