function
<cwchar>
mbrlen
size_t mbrlen (const char* pmb, size_t max, mbstate_t* ps);
Get length of multibyte character
Returns the size in bytes of the multibyte character pointed to by pmb, examining at most max bytes.
The function uses (and updates) the shift state described by ps. If ps is a null pointer, the function uses its own internal shift state, which is altered as necessary only by calls to this function.
A call to the function with a null pointer as pmb resets the shift state (and ignores parameter max).
The behavior of this function depends on the LC_CTYPE category of the selected C locale.
This is the restartable version of mblen (<cstdlib>).
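As a minimal sketch of the behavior described above, the following helper walks a null-terminated string one multibyte character at a time with an explicit state object. The name count_mb_chars and its error convention are hypothetical, chosen only for illustration.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the multibyte characters in the
   null-terminated string s under the current LC_CTYPE locale.
   Returns (size_t)-1 if an invalid or incomplete sequence is found. */
size_t count_mb_chars (const char* s)
{
  mbstate_t state;
  size_t remaining = strlen (s);   /* bytes left, terminator excluded */
  size_t count = 0;

  memset (&state, 0, sizeof state);   /* initial conversion state */

  while (remaining > 0) {
    size_t len = mbrlen (s, remaining, &state);
    if (len == (size_t)-1 || len == (size_t)-2)
      return (size_t)-1;              /* invalid or truncated sequence */
    if (len == 0)
      break;                          /* null character reached */
    s += len;                         /* skip the bytes of this character */
    remaining -= len;
    ++count;
  }
  return count;
}
```

In the default "C" locale, count_mb_chars ("test string") yields 11, since every character occupies one byte.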
Parameters
- pmb
- Pointer to the first byte of a multibyte character.
Alternatively, the function may be called with a null pointer, in which case it resets the shift state (either ps or its own internal state) to the initial state and returns zero.
- max
- Maximum number of bytes to check.
The macro constant MB_CUR_MAX defines the maximum number of bytes that can form a multibyte character under the current locale settings.
size_t is an unsigned integral type.
- ps
- Pointer to an mbstate_t object that holds the conversion state. If this is a null pointer, the function uses its own internal state.
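The parameters can be illustrated with a short sketch. Both helper names below (first_char_size and reset_state) are hypothetical; the sketch assumes s is a null-terminated string and that the state object starts out zero-initialized.

```c
#include <stdlib.h>   /* MB_CUR_MAX */
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: size in bytes of the first multibyte character
   of the null-terminated string s, examining at most MB_CUR_MAX bytes. */
size_t first_char_size (const char* s)
{
  mbstate_t state;
  memset (&state, 0, sizeof state);   /* initial conversion state */
  return mbrlen (s, MB_CUR_MAX, &state);
}

/* Hypothetical helper: resets *ps through a null pmb (max is ignored)
   and reports whether *ps then describes the initial state. */
int reset_state (mbstate_t* ps)
{
  mbrlen (NULL, 0, ps);
  return mbsinit (ps) != 0;
}
```

Passing MB_CUR_MAX as max is enough to cover any single character in the current locale, which is why it is a common choice when the remaining length of the string is not tracked separately.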
Return Value
If pmb points to a null character, or if pmb is a null pointer, the function returns zero.
Otherwise, if at most max bytes pointed to by pmb form a valid multibyte character, the function returns the size in bytes of that multibyte character.
Otherwise, if the bytes examined cannot form a valid multibyte character, the function returns (size_t)-1 and sets errno to EILSEQ.
Otherwise, if the max bytes form an incomplete (but potentially valid) multibyte character, the function returns (size_t)-2.
Note that size_t is an unsigned integral type, so none of the returned values is less than zero: (size_t)-1 and (size_t)-2 are the two largest values of the type.
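Because the two error values are very large unsigned numbers, they should be tested before the result is used as a length. The following sketch maps each of the four outcomes to a label; the enum, its constants and the function classify_mb are hypothetical names introduced for illustration.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical labels for the four possible outcomes of mbrlen */
enum len_result { LEN_NULL_CHAR, LEN_COMPLETE, LEN_INVALID, LEN_INCOMPLETE };

/* Hypothetical helper: examines at most max bytes at s, starting from
   the initial state. On LEN_COMPLETE, stores the character size in *size. */
enum len_result classify_mb (const char* s, size_t max, size_t* size)
{
  mbstate_t state;
  size_t len;

  memset (&state, 0, sizeof state);   /* initial conversion state */
  len = mbrlen (s, max, &state);

  /* check the error values first: as unsigned values they would
     otherwise look like huge lengths */
  if (len == (size_t)-1) return LEN_INVALID;      /* errno is EILSEQ */
  if (len == (size_t)-2) return LEN_INCOMPLETE;   /* more bytes needed */
  if (len == 0) return LEN_NULL_CHAR;

  *size = len;
  return LEN_COMPLETE;
}
```

A caller reading input in chunks could treat LEN_INCOMPLETE as a request for more bytes, whereas LEN_INVALID indicates that the data does not match the encoding of the current locale.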
Example
/* mbrlen example */
#include <stdio.h>
#include <string.h>
#include <wchar.h>

void splitmb (const char* pt, size_t max)
{
  size_t length;
  size_t i;
  mbstate_t mbs;

  memset (&mbs, 0, sizeof(mbs)); /* set the initial conversion state */

  while (max>0) {
    length = mbrlen (pt, max, &mbs);
    /* stop at the null character; (size_t)-1 and (size_t)-2 also exceed max */
    if ((length==0)||(length>max)) break;
    putchar ('[');
    for (i=0; i<length; ++i) putchar (*pt++);
    putchar (']');
    max-=length;
  }
}

int main()
{
  const char str [] = "test string";
  splitmb (str,sizeof(str));
  return 0;
}
The function splitmb splits a multibyte sequence into the groups of bytes that form each character.
The example uses a trivial string in the "C" locale, but the function works equally with locales that support multibyte strings.
Output:
[t][e][s][t][ ][s][t][r][i][n][g]
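To see groups longer than one byte, a multibyte locale has to be selected first. The sketch below is a variation on the example under stated assumptions: the locale names "C.UTF-8" and "en_US.UTF-8" are platform-specific guesses and may not exist on a given system, and the helpers use_utf8_locale and mb_sizes are hypothetical.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: tries to select a UTF-8 locale for LC_CTYPE.
   The locale names are assumptions; returns 1 on success, 0 otherwise. */
int use_utf8_locale (void)
{
  return setlocale (LC_CTYPE, "C.UTF-8") != NULL
      || setlocale (LC_CTYPE, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: stores the byte size of each character of the
   null-terminated string s in sizes (at most cap entries) and returns
   the number of entries written. Stops early on an invalid sequence. */
size_t mb_sizes (const char* s, size_t* sizes, size_t cap)
{
  mbstate_t state;
  size_t remaining = strlen (s);
  size_t count = 0;

  memset (&state, 0, sizeof state);   /* initial conversion state */

  while (remaining > 0 && count < cap) {
    size_t len = mbrlen (s, remaining, &state);
    if (len == 0 || len > remaining) break;   /* null, (size_t)-1 or (size_t)-2 */
    sizes[count++] = len;
    s += len;
    remaining -= len;
  }
  return count;
}
```

If use_utf8_locale succeeds, a string such as "\xC3\xA9" (the UTF-8 encoding of U+00E9) is reported as a single character of size 2. ASCII text still yields one byte per character in either locale.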
See also
- mbrtowc
- Convert multibyte sequence to wide character (function)
- wcrtomb
- Convert wide character to multibyte sequence (function)
- mbsrtowcs
- Convert multibyte string to wide-character string (function)
- wcsrtombs
- Convert wide-character string to multibyte string (function)