how to get locale based digit groups separator (ansi c)

I've tried searching for this information on the web but keep getting directed to either C# or C++ solutions, or to ones that rely on sprintf and its ' flag. But as noted for that flag at https://man7.org/linux/man-pages/man3/printf.3.html

...many versions of gcc(1) cannot parse this option and will issue a warning.


So using the "standard printf" is not necessarily an option. Also, since I'm making my own custom (library-specific) printf handler that is more flexible than the standard one, I would like to identify this character without libc or the mcrt. Instead I want to use the system-provided APIs to get the character and the number of digits per group.

I'm also looking to implement the 'I' flag, so I would appreciate advice on that one too. I've pretty much implemented everything else (bar 1 or 2 bugs I'm hunting down atm), including support for custom flags, modifiers and specifiers, plus a couple of extra flags for things I wanted: center alignment, " & ' prefixes/suffixes with the appropriate U/u/L/u8 applied to them, ULL/LL etc. suffixes, and a custom drop-in character for characters that couldn't be converted from X to Y (everything gets printed to UTF-8 first to simplify support of custom specifiers etc.).

Any help is appreciated.
have you played with std::numpunct ?
If what you're asking is how to determine the group separator character in plain C, the answer is `localeconv`.
To rephrase the cppreference example at https://en.cppreference.com/w/c/locale/lconv
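#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* setlocale returns NULL if the locale is not installed; "C" then stays active. */
    if (setlocale(LC_ALL, "en_US.UTF-8") == NULL)
        fputs("en_US.UTF-8 is not available\n", stderr);

    struct lconv *lc = localeconv();
    /* thousands_sep is for ordinary numbers; mon_thousands_sep is the monetary one. */
    printf("Digit separator is: %s\n", lc->thousands_sep);
    return 0;
}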
(prints comma with en_US and dot with de_DE for me)
What part of ansi c indicates to you I'm looking for a c++ answer? Anyways I was just about to say I'd found what I was looking for, not exactly easy to find though. This is basically the direction I'm going with it:

typedef struct _PAWLOC const PAWLOC;

typedef struct _PAWLOC_DIG
{
	pawd	group_num;
	pawmbc	group_sep;
	pawmbc	float_dot;
	pawhhc	group_str[sizeof(pawsd)];
} PAWLOC_DIG;

typedef struct _PAWLOC_SIG
{
	pawmbc	display_with;
	bool	precedes_val;
	pawhhs	precedes_str;
	bool	spaceout_val;
	pawhhs	spaceout_str;
	pawd	location_val;
	pawhhc	location_str[sizeof(pawsd)];
} PAWLOC_SIG;

typedef struct _PAWLOC_VAL
{
	pawhhs	title;
	pawhhs	value;
} PAWLOC_VAL;

typedef enum _PAWLOC_E_VALUE
{
	PAWLOC_E_VAL_POS_PRECEDES = 0,
	PAWLOC_E_VAL_POS_SPACEOUT,
	PAWLOC_E_VAL_POS_LOCATION,
	PAWLOC_E_VAL_NEG_PRECEDES,
	PAWLOC_E_VAL_NEG_SPACEOUT,
	PAWLOC_E_VAL_NEG_LOCATION,
	PAWLOC_E_VAL_NUM_DOT, /* . character for normal representations */
	PAWLOC_E_VAL_NUM_SEP, /* , character for normal representations */
	PAWLOC_E_VAL_NUM_DIV, /* Divide digits into groups of X */
	PAWLOC_E_VAL_MON_DOT, /* . character for money representations */
	PAWLOC_E_VAL_MON_SEP, /* , character for money representations */
	PAWLOC_E_VAL_MON_DIV, /* Divide digits into groups of X */
	PAWLOC_E_VAL_MON_SYM, /* $ character for money representations */
	PAWLOC_E_VAL_COUNT
} PAWLOC_E_VALUE;

typedef struct PAWLOC_MON
{
	pawmbc	sym;
	pawd	req;
	pawhhc	str[sizeof(pawsd)];
} PAWLOC_MON;

struct _PAWLOC
{
	PAWLOC_VAL	val[PAWLOC_E_VAL_COUNT];
	PAWLOC_MON	world, local;
	PAWLOC_DIG	dig, mon;
	PAWLOC_SIG	pos, neg;
};

static struct _PAWLOC pawLocale = {NULL};

static void val2txt( pawhhc *dst, pawd val )
{
	pawd i;
	pawsu tmp = {0};

	pawu2s( &tmp, val, PAW_BASE10 );
	memset( dst, 0, sizeof(pawsd) * sizeof(pawhhc) );

	for ( i = 0; tmp.str[i]; dst[i] = tmp.str[i], ++i );
}

static void txt2mbc( pawmbc *mbc, paws txt )
	{ paws2hhs( *mbc, PAWMBC_MAX_LEN, txt, paws_length(txt) ); }

static pawhhs bool2txt( bool val )
	{ return val ? PAWHHC_C("true") : PAWHHC_C("false"); }

#include <locale.h>
PAW_API PAWLOC* pawSeekLocale(void)
{
	struct lconv *lc = localeconv();
	struct _PAWLOC *loc = &pawLocale;

	/* grouping is a sequence of group sizes; only the first (rightmost) is read here */
	loc->dig.group_num = *(lc->grouping);
	val2txt( loc->dig.group_str, loc->dig.group_num );
	txt2mbc( &(loc->dig.float_dot), lc->decimal_point );
	txt2mbc( &(loc->dig.group_sep), lc->thousands_sep );

	loc->local.req = lc->frac_digits;
	loc->world.req = lc->int_frac_digits;
	loc->mon.group_num = *(lc->mon_grouping);
	val2txt( loc->local.str, loc->local.req );
	val2txt( loc->world.str, loc->world.req );
	val2txt( loc->mon.group_str, loc->mon.group_num );
	txt2mbc( &(loc->local.sym), lc->currency_symbol );
	txt2mbc( &(loc->world.sym), lc->int_curr_symbol );
	txt2mbc( &(loc->mon.float_dot), lc->mon_decimal_point );
	txt2mbc( &(loc->mon.group_sep), lc->mon_thousands_sep );

	loc->pos.location_val = lc->p_sign_posn;
	loc->pos.precedes_val = lc->p_cs_precedes;
	loc->pos.spaceout_val = lc->p_sep_by_space;

	txt2mbc( &(loc->pos.display_with), lc->positive_sign );
	val2txt( loc->pos.location_str, loc->pos.location_val );
	loc->pos.precedes_str = bool2txt( loc->pos.precedes_val );
	loc->pos.spaceout_str = bool2txt( loc->pos.spaceout_val );

	loc->neg.location_val = lc->n_sign_posn;
	loc->neg.precedes_val = lc->n_cs_precedes;
	loc->neg.spaceout_val = lc->n_sep_by_space;

	txt2mbc( &(loc->neg.display_with), lc->negative_sign );
	val2txt( loc->neg.location_str, loc->neg.location_val );
	loc->neg.precedes_str = bool2txt( loc->neg.precedes_val );
	loc->neg.spaceout_str = bool2txt( loc->neg.spaceout_val );

	...

	return loc;
}


Edit: Cubbi's response wasn't visible when I posted, so to clarify: the opening question in this post was meant for jonnin. Btw, thanks Cubbi. Although it came too late for me to make use of, the solution you provided was the one I was looking for, at least for Linux & co. I'm not so sure about Windows. I'm also sure that in my search for the Linux variant I glimpsed some function that was different from localeconv().
Sorry about that, no excuse. Glad you found the C way.
Windows should honor your approach, though Microsoft has its own way of doing this as well.
If you remember the names of any of the functions, I'd appreciate you listing them; I can look up the details from there. The code in my post is locked into the Linux/Unix variants, and I haven't put any of the structs and enums into the headers yet, since I first need to compare the differences between systems and identify what I can and can't do with them.
The Win32 stuff uses these: GetNumberFormat, GetLocaleInfo, but I believe that is no longer the favored approach. I do not know the new ways ... all my GUI work is *still* in the older MFC style.
All I know about the newer ones is that they rely less on weird flags and constants and are relatively more OOP/modernish. I also don't know if you can do all this junk in pure C or not. Whenever I have to deal with C and a GUI, it's really a C++ program that has some C files, not a pure C program.

Microsoft has consistently and persistently tried to re-create the c++ language tools with its own screwball flavors of things. It has its own strings, its own boolean, and so on. I highly recommend you just use the C++ tools for both OS flavors, and ignore the less portable stuff apart from a drive by to appease your curiosity.

I also freely admit that the last time the number thing came up, I just did it manually. The input we accepted was number symbol fraction style, so it simply accepted 3.14 and 3,14 both as valid, and would reject 100,000.0 and 100.000,0 both due to having too many decimal point symbols. Not the most user friendly perhaps, but the context was for scientific folks and they don't favor the digit splitter symbols.
Thanks, I'll look into them later :) I've got the tab pinned just in case there was a response while I was working on something else. Atm I'm trying to track down the cause of a strange bug that triggers this breakpoint:

https://gitlab.com/awsdert/dragonbuilder/-/blob/main/src/libpaw/global/putf_str.c#L208

The strange part is that the same input is given as a test prior to this (I just added the line in my test file) and it works all hunky dory, but when it gets to the code that would make use of the result, it somehow "forgets" the length of the string and other details. I can't even see a point where it could forget that info between when it's set:

https://gitlab.com/awsdert/dragonbuilder/-/blob/main/src/libpaw/global/strbuf.c#L669

And when it's used:

https://gitlab.com/awsdert/dragonbuilder/-/blob/main/src/libpaw/global/strbuf.c#L690

Btw, I'm not expecting you to investigate it at all, but I won't complain if you happen to find & post the cause before I do. I'm gonna take a break soon for a series vid that's normally posted on YT, then I'll be getting on with some shopping and investigating a missed phone call that was supposed to happen but didn't (the HQ of the place that was supposed to call me is only a 5min walk from where I live). So even if you do feel inclined to investigate the cause, there's no need to feel rushed; I'll still look for it myself if I don't see any posts about it when I go back to it after my break and other tasks.

Edit: Btw, it's this attempt that goes awry:

https://gitlab.com/awsdert/dragonbuilder/-/blob/main/src/trypaw/trypaw.c#L112
I may have had an ABI issue. I did a rebuild during my attempts to find the issue's source and at some point the issue just vanished. Until I encounter it again I'll ignore it, as I have no means of reproducing it. I did change how the config is retrieved, though, so perhaps the pointer somehow got corrupted? Either way, the only point at which I saw the length change was one where it shouldn't, so I'm thinking it was an ABI issue.

Edit: There's no reason for me to keep the tab open, so I'm shutting it. Keep in mind that any further posts will take a while for me to reply to, since the server has to e-mail me about them.