stdcxx-dev mailing list archives

From Martin Sebor
Subject Re: low hanging fruit while cleaning up test failures
Date Thu, 31 Jan 2008 06:04:26 GMT

Travis Vitek wrote:
>> FYI, the setlocale() names can be really long on some platforms (e.g.,
>> on HP-UX, they always take the form:
>> /<category>/<category>/<category>/<category>/<category>/<category>
>> so 64 characters may not be enough for all locales).
> I thought this was only the case when using setlocale (LC_ALL, ...)

Definitely for LC_ALL.

> because the OS needs to return a string that indicates which locales are
> used for each category so that the return of setlocale (LC_ALL, 0) can
> be used to restore the locale to a previous state. I believe that this
> is the way that it works on HP and AIX.

Looks like you're right (at least on HP-UX where I confirmed it).
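
For illustration, here is a minimal sketch of the save-and-restore idiom described above. The class name LocaleGuard is hypothetical and not part of the test driver; the only library call it relies on is the standard setlocale(). It copies the result into a std::string rather than a fixed-size buffer because, as noted earlier, the combined LC_ALL name can be long.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper (not part of stdcxx): remembers the complete
// locale state on construction and restores it on destruction.
class LocaleGuard {
    std::string saved_;

public:
    LocaleGuard()
    {
        // A null locale argument queries the current setting. For LC_ALL
        // the result describes every category, e.g. "/A/B/C/D/E/F" on
        // HP-UX when the categories differ.
        const char* const current = std::setlocale(LC_ALL, nullptr);

        // Copy the name: a later setlocale() call may overwrite the
        // storage the returned pointer refers to.
        if (current)
            saved_ = current;
    }

    ~LocaleGuard()
    {
        if (!saved_.empty())
            std::setlocale(LC_ALL, saved_.c_str());
    }

    LocaleGuard(const LocaleGuard&) = delete;
    LocaleGuard& operator=(const LocaleGuard&) = delete;

    // The name captured by the constructor.
    const std::string& saved() const { return saved_; }
};
```

A test would create a LocaleGuard at the top of a scope, change the locale freely inside it, and rely on the destructor to put the previous state back.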

> So here is the thing that I am concerned about. The previous code
> allowed you to specify which locale facet you wanted to get locales for.

You mean category (the LC_XXX thing).

> I didn't understand how or why this was useful. I believe that I do now.
> Say a call to setlocale (LC_ALL, "X") returns the string "/A/B/C/D/E/F".
> If I just capture the result of setlocale (LC_CTYPE, 0), I'm not going
> to see that the other facets are set differently. I need to store the
> result of locale -a, or the names of the locales used by each of the
> components.

I'm not sure any tests use the category argument, but I think the
main point was to eliminate locales that appear to work as a whole
but fail for some individual categories. E.g., setlocale(LC_ALL, "zh_CN")
might return non-NULL while setlocale(LC_MESSAGES, "zh_CN") returns
NULL. I think we've had this happen.
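
A rough sketch of such a filter, assuming it is enough to try the name with LC_ALL and then with each category in turn. The function name usable_in_every_category is made up for this example; LC_MESSAGES is a POSIX extension, so it is only tried where the macro is defined.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns true only if `name` is accepted by
// setlocale() for LC_ALL and for every individual category.
bool usable_in_every_category(const char* name)
{
    static const int categories[] = {
        LC_COLLATE, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME
#ifdef LC_MESSAGES          // POSIX extension, not part of ISO C
        , LC_MESSAGES
#endif
    };

    // Remember the current state so that probing has no lasting effect.
    const char* const current = std::setlocale(LC_ALL, nullptr);
    const std::string saved(current ? current : "C");

    bool ok = std::setlocale(LC_ALL, name) != nullptr;

    for (int category : categories) {
        if (!ok)
            break;
        ok = std::setlocale(category, name) != nullptr;
    }

    std::setlocale(LC_ALL, saved.c_str());
    return ok;
}
```

Whether a given platform ever rejects a single category after accepting LC_ALL depends on its locale databases, so this check is only as strict as the underlying setlocale().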

> This is similar to what I started out with. The only disadvantage is
> that it doesn't allow you to prioritize one attribute over another.
> Nobody ever said we cared about order, but I assumed that we would.

As we discussed (and so just for the record), the "prioritization"
within, say, the MB_CUR_MAX field would be useful, but within the
locale name probably less so.

>> Internally it would translate into multiple grep-like expressions
>> (i.e., arguments to the -e grep option) looking like this:
>>   *_JP.* 3\n
>>   *_JP.* 4\n
>>   *_CN.* 3\n
>>   *_CN.* 4\n
> Yes, this would work fine provided that you didn't ever want to get all
> 4 byte encodings before the 3 byte encodings. I like the syntax much
> better though.

It would be nice to be able to specify the ordering somehow. Again,
for the record, the approach we discussed was to specify the order
using a second argument, say something like this:

     rw_locale_query("*_{JP,CN}.* {3,4}", "2d");

where "2d" means: sort the second brace field in descending order,
i.e., 4 before 3. No order is specified for the first field, so the
function would be free to expand the query internally into any one
of the following grep-like expressions:

     "*_JP.* 4\n*_CN.* 4\n*_JP.* 3\n*_CN.* 3\n"
     "*_CN.* 4\n*_JP.* 4\n*_JP.* 3\n*_CN.* 3\n"
     "*_JP.* 4\n*_CN.* 4\n*_CN.* 3\n*_JP.* 3\n"
     "*_CN.* 4\n*_JP.* 4\n*_CN.* 3\n*_JP.* 3\n"

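The expansion could be sketched roughly as follows. This is not the implementation of rw_locale_query(); expand_query is a hypothetical stand-in, and it makes several assumptions of its own: fields are numbered with a single digit starting at 1, 'a' and 'd' select ascending and descending order, alternatives are compared as strings, and fields not named in the order argument keep the order in which they were written and vary fastest.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: expands "{a,b}" fields in `pattern` into
// newline-terminated lines. `order` is a sequence of
// <field number><'a' or 'd'> pairs such as "2d".
std::string expand_query(const std::string& pattern, const std::string& order)
{
    std::vector<std::string> literals(1);            // text around the fields
    std::vector<std::vector<std::string>> fields;    // alternatives per field

    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const std::size_t close =
            pattern[i] == '{' ? pattern.find('}', i) : std::string::npos;
        if (close == std::string::npos) {            // ordinary character
            literals.back() += pattern[i];
            continue;
        }
        std::vector<std::string> alts(1);
        for (std::size_t j = i + 1; j < close; ++j) {
            if (pattern[j] == ',')
                alts.emplace_back();
            else
                alts.back() += pattern[j];
        }
        fields.push_back(alts);
        literals.emplace_back();
        i = close;
    }

    // Fields named in `order` vary slowest, in the order they are named.
    std::vector<std::size_t> loops;
    std::vector<bool> named(fields.size(), false);
    for (std::size_t i = 0; i + 1 < order.size(); i += 2) {
        const std::size_t f = static_cast<std::size_t>(order[i] - '1');
        if (f >= fields.size() || named[f])
            continue;                                // ignore bad entries
        if (order[i + 1] == 'd')
            std::sort(fields[f].begin(), fields[f].end(),
                      std::greater<std::string>());
        else
            std::sort(fields[f].begin(), fields[f].end());
        named[f] = true;
        loops.push_back(f);
    }
    for (std::size_t f = 0; f < fields.size(); ++f)
        if (!named[f])
            loops.push_back(f);

    // Odometer over all combinations; loops.back() varies fastest.
    std::vector<std::size_t> pick(fields.size(), 0);
    std::string out;
    for (;;) {
        for (std::size_t f = 0; f < fields.size(); ++f)
            out += literals[f] + fields[f][pick[f]];
        out += literals.back() + '\n';

        std::size_t k = loops.size();
        while (k > 0) {
            const std::size_t f = loops[k - 1];
            if (++pick[f] < fields[f].size())
                break;
            pick[f] = 0;
            --k;
        }
        if (k == 0)
            break;
    }
    return out;
}
```

Under these assumptions, expand_query("*_{JP,CN}.* {3,4}", "2d") yields the first of the four expansions listed above, and an empty order argument yields the JP 3, JP 4, CN 3, CN 4 sequence from the earlier message.
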
>> with the whole thing basically being a simplified grep pattern that
>> could be used to search in a plain text file in this format:
>>   <locale> <mb-cur-max> <alias-list>
> BTW, I never did find a way to get an alias for a locale. If I have the
> name of a locale, how can I find the list of aliases?

I don't know of any programmatic way to get the list of known aliases,
but searching the filesystem for symlinks to the locale databases
should work. The aliases for our own locales and codesets are also
listed in the source files.
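
A sketch of the symlink search, assuming C++17 std::filesystem is available and that each locale database is a file or directory directly inside a single locale directory. Both the function name find_aliases and that layout are assumptions; where the databases live, and whether aliases are symlinks at all, is platform specific.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the sorted names of the symlinks directly
// inside `dir` that resolve to the same database as `dir / name`.
std::vector<std::string> find_aliases(const fs::path& dir,
                                      const std::string& name)
{
    std::vector<std::string> aliases;
    std::error_code ec;

    const fs::path target = fs::canonical(dir / name, ec);
    if (ec)
        return aliases;                  // no such locale database

    fs::directory_iterator it(dir, ec);
    const fs::directory_iterator end;

    for (; !ec && it != end; it.increment(ec)) {
        std::error_code entry_ec;

        // symlink_status() examines the link itself, not its target.
        if (!fs::is_symlink(it->symlink_status(entry_ec)))
            continue;

        // Dangling links fail to resolve and are skipped.
        const fs::path resolved = fs::canonical(it->path(), entry_ec);
        if (!entry_ec && resolved == target)
            aliases.push_back(it->path().filename().string());
    }

    std::sort(aliases.begin(), aliases.end());
    return aliases;
}
```

Only one directory level is searched, so aliases that live elsewhere on the filesystem would need a recursive walk.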

