stdcxx-dev mailing list archives

From Travis Vitek <>
Subject Re: low hanging fruit while cleaning up test failures
Date Thu, 24 Jan 2008 00:15:56 GMT

Okay, I think I've finally got something that will be useful to someone. I'm
attaching the patch to STDCXX-608
for review.

There is a lot of code, and the feature is not 100% complete yet. I need to
create a test, deprecate the old rw_locales() function, and come up with a
way to locate the input files without actually requiring that the
environment variable TOPDIR be defined.

The system is fairly simple from the public interface. A new public type
rw_locale_entry_t has been added. It represents a link in a linked list
of installed locales. The new function rw_all_locales() gives you a pointer
to the first item in a sorted list of installed locales. Another new
function rw_locale_query() takes a query string and a count, and it returns
a pointer to the first entry in a linked list of locale entries that match
the provided query string. The count parameter is used to limit the number
of locales in the linked list.
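
To make the linked-list shape concrete, here is a minimal sketch of walking
such a list. The struct below is a hypothetical stand-in: the field names
`name` and `next` are assumptions for illustration, not the actual layout of
rw_locale_entry_t in the patch, and collect_names() is not part of the
proposed interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for rw_locale_entry_t. Only a name and a link
// are assumed here; the real type in the patch may look different.
struct locale_entry {
    const char*         name;   // locale name, e.g. "ja_JP.UTF-8"
    const locale_entry* next;   // next entry, or null at the end of the list
};

// Follow the links from head and collect at most max_count locale names.
std::vector<std::string>
collect_names (const locale_entry* head, std::size_t max_count)
{
    std::vector<std::string> names;

    for (const locale_entry* e = head;
         e && names.size () < max_count; e = e->next) {
        assert (e->name != nullptr);   // every entry is assumed to be named
        names.push_back (e->name);
    }

    return names;
}
```

A caller would pass whatever head pointer the query returned and iterate the
resulting names. The limit here only mirrors the idea of the count parameter.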

The query string allows you to specify what attributes you want to query,
what values you want those attributes to have, and what priority to give
those attributes relative to others. The supported attributes are language
[L], country [C], encoding [E], and mb_cur_len [M]. Multiple values for the
same attribute can be specified by separating them with a | character. You
can use a * as a match-anything wildcard. I just realized that it might
also be useful to exclude certain attribute values. If someone thinks this
would help, we could do that with a ! or ^.
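
As a rough illustration of the query syntax described above, the sketch below
splits a query string into terms. It assumes that terms are separated by
whitespace and that each term has the form X=value|value. The names
query_term and parse_query() are hypothetical and are not taken from the
patch, which may parse queries quite differently.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One "X=a|b" term of a query (hypothetical type, for illustration only).
struct query_term {
    char                     attr;     // 'L', 'C', 'E' or 'M'
    std::vector<std::string> values;   // alternatives, highest priority first
};

// Split a query such as "C=JP|CN M=4|3" into terms. The order of the
// terms and of the values within each term is preserved, since that
// order is what expresses priority.
std::vector<query_term> parse_query (const std::string &query)
{
    std::vector<query_term> terms;
    std::istringstream      in (query);
    std::string             word;

    while (in >> word) {
        // in test-driver code a malformed term is a bug in the caller
        assert (word.size () >= 3 && word [1] == '=');

        query_term term;
        term.attr = word [0];

        // split everything after "X=" on the '|' separator
        std::istringstream vals (word.substr (2));
        std::string        value;
        while (std::getline (vals, value, '|'))
            if (!value.empty ())
                term.values.push_back (value);

        terms.push_back (term);
    }

    return terms;
}
```

The * wildcard needs no special handling at this stage. It is kept as an
ordinary value and would be interpreted when entries are matched.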

As an example, imagine that you want to find up to 10 locales for Japan or
China that have MB_CUR_LEN of 4 or 3. You could get that list of locales
with the following query...

  const rw_locale_entry_t* e = rw_locale_query ("C=JP|CN M=4|3", 10);

That query will give you results in the order you ask for them. In this
example, all locales for Japan would be prioritized before those for China,
regardless of MB_CUR_LEN. Within each country, the 4-byte encodings would
come before the 3-byte encodings. On AIX the returned list of locales
should be something like...

  JA_JP            [4]
  JA_JP.UTF-8      [4]
  ja_JP            [3]
  ja_JP.IBM-eucJP  [3]
  ZH_CN            [4]
  Zh_CN            [4]
  Zh_CN.GB18030    [4]
  ZH_CN.UTF-8      [4]

If we switched the query to 

  const rw_locale_entry_t* e = rw_locale_query ("M=4|3 C=JP|CN", 10);

the results would be prioritized such that all 4-byte encodings for Japan
would be first, then all 4-byte encodings for China, followed by the 3-byte
encodings for each...

  JA_JP            [4]
  JA_JP.UTF-8      [4]
  ZH_CN            [4]
  Zh_CN            [4]
  Zh_CN.GB18030    [4]
  ZH_CN.UTF-8      [4]
  ja_JP            [3]
  ja_JP.IBM-eucJP  [3]
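
The two orderings above can be reproduced with a lexicographic sort on
per-term ranks. The following is only a sketch of that idea under stated
assumptions, not the code in the patch: the types entry and term, and the
functions rank_of() and order_matches(), are hypothetical, and only the C
and M attributes are modeled. It uses C++11 for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct entry {                  // hypothetical, simplified locale record
    std::string name;
    std::string country;        // matched by the 'C' attribute
    std::string width;          // MB_CUR_LEN as text, matched by 'M'
};

struct term {                   // one "X=a|b" term of a query
    char                     attr;
    std::vector<std::string> values;   // highest priority first
};

// Index of the first value in t that e matches, or -1 if none matches.
int rank_of (const entry &e, const term &t)
{
    assert (t.attr == 'C' || t.attr == 'M');   // only these are modeled
    const std::string &v = t.attr == 'C' ? e.country : e.width;

    for (std::size_t i = 0; i != t.values.size (); ++i)
        if (t.values [i] == "*" || t.values [i] == v)
            return int (i);
    return -1;
}

struct ranked { std::vector<int> key; std::string name; };

bool by_key (const ranked &a, const ranked &b) { return a.key < b.key; }

// Keep entries that match every term, order them by their ranks taken
// in term order (earlier terms dominate), and return up to max_count names.
std::vector<std::string>
order_matches (const std::vector<entry> &all,
               const std::vector<term> &terms, std::size_t max_count)
{
    std::vector<ranked> hits;

    for (std::size_t i = 0; i != all.size (); ++i) {
        ranked r;
        r.name = all [i].name;
        bool ok = true;
        for (std::size_t j = 0; ok && j != terms.size (); ++j) {
            const int k = rank_of (all [i], terms [j]);
            ok = k >= 0;
            r.key.push_back (k);
        }
        if (ok)
            hits.push_back (r);
    }

    // stable sort keeps the original relative order of equal-ranked entries
    std::stable_sort (hits.begin (), hits.end (), by_key);

    std::vector<std::string> names;
    for (std::size_t i = 0; i != hits.size () && i < max_count; ++i)
        names.push_back (hits [i].name);
    return names;
}
```

Because the key is built in term order, swapping the terms in the query
swaps which attribute dominates the ordering, which is the difference
between the two listings.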

So I hope that is enough to get us started and that I haven't been wandering
down the wrong road for days. Please post feedback so I know how everyone
feels about this. Even a 'do not care' post is better than nothing.
