couchdb-user mailing list archives

From Paul Davis <>
Subject Re: Locale and rule based view collation
Date Sat, 25 Sep 2010 23:38:29 GMT
On Sat, Sep 25, 2010 at 7:21 PM, Chris Anderson <> wrote:
> On Sat, Sep 18, 2010 at 4:47 PM, Noah Diewald <> wrote:
>> I was wondering if there were any plans to make use of more of the ICU
>> collation API in CouchDB.
>> I'm using CouchDB to make natural language documentation software and
>> it seems like a shame that I might have to use ICU for creating sort
>> keys to get sort orders right for view keys in certain languages when
>> ICU is already used internally by CouchDB. It kind of looks like
>> something could be added in at about the same place as the option for
>> case or no case collations in couch_icu_driver.c but I feel under
>> qualified to play around with it. I think that having an option in the
>> view to specify collation customization would be really great and it
>> must be something that even people working with less obscure languages
>> than I am could benefit from.
> we definitely plan to make this configurable, just a matter of writing
> code. for now there might be a way to set it on a per-server-instance
> basis with environment variables. I am no expert on the topic, but I
> vaguely recall someone mentioning this possibility.
> Chris
>> --
>> Noah Diewald
> --
> Chris Anderson

I'm pretty sure that Chris is right that there's a server wide
environment setting that affects ICU collation, but I can't say with
any certainty.

It's always been on the to-do list to provide the ability to have
language based sorts that are defined at the view or database level,
but as Chris points out, no one's gotten around to doing that.
Currently the major issues would revolve around recoding the
icu_driver to have smarts in how it's created, as well as refactoring
how we access the driver.
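
Until that exists, the workaround Noah mentions (computing sort keys
yourself and emitting them as view keys) can be sketched roughly as
below. This is only an illustration, not ICU: the ALPHABET list and the
sort_key function are hypothetical names, and the hand-written ordering
(traditional Spanish, where "ch" and "ll" count as single letters) is a
stand-in for a real ICU rule string. It relies on CouchDB collating JSON
arrays element by element, so a list of integer weights can serve as the
key.

```python
# Hypothetical tailoring: traditional Spanish order, where "ch" sorts
# after "c", "ll" after "l", and "ñ" after "n". Stand-in for ICU rules.
ALPHABET = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l",
    "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w",
    "x", "y", "z",
]


def sort_key(word, alphabet=ALPHABET):
    """Return a list of integer weights that could be used as a view key."""
    weights = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        # Prefer the longest matching element, e.g. "ch" over "c".
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            if chunk in weights:
                key.append(weights[chunk])
                i += len(chunk)
                break
        else:
            # Characters outside the alphabet sort after every known letter.
            key.append(len(alphabet) + ord(text[i]))
            i += 1
    return key


words = ["cosa", "chico", "dama", "cama"]
ordered = sorted(words, key=sort_key)
```

A plain codepoint sort would put "chico" before "cosa"; with the keys
above it lands after "cosa", because "ch" is weighted as its own letter.
In a view you would emit something like [sort_key, word] so the original
string is still available next to the key.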

If we bumped our minimum Erlang VM version to R13, writing this as a
NIF would probably be orders of magnitude easier because of resource
types and whatnot.

Once those hard parts are figured out, exposing it to the outside
world should be as easy as going through the bikeshedding motions on
what the _design/doc syntax would look like.

Paul Davis
