couchdb-dev mailing list archives

From Antony Blakey <antony.bla...@gmail.com>
Subject Re: API suggestions
Date Mon, 29 Dec 2008 05:37:25 GMT

On 29/12/2008, at 3:26 PM, Chris Anderson wrote:

> I almost suggested giving an option for inclusive and exclusive
> interval ends, basically, < / > vs <= / >= control from the client.
> But then thinking about Maximillian's proposal (of defaulting to an
> exclusive right end) I began to wonder if offering *only* the
> interval-style he suggests, would satisfy both precision maths, and
> newbie expectations.

My concern right now is prefix searching, e.g. paging through

startkey='rs' endkey='rs\uFFF8'

It would be good to have a prefix-test mode that would be applicable
to the 'final' string component of a key, à la SQL's "LIKE 'rs%'". This
would eliminate the need for the 'rs\uFFF8' hack.
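
To make the hack concrete, here is a minimal Python sketch of how a client might build the query string for that range today. The helper name prefix_query and its sentinel parameter are made up for illustration; the only thing assumed about CouchDB is that startkey and endkey are passed as JSON-encoded values.

```python
import json
from urllib.parse import urlencode


def prefix_query(prefix, sentinel="\ufff8"):
    """Build view query parameters for the sentinel-based prefix range.

    Hypothetical helper: the client appends a high code point to the
    prefix and uses the result as endkey.
    """
    # View keys are sent as JSON-encoded values in the query string.
    return urlencode({
        "startkey": json.dumps(prefix),
        "endkey": json.dumps(prefix + sentinel),
    })


# Append the result to a view URL of your own.
print(prefix_query("rs"))
```

The client has to know a "high enough" sentinel for the collation in use, which is exactly the knowledge a server-side prefix mode would make unnecessary.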

Something like endkey_succ=<key> which would be equivalent to a
non-inclusive endkey=succ(<key>) where succ(x) is the first key value wrt
the view collation algorithm that wouldn't satisfy x <= <key>. The
essential characteristic being that succ(x) doesn't need to be
calculated by the client.

I'm not suggesting endkey_succ as the syntactic mechanism.
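
As a rough model of the difference, the following Python sketch compares the sentinel range with a prefix test over an in-memory sorted list. It assumes plain code point ordering (Python's default string comparison), not ICU collation, and the function names are invented for the example.

```python
from bisect import bisect_left


def prefix_range(sorted_keys, prefix):
    """Return every key that starts with prefix.

    Under code point ordering, keys sharing a prefix are contiguous and
    begin at the first key >= prefix, so no end key has to be computed.
    """
    lo = bisect_left(sorted_keys, prefix)
    hi = lo
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


def sentinel_range(sorted_keys, prefix, sentinel="\ufff8"):
    """Model the current hack: an inclusive range ending at prefix + sentinel."""
    end = prefix + sentinel
    return [k for k in sorted_keys if prefix <= k <= end]


keys = sorted(["rr", "rs", "rsa", "rs\ufffd", "rt"])

# The prefix test finds all three 'rs' keys; the sentinel range misses
# 'rs\ufffd' because U+FFFD sorts after U+FFF8.
assert prefix_range(keys, "rs") == ["rs", "rsa", "rs\ufffd"]
assert sentinel_range(keys, "rs") == ["rs", "rsa"]
```

A server-side implementation would do the equivalent against the view's own collation, which is the part a client cannot easily reproduce.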

> In my opinion the ICU collation
> driver is configured sanely, and I feel comfortable delegating to ICU.
> It's a good library for our cause. I would absolutely love to see test
> cases that indicated where CouchDB can improve on this front.

I'd like to be able to turn on normalization for all sorting. I could
normalise all documents, and all key values, but given that CouchDB
has ICU, this would be a lot more convenient and reliable if it was a
server-provided feature.
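
For comparison, the client-side alternative looks something like this Python sketch, which uses the standard unicodedata module. The helper normalize_key and the choice of NFC are assumptions for illustration; every writer and every query would have to apply the same step.

```python
import unicodedata


def normalize_key(key, form="NFC"):
    """Normalize strings in a view key, recursing into array keys.

    Hypothetical helper: non-string components (numbers, None, booleans)
    are returned unchanged.
    """
    if isinstance(key, str):
        return unicodedata.normalize(form, key)
    if isinstance(key, list):
        return [normalize_key(k, form) for k in key]
    return key


composed = "caf\u00e9"      # e with acute as a single code point
decomposed = "cafe\u0301"   # e followed by a combining acute accent

# The two spellings differ as code point sequences but are the same
# text once normalized.
assert composed != decomposed
assert normalize_key(composed) == normalize_key(decomposed)
assert normalize_key(["x", decomposed, 1]) == ["x", composed, 1]
```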

I imagine some might like to enable correct ordering of French
accents: http://unicode.org/reports/tr10/#French_Accents, which is a
specific instance of a linguistic tailoring as described here:
http://unicode.org/reports/tr10/#Linguistic_Features

I suggest that a couch instance and/or an individual db might
want to specify a unicode locale from e.g. http://unicode.org/cldr/

> There's been a suggestion of raw Unicode code point ordering as a
> collation configuration parameter, specifiable in design docs.

That's not valid unicode. I think it's a bad idea.

> Maybe
> the next logical step is a configuration member, for design docs,
> which could optionally specify the ICU configuration.

Specified in a hierarchic manner: system / db. I hesitate to include  
'view' because there are a number of view-like things that don't have  
configuration (_all_docs), and for completeness you would then want to  
deal with propagating a particular configuration through all of the  
design-doc-driven facilities. IMO, just the system & db would be enough.
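
A minimal sketch of that two-level lookup, in Python. The setting names ("locale", "normalization") and the default values are purely hypothetical; nothing like this exists in CouchDB's configuration as far as this message is concerned.

```python
# Hypothetical built-in defaults used when neither level sets a value.
DEFAULTS = {"locale": "root", "normalization": False}


def effective_collation(system_cfg=None, db_cfg=None):
    """Resolve collation settings: db overrides system, system overrides defaults."""
    merged = dict(DEFAULTS)
    merged.update(system_cfg or {})
    merged.update(db_cfg or {})
    return merged


system = {"normalization": True}
french_db = {"locale": "fr"}

# A db without its own settings inherits the system-wide ones.
assert effective_collation(system) == {"locale": "root", "normalization": True}
# A db-level locale overrides the default, keeping the system normalization.
assert effective_collation(system, french_db) == {"locale": "fr", "normalization": True}
```

With only two levels, every view-like facility in a db (including _all_docs) sees the same resolved settings, so there is nothing to propagate per design doc.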

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A Man may make a Remark –
In itself – a quiet thing
That may furnish the Fuse unto a Spark
In dormant nature – lain –

Let us divide – with skill –
Let us discourse – with care –
Powder exists in Charcoal –
Before it exists in Fire –

   -– Emily Dickinson 913 (1865)


