couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: View keys case-insensitive?
Date Thu, 09 Apr 2009 17:00:13 GMT
User-selectable collation settings (case sensitivity, accent sensitivity,
locale, etc.) should be an option for views, if anyone wants to take that on.

On Apr 9, 2009, at 12:49 PM, Paul Davis wrote:

> Oddly enough, this is expected behavior:
>
>    values.push("a");
>    values.push("A");
>    values.push("aa");
>    values.push("b");
>    values.push("B");
>    values.push("ba");
>    values.push("bb");
>
> Even fiddling with the ICU collation options I couldn't get it to sort
> any differently.

Did you recreate the indexes from scratch? Otherwise they'll still be  
sorted with the old collation.

-Damien


> I'm not sure if there's an explanation that I'm
> missing or what but it sure seems like "aa" should come before "A" for
> case sensitive sorting. Unless of course it's doing something dumb like
> sorting right to left in which case "a" > null.
>
> No idea.
>
> Paul Davis
>
> Index: src/couchdb/couch_erl_driver.c
> ===================================================================
> --- src/couchdb/couch_erl_driver.c      (revision 762581)
> +++ src/couchdb/couch_erl_driver.c      (working copy)
> @@ -22,6 +22,8 @@
> #define U_DISABLE_RENAMING 1
> #endif
>
> +#include <stdio.h>
> +
> #include "erl_driver.h"
> #include "unicode/ucol.h"
> #include "unicode/ucasemap.h"
> @@ -63,13 +65,25 @@
>         return ERL_DRV_ERROR_GENERAL;
>     }
>
> +    ucol_setAttribute(pData->coll, UCOL_CASE_FIRST, UCOL_LOWER_FIRST, &status);
> +    if(U_FAILURE(status)) {
> +        couch_drv_stop((ErlDrvData)pData);
> +        return ERL_DRV_ERROR_GENERAL;
> +    }
> +
> +    ucol_setAttribute(pData->coll, UCOL_CASE_LEVEL, UCOL_ON, &status);
> +    if(U_FAILURE(status)) {
> +        couch_drv_stop((ErlDrvData)pData);
> +        return ERL_DRV_ERROR_GENERAL;
> +    }
> +
>     pData->collNoCase = ucol_open("", &status);
>     if (U_FAILURE(status)) {
>         couch_drv_stop((ErlDrvData)pData);
>         return ERL_DRV_ERROR_GENERAL;
>     }
>
> On Thu, Apr 9, 2009 at 6:53 AM, Brian Candler <B.Candler@pobox.com> wrote:
>> I was very surprised to find that view keys seem to be case-insensitive
>> when using startkey and endkey:
>>
>> $ curl -X POST -d '{"map":"function(doc) { emit(doc.foo, null); }"}' 'http://127.0.0.1:5984/test_suite_db/_temp_view?startkey="a"&endkey="az"'
>> {"total_rows":26,"offset":7,"rows":[
>> {"id":"7","key":"a","value":null},
>> {"id":"8","key":"A","value":null},    <<<< huh?!
>> {"id":"9","key":"aa","value":null}
>> ]}
>>
>> But not when fetching them individually:
>>
>> $ curl -X POST -d '{"map":"function(doc) { emit(doc.foo, null); }"}' 'http://127.0.0.1:5984/test_suite_db/_temp_view?key="a"'
>> {"total_rows":26,"offset":7,"rows":[
>> {"id":"7","key":"a","value":null}
>> ]}
>> $ curl -X POST -d '{"map":"function(doc) { emit(doc.foo, null); }"}' 'http://127.0.0.1:5984/test_suite_db/_temp_view?key="A"'
>> {"total_rows":26,"offset":8,"rows":[
>> {"id":"8","key":"A","value":null}
>> ]}
>>
>> (Ditto for startkey="a"&endkey="a", or startkey="A"&endkey="A")
>>
>> At http://wiki.apache.org/couchdb/View_collation it says that view keys are
>> case-sensitive, which normally means that "A" does not appear in the range
>> "a" to "aa". And with normal ASCII ordering I would expect "A" to sort
>> before "a", as is the case with JavaScript:
>>
>> js> "a" < "A"
>> false
>>
>> Could someone please explain to me what's going on? This may also explain my
>> recent report COUCHDB-324 where tilde does not collate where I'd expect.
>>
>> I am running a recent SVN build:
>> {"couchdb":"Welcome","version":"0.9.0a762247"}
>>
>> Thanks,
>>
>> Brian.
>>
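
The ordering discussed in this thread ("a", "A", "aa", ...) is what a multi-level collation typically produces: strings are first compared by their letters alone, and case is consulted only to break ties between otherwise identical strings. The sketch below is a simplified illustration of that idea in JavaScript, not ICU's actual algorithm and not CouchDB's code; `compareKeys` and `inRange` are hypothetical helper names, and the sketch assumes ASCII-only keys.

```javascript
// Simplified two-level comparison (an illustration only, NOT ICU's algorithm).
// Assumes ASCII-only strings.
function compareKeys(x, y) {
  // Level 1: compare the letters alone, ignoring case.
  const fx = x.toLowerCase();
  const fy = y.toLowerCase();
  if (fx < fy) return -1;
  if (fx > fy) return 1;
  // Level 2: the letters are identical, so case breaks the tie.
  // Lowercase sorts before uppercase at the first differing position.
  for (let i = 0; i < x.length; i++) {
    if (x[i] !== y[i]) {
      return x[i] === x[i].toLowerCase() ? -1 : 1;
    }
  }
  return 0;
}

// Hypothetical stand-in for a startkey/endkey range check (inclusive).
function inRange(key, startkey, endkey) {
  return compareKeys(startkey, key) <= 0 && compareKeys(key, endkey) <= 0;
}

const values = ["bb", "B", "aa", "A", "ba", "b", "a"];

// Default Array sort compares UTF-16 code units, so uppercase comes first.
const byCodeUnit = [...values].sort();
// → ["A", "B", "a", "aa", "b", "ba", "bb"]

// Two-level comparison: letters first, case as a tie-breaker.
const byLevels = [...values].sort(compareKeys);
// → ["a", "A", "aa", "b", "B", "ba", "bb"]

// Range "a" to "az" under the two-level comparison includes "A".
const hits = byLevels.filter((k) => inRange(k, "a", "az"));
// → ["a", "A", "aa"]

// An exact-match lookup still tells "a" and "A" apart.
const sameKey = compareKeys("a", "A") === 0;
// → false

console.log(byCodeUnit, byLevels, hits, sameKey);
```

Under this model the keys are still case-sensitive, since "a" and "A" never compare as equal, yet "A" falls inside the range "a" to "az" because case is only considered after the letters have been compared. That is consistent with both the range results and the single-key results quoted above.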

