couchdb-dev mailing list archives

From "Paul Joseph Davis (JIRA)" <>
Subject [jira] Updated: (COUCHDB-403) User-defined GroupRowsFun
Date Sat, 09 Oct 2010 19:52:43 GMT


Paul Joseph Davis updated COUCHDB-403:

    Skill Level: Committers Level (Medium to Hard)

> User-defined GroupRowsFun
> -------------------------
>                 Key: COUCHDB-403
>                 URL:
>             Project: CouchDB
>          Issue Type: Wish
>          Components: Database Core, HTTP Interface
>            Reporter: Brian Candler
>            Priority: Minor
> CouchDB has hard-coded functionality for grouping. From the user's point of view: group_level=N
> will truncate Array keys to the first N elements, and that's it. (*)
> It would be wonderful if application-specific grouping functions could be added. Useful
> examples include:
> * for string keys, truncate to the first N characters (e.g. group by first 3 letters
>   of surname)
> * for numeric keys, trunc(k/N) (e.g. divide by 100 would give you buckets of 0..99, 100..199,
>   200..299 etc)
> * combine with group_level: e.g. truncate array to first two elements plus the third
>   element divided by 100
>     ["string1","string2",Number,"rest"] => ["string1","string2",trunc(Number/100)]
> * for numeric keys: use trunc(log(V) * N) for exponential buckets
> * for hexadecimal-string keys: right-shift N places
> * ...etc
> In each case N would be a parameter chosen at query time, like group_level is now.
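The examples listed above can be sketched roughly as follows. This is a minimal Python illustration, not CouchDB code: the function names, the `(key, n)` signature, and the `group_rows` helper are all hypothetical, and the rows are assumed to arrive already sorted by key, as view rows do.

```python
import math
from itertools import groupby

# Hypothetical truncation functions; each takes a key and the query-time
# parameter n, and returns the group key.

def trunc_string(key, n):
    """String keys: keep the first n characters."""
    return key[:n]

def trunc_numeric(key, n):
    """Numeric keys: trunc(k/n), e.g. n=100 gives buckets 0..99, 100..199, ..."""
    return math.trunc(key / n)

def trunc_array_bucket(key, n):
    """Array keys: keep the first two elements, bucket the third by n."""
    return [key[0], key[1], math.trunc(key[2] / n)]

def group_rows(rows, trunc_fun, n, reduce_fun=sum):
    """Group consecutive (key, value) rows whose truncated keys are equal.

    Assumes rows are already sorted by key, so equal group keys are adjacent.
    """
    result = []
    for group_key, group in groupby(rows, key=lambda row: trunc_fun(row[0], n)):
        result.append((group_key, reduce_fun(value for _, value in group)))
    return result

rows = [(5, 1), (42, 1), (150, 1), (199, 1), (250, 1)]
print(group_rows(rows, trunc_numeric, 100))  # [(0, 2), (1, 2), (2, 1)]
```

Here `n` plays the role that N plays in the message: it is chosen per query rather than baked into the view.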
> It would be sufficient just to have a hook to statically link Erlang functions to do
> this. There would then need to be two new HTTP parameters: one to choose the grouping function
> and one for any arguments it needs.
> Theoretically this function could also be handed off to the external view server so the
> logic could be written in Javascript or whatever, but I think it would be too slow in practice.
> Note: group truncation functions would need to meet certain constraints to work
> with grouping logic. Something like:
>    K1 <= K2 implies grouptrunc(K1) <= grouptrunc(K2)
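One way to read this constraint: if the truncation function preserves order, rows that share a group key stay adjacent in the sorted view, so a single pass can group them. The following Python sketch checks the property on a sample of sorted keys. The helper and the two example functions are hypothetical, and Python's `<=` stands in for CouchDB's key collation, which is an assumption.

```python
import math

def is_order_preserving(trunc_fun, n, sorted_keys):
    """Check K1 <= K2 implies trunc_fun(K1) <= trunc_fun(K2) on sample keys.

    sorted_keys must be in ascending order; only adjacent pairs need checking
    because <= is transitive.
    """
    truncated = [trunc_fun(key, n) for key in sorted_keys]
    return all(a <= b for a, b in zip(truncated, truncated[1:]))

def bucket(key, n):
    """trunc(k/n): order-preserving."""
    return math.trunc(key / n)

def modulo(key, n):
    """k mod n: not order-preserving, so unsuitable as a grouping function."""
    return key % n

keys = [5, 99, 100, 250]
print(is_order_preserving(bucket, 100, keys))  # True
print(is_order_preserving(modulo, 100, keys))  # False
```

A passing check on sample keys does not prove the property for all keys; it only helps catch functions that clearly violate it.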
> (*) It's not implemented exactly like that. As far as I can see, there's one function
> to compare keys for equality by looking at the first N elements (GroupRowsFun), and another
> function truncates them when emitting them (RespFun). For adding bolt-on functions it would
> be more convenient just to define a single group key truncation function.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
