couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Adam Kocoloski <>
Subject Re: [DISCUSS] Built-in reduce indexes
Date Thu, 25 Apr 2019 16:04:58 GMT
Hiya Garren, thanks for starting this one. A few comments inline:

> On Apr 23, 2019, at 11:38 AM, Garren Smith <> wrote:
> Hi All,
> Following on from the map discussion, I want to start the discussion on
> built-in reduce indexes.
> ## Builtin reduces
> Builtin reduces are definitely the easier of the two reduce options to
> reason about and design. The one factor to keep in mind for reduces is that
> we need to be able reduce at different group levels. So a data model for
> that would like this:
> <group_level>, <group_key>, <_reduce_function_name>} -> <aggregrate_value>}

- I think this should be <view_signature>, not ?VIEW_SIGNATURE — you’re intending
for the actual checksum to go in that element of the tuple, right? I wonder about using a
Directory in that element of the tuple; adding the full signature to every key will blow up
the index substantially.

- I don’t think the second ?VIEWS is needed — you already scoped to ?VIEWS above it.

- Why is the <_reduce_function_name> after the <group_key>? If someone defines
multiple reduce functions on a single view and then wants to do a range query on the grouped
result this will mean reading and discarding the output from the other reduce function. It
seems to me that it would make more sense to put this directly after ?REDUCE.

- You could consider having specific constants for each builtin reduce function, e.g. ?SUM,
instead of the full “_sum” byte string in <_reduce_function_name>, which would save
several bytes per key.

> Most of that is similar to the map data model, where it changes is from the
> ?REDUCE subspace, we add the group_level (from 1 -> number of keys emitted
> in the map function), then the group key used in the reduce, the reduce
> function name e.g _sum, _count and then we store the aggregated value as
> the FDB value.

I think we need a better definition of <group_level> here. It’s not really the number
of keys emitted from the map function, it’s the number of elements of an array key emitted
by the map function that are used to determine the degree of aggregation.

Also, before we even get to group_level we should describe how reduce functions work with
keys that are not arrays. In the current API this is group=true (separate aggregations for
each set of KV pairs that share the same key) or group=false (one aggregated result for the
entire view). Internally in the current codebase we map these two settings to group_level=exact
and group_level=0, respectively. Basically, we need a way to represent “exact” in the

> ### Index management
> To update the reduce indexes, we will rely on the `id_index` and the
> `update_seq` defined in the map discussion. Then to apply changes,  we
> calculate the change of an aggregate value for the keys at the highest
> group level, then apply that change to all the group levels lower than it
> using fdb’s atomic operations [1].

This is a really important point and worth calling out more explicitly. One reason most of
the builtins are significantly simpler than the custom functions is because we know exactly
how the aggregate values change when we get the map output for each updated doc in isolation.
We don’t need to go retrieve all the rest of the map output that happens to share the same
map key. Moreover, the change to the aggregate value can be applied using atomic operations
so we don’t need to do anything special to avoid txn conflicts for views where lots of documents
emit the same map key.

> ### Reducer functions
> The FDB’s atomic functions support all the built in reduce functions
> CouchDB supports. So we can use those as part of our red function. For the
> `_stats` reduce function, we will have to split that across multiple key
> values. So its data model will have an extra key in it to record what stat
> it is for the _stats reducer:
> <group_level>, <group_key>, <_reduce_function_name>, <_stat_field>}
> <aggregrate_value>}
> We do have some problems, with `_approx_count_distinct`  because it does
> not support removing keys from the filter. So we have three options:
> 1. We can ignore key removal entirely in the filter since this is just an
> estimate
> 2.  Implement a real COUNT DISTINCT function, we can do because we’re not
> trying to merge results from different local shards
> 3. Don’t support it going forward

There’s actually a fourth option here, which is to maintain _approx_count_distinct using
the future machinery for custom reduce functions, generating a new filter from the raw map
output at the lowest level when keys are removed. I don’t have an informed opinion yet about
that approach.

> ### Group_level=0
> One tricker situation is if a user does a group_level=0 query with a key
> range, this would require us to do some client level aggregation. We would
> have to get the aggregate values for a `group_level=1` for the supplied key
> range and then aggregate those values together.

This isn’t unique to group_level=0, it could happen for any group_level!=exact. It can also
require descending more than one group_level depending on the precision of the supplied key
range; e.g. for a date-based key like [YYYY, MM, DD]; someone could ask for yearly aggregations
with group_level=1 but supply a startkey and endkey with specific dates rather than months.

Good start! Cheers, Adam

View raw message