couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From J Chris Anderson <jch...@gmail.com>
Subject Re: Views - Finding the column count for a given "table"
Date Tue, 13 Apr 2010 17:28:21 GMT

On Apr 13, 2010, at 10:21 AM, Chris Stockton wrote:

> Hello,
> 
> We have created a data-store for users, one big difference with this
> table is that a specific document may not have the column defined in
> its "data" object. This means as users add new columns we do not need
> to back fill all the rows of the "table" with empty data. Couch is
> working very well for us for this reason but I have become stumbled
> with getting the counts of rows which contain columns for a given
> "table".
> 
> Given a set of documents that look like:
> {"table":"table1","data":{"a":1,"b":1,"c":1}} -> Table row, with a
> column A, B and C. This is the tables defined full schema.
> {"table":"table1","data":{"a":1,"c":1}} -> Here the user never put data into B.
> {"table":"table1","data":{"a":1}}
> {"table":"table1","data":{"c":1}}  -> Table row, with only one column,
> C, the other 2 columns are NULL.
> {"table":"table2","data":{"i":1}}
> {"table":"table2","data":{"i":1,"o":1,"p":1}}
> {"table":"table2","data":{"i":1,"o":1,"p":1}}
> 
> I need to turn them into a view which gives counts of occurrence
> grouped by table and each data key, So I may look them up giving them
> the table & column. Here is what I have currently:
> {"rows":[
> {"key":["table1","a"],"value":3},
> {"key":["table1","b"],"value":1},
> {"key":["table1","c"],"value":3},
> {"key":["table2","i"],"value":3},
> {"key":["table2","o"],"value":2},
> {"key":["table2","p"],"value":2}
> ]}
> 
> The following MapReduce works fine on the above data set:
> function(doc) {
>  for(key in doc.data) {
>    emit([doc.table, key], null);
>  }
> }
> 
> function(keys, values, rereduce) {
>  if(rereduce) {
>    return sum(values);
>  } else {
>    return values.length;
>  }
> }

This looks right to me.

You might try replacing the reduce function with the simple text:

_count

which will do the equivalent reduce in Erlang (which will be much faster). Can you see if
that helps?

> 
> However, when I put this view into tables with hundreds of rows I
> start getting unexpected (well maybe expected, but not understood)
> results. Mostly I just end up with many columns (I.E. a, b, c) being
> the MAX count of all rows in the table. Any help getting my head
> around thinking about this in the correct way would be much
> appreciated.
> 
> -Chris


Mime
View raw message