incubator-couchdb-user mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: find all unique field names
Date Fri, 29 May 2009 01:09:34 GMT
On Thu, May 28, 2009 at 4:26 PM, Blair Nilsson <blair.nilsson@gmail.com> wrote:
> On Fri, May 29, 2009 at 10:25 AM, Blair Nilsson <blair.nilsson@gmail.com> wrote:
>> On Fri, May 29, 2009 at 9:20 AM, Douglas Fils <fils@iastate.edu> wrote:
>>> Forgive the noob question..  but I've not been able to easily locate an
>>> approach today to getting a return that gives all the unique field names in
>>> a couch database.
>>>
>>> It's not too hard to generate a map function that emits an array of the
>>> field names in a particular record....
>>> (please note this is about as much JS as I have ever written)  :)
>>> function(doc) {
>>>  var i = 0;
>>>  var keyNames = new Array();
>>>  for (var key in doc) {
>>>    keyNames[i] = key
>>>    i++;
>>>  }
>>>  emit(null,keyNames);
>>> }
>>>
>>> However, once I pass that over to the reduce (assuming this is even the way
>>> to do it) I don't see an easy way to get the unique intersection of the
>>> various field names.
>>>
>>> Any help would be appreciated...
>>> Thanks
>>> Doug
>>>
>>>
>>
>> maybe the map should be
>>
>> function(doc) {
>>  for (var key in doc) {
>>   emit(key,"")
>>  }
>> }
>>
>> and the reduce
>> function(keys,values) {
>>  return null;
>> }
>>
>> and just use the returned keys as the field names.
>>
>> --- Blair
>>
>
> Actually this can be a good demonstration on the reduce function.
>
> Say we were trying to solve a slightly more complicated version of this,
> one where we were trying to get the number of times the field name is
> used.
>
> We want the results by field name, so we use that as our key, and we
> use 1 as our value, since for each emit, we have 1 use of that field
> name.
>
> function(doc) {
>  for (var key in doc) {
>   emit(key,1)
>  }
> }
>
> which would give us
>
> address : 1
> city : 1
> city : 1
> city : 1
> name : 1
> name : 1
>
> etc...
>
> by putting in a reduce function, even if it didn't really do anything,
> the results will get stacked together by key...
>
> function(keys, values) {
>  return values;
> }
>
> would give us....
>
> address : [1]
> city : [1,1,1]
> name : [1,1]
>

Good examples overall, Blair. Thanks for the explanation. The one
nitpick I'm compelled to point out is that one should never have a
reduce function that just returns the values. The above list will end
up with:

address : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
...
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

and that's with only 100 documents. You can imagine how hard it will be
to manage this array when you are dealing with thousands or millions
of rows.
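
To make the size difference concrete, here is a rough sketch that runs in plain JavaScript outside CouchDB. The names stackingReduce and summingReduce and the 100-document data set are made up for illustration, and the loop only stands in for what the map step would emit for a single key.

```javascript
// Local illustration only: this is not CouchDB's view server.

// A reduce that just hands the values back (the pattern to avoid).
function stackingReduce(keys, values) {
  return values;
}

// A reduce that collapses the values to a single number.
function summingReduce(keys, values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical data: 100 documents that each have an "address" field,
// so the map step emits ("address", 1) one hundred times.
var emitted = [];
for (var n = 0; n < 100; n++) {
  emitted.push(1);
}

var stacked = stackingReduce(null, emitted);
var counted = summingReduce(null, emitted);

console.log(stacked.length); // 100 entries, one per emitted row
console.log(counted);        // 100, a single number however many rows there are
```

The stacked result grows with every row that shares the key, while the summed result stays a single number.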

But on the whole, you are correct, and the sum() helper you point out
is the way to do the advanced query.

I should note that these examples assume you query the reduce with
group=true (which is the default query option used by Futon).
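
As a rough sketch of what group=true changes, the following simulates a view query in plain JavaScript. Everything here is an assumption for illustration: runView, localSum and the sample documents are hypothetical, emit is passed in as an argument instead of being the global that CouchDB provides, keys are sorted with a plain string sort, and the sample documents leave out _id and _rev.

```javascript
// Local simulation only: this is not how CouchDB is implemented.

// Stand-in for the sum() helper available inside CouchDB view functions.
function localSum(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Hypothetical helper: run a map and a reduce over in-memory documents.
function runView(docs, mapFn, reduceFn, group) {
  var rows = [];
  function emit(key, value) {
    rows.push({ key: key, value: value });
  }
  docs.forEach(function (doc) {
    mapFn(doc, emit);
  });

  if (!group) {
    // Without grouping, every row is reduced into one value.
    var all = rows.map(function (r) { return r.value; });
    return [{ key: null, value: reduceFn(null, all) }];
  }

  // With grouping, rows are reduced separately for each distinct key.
  var byKey = Object.create(null);
  rows.forEach(function (r) {
    if (!(r.key in byKey)) {
      byKey[r.key] = [];
    }
    byKey[r.key].push(r.value);
  });
  return Object.keys(byKey).sort().map(function (key) {
    return { key: key, value: reduceFn(null, byKey[key]) };
  });
}

// Same map as in the thread, except that emit is a parameter here.
function fieldMap(doc, emit) {
  for (var key in doc) {
    emit(key, 1);
  }
}

function countReduce(keys, values) {
  return localSum(values);
}

// Made-up sample documents.
var docs = [
  { name: "a", city: "x", address: "1 Main St" },
  { name: "b", city: "y" },
  { city: "z" }
];

var grouped = runView(docs, fieldMap, countReduce, true);
var ungrouped = runView(docs, fieldMap, countReduce, false);

console.log(JSON.stringify(grouped));
// [{"key":"address","value":1},{"key":"city","value":3},{"key":"name","value":2}]
console.log(JSON.stringify(ungrouped));
// [{"key":null,"value":6}]
```

With grouping, the keys of the result are the unique field names, which is what the original question asked for. Without grouping, you only get one total across all fields.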

> The values for each key are stacked together in an array, all good, we
> can work with that...
> Since we want to add them all together, we could step through each one
> adding them up...
>
> function(keys, values) {
>  var total = 0
>  for (var k in values) {
>    total = total + values[k]
>  }
>  return total;
> }
>
> or more simply...
>
> function(keys, values) {
>  return sum(values)
> }
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
