incubator-couchdb-user mailing list archives

From Blair Nilsson <blair.nils...@gmail.com>
Subject Re: find all unique field names
Date Fri, 29 May 2009 01:39:52 GMT
On Fri, May 29, 2009 at 1:09 PM, Chris Anderson <jchris@apache.org> wrote:
> On Thu, May 28, 2009 at 4:26 PM, Blair Nilsson <blair.nilsson@gmail.com> wrote:
>> On Fri, May 29, 2009 at 10:25 AM, Blair Nilsson <blair.nilsson@gmail.com> wrote:
>>> On Fri, May 29, 2009 at 9:20 AM, Douglas Fils <fils@iastate.edu> wrote:
>>>> Forgive the noob question..  but I've not been able to easily locate an
>>>> approach today to getting a return that gives all the unique field names
>>>> in a couch database.
>>>>
>>>> It's not too hard to generate a map function that emits an array of the
>>>> field names in a particular record....
>>>> (please note this is about as much JS as I have ever written)  :)
>>>> function(doc) {
>>>>  var i = 0;
>>>>  var keyNames = new Array();
>>>>  for (var key in doc) {
>>>>    keyNames[i] = key
>>>>    i++;
>>>>  }
>>>>  emit(null,keyNames);
>>>> }
>>>>
>>>> However, once I pass that over to the reduce (assuming this is even the way
>>>> to do it) I don't see an easy way to get the unique intersection of the
>>>> various field names.
>>>>
>>>> Any help would be appreciated...
>>>> Thanks
>>>> Doug
>>>>
>>>>
>>>
>>> maybe the map should be
>>>
>>> function(doc) {
>>>  for (var key in doc) {
>>>   emit(key,"")
>>>  }
>>> }
>>>
>>> and the reduce
>>> function(keys,values) {
>>>  return null;
>>> }
>>>
>>> and just use the returned keys as the field names.
>>>
>>> --- Blair
>>>
>>
>> Actually this can be a good demonstration of the reduce function.
>>
>> Say we were trying to solve a slightly more complicated version of this,
>> one where we were trying to get the number of times the field name is
>> used.
>>
>> We want the results by field name, so we use that as our key, and we
>> use 1 as our value, since for each emit, we have 1 use of that field
>> name.
>>
>> function(doc) {
>>  for (var key in doc) {
>>   emit(key,1)
>>  }
>> }
>>
>> which would give us
>>
>> address : 1
>> city : 1
>> city : 1
>> city : 1
>> name : 1
>> name : 1
>>
>> etc...
>>
>> by putting in a reduce function, even if it didn't really do anything,
>> the results will get stacked together by key...
>>
>> function(keys, values) {
>>  return values;
>> }
>>
>> would give us....
>>
>> address : [1]
>> city : [1,1,1]
>> name : [1,1]
>>
>
> Good examples overall, Blair. Thanks for the explanation. The one
> nitpick I'm compelled to point out is that one should never have a
> reduce function that just returns the values. The above list will end
> up with:
>
> address : [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> ...
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
>
> and that's with only 100 documents. You can imagine how hard it will be
> to manage this array when you are dealing with thousands or millions
> of rows.
>
> But on the whole, you are correct, and the sum() helper you point out
> is the way to do the advanced query.
>
> I should note that these examples assume you query the reduce with
> group=true (which is the default query option used by Futon).
>
>> The values for each key are stacked together in an array, all good, we
>> can work with that...
>> Since we want to add them all together, we could step through each one
>> adding them up...
>>
>> function(keys, values) {
>>  var total = 0
>>  for (var k in values) {
>>    total = total + values[k]
>>  }
>>  return total;
>> }
>>
>> or more simply...
>>
>> function(keys, values) {
>>  return sum(values)
>> }
>>
>
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>

Agreed, it was an intermediate step to show how the reduce function
worked. I should do an example showing re-reduce. Maybe the mailing
list isn't the right place though. Maybe it's time I actually started
blogging :)
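
A rough sketch of what such a re-reduce example might look like, run
locally in plain JavaScript rather than inside CouchDB. The sample
documents, the names mapDoc, reduceCounts and countFields, and the chunk
size are all made up for illustration; sum() here is a local stand-in for
the view server's helper. The idea is that CouchDB may reduce the rows for
a key in several batches and then call the reduce again on the partial
results with rereduce set to true, so the function has to give the right
answer in both cases. Summing does.

```javascript
// Hypothetical sample documents, not taken from the thread.
const docs = [
  { _id: "a", name: "Ann", city: "Ames" },
  { _id: "b", name: "Bob", city: "Boone", address: "1 Main St" },
  { _id: "c", city: "Cedar Falls" },
];

// Map step from the thread: one (fieldName, 1) row per field in the doc.
function mapDoc(doc, emit) {
  for (const key in doc) {
    emit(key, 1);
  }
}

// Local stand-in for the sum() helper available to CouchDB reduce functions.
function sum(values) {
  return values.reduce((total, v) => total + v, 0);
}

// Reduce with the third argument. On a first pass, values are the emitted 1s;
// on a re-reduce pass, values are partial totals. Summing is correct for both.
function reduceCounts(keys, values, rereduce) {
  return sum(values);
}

// Simulate grouping by key, reducing in batches of chunkSize, then re-reducing
// the partial results. chunkSize is an arbitrary illustrative value.
function countFields(allDocs, chunkSize) {
  const rowsByKey = new Map();
  for (const doc of allDocs) {
    mapDoc(doc, (key, value) => {
      if (!rowsByKey.has(key)) rowsByKey.set(key, []);
      rowsByKey.get(key).push(value);
    });
  }

  const result = {};
  for (const [key, values] of rowsByKey) {
    const partials = [];
    for (let i = 0; i < values.length; i += chunkSize) {
      const chunk = values.slice(i, i + chunkSize);
      // First pass: rereduce is false.
      partials.push(reduceCounts(chunk.map(() => key), chunk, false));
    }
    // Second pass: keys is null and rereduce is true.
    result[key] = reduceCounts(null, partials, true);
  }
  return result;
}

const fieldCounts = countFields(docs, 2);
console.log(fieldCounts);
```

A reduce that just returns values would fail here: on the re-reduce pass
it would be handed arrays of arrays instead of numbers, which is the
growth problem Chris describes above.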

BTW, the couch.io hosting service is going to be so very useful, I'll
be a paying customer for that soon enough.
