couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "View Snippets" by PaulDavis
Date Mon, 29 Dec 2008 20:17:06 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by PaulDavis:
http://wiki.apache.org/couchdb/View_Snippets

The comment on the change is:
Added a top N tags example.

------------------------------------------------------------------------------
  
  In SQL this would be something along the lines of {{{SELECT num_attachments FROM table GROUP
BY num_attachments}}} (but this would give extra output for rows containing more than one
attachment).
  
+ == Retrieve the top N tags. ==
+ 
+ This snippet assumes your docs have a top-level `tags` element that is an array of strings.
In theory it would work with an array of anything, but that has not been tested.
+ 
+ Use a standard counting emit function:
+ 
+ {{{
+ function(doc)
+ {
+     for(var idx in doc.tags)
+     {
+         emit(doc.tags[idx], 1);
+     }
+ }
+ }}}
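
To see what this map function emits, it can be run outside CouchDB with a stand-in `emit`. The sketch below is only an illustration: the `emit` stub, the `rows` array, the `map` name, and the sample document are hypothetical and are not part of CouchDB's view server.

```javascript
// Hypothetical stand-in for CouchDB's emit(); it just records rows locally.
var rows = [];
function emit(key, value) {
    rows.push({key: key, value: value});
}

// The map function from above, given a name so it can be called directly.
function map(doc) {
    for (var idx in doc.tags) {
        emit(doc.tags[idx], 1);
    }
}

// Hypothetical sample document with a top-level tags array.
map({_id: "doc1", tags: ["couchdb", "views", "couchdb"]});

console.log(JSON.stringify(rows));
// prints [{"key":"couchdb","value":1},{"key":"views","value":1},{"key":"couchdb","value":1}]
```

Each tag occurrence produces one row with a value of 1, so a tag that appears twice in one document is counted twice.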
+ 
+ In the reduce function below, `MAX` is the number of tags to return. Note that this snippet relies on an
implementation artifact: CouchDB sends keys to the reduce function in sorted order.
If that ever stops being true, the snippet will break subtly. Buyer beware!
+ 
+ {{{
+ function(keys, values, rereduce)
+ {
+     var MAX = 3;
+ 
+     /*
+         Basically we're just kind of faking a priority queue. We
+         do have one caveat in that we may process a single key
+         across reduce calls. I'm reasonably certain that even so
+         we'll still be processing keys in collation order in
+         which case we can just keep the last key from the previous
+         non-rereduce in our return value. Should work itself out
+         in the rereduces though when we don't keep the extras
+         around.
+     */
+ 
+     var tags = {};
+     var lastkey = null;
+     if(!rereduce)
+     {
+         /*
+             I could probably alter the view output to produce
+             a slightly different output so that this code
+             could get pushed into the same code as below, but
+             I figure that the view output might be used for
+             other reduce functions.
+ 
+             This just creates an object {tag1: N, tag2: M, ...}
+         */
+         
+         for(var k in keys)
+         {
+             if(tags[keys[k][0]]) tags[keys[k][0]] += values[k];
+             else tags[keys[k][0]] = values[k];
+         }
+         lastkey = keys[keys.length-1][0];
+     }
+     else
+     {
+         /*
+             This just takes a collection of objects that have
+             (tag, count) key/value pairs and merges into a
+             single object.
+         */
+         tags = values[0];
+         for(var v = 1; v < values.length; v++)
+         {
+             for(var t in values[v])
+             {
+                 if(tags[t]) tags[t] += values[v][t];
+                 else tags[t] = values[v][t];
+             }
+         }
+     }
+ 
+     /*
+         This code just removes the tags that are outside
+         the top N tags. When rereduce is false we may
+         keep the last key passed to us because it's
+         possible that we only processed part of its
+         data.
+     */
+     var top = [];
+     for(var t in tags){top[top.length] = [t, tags[t]];}
+     function sort_tags(a, b) {return b[1] - a[1];}
+     top.sort(sort_tags);
+     for(var n = MAX; n < top.length; n++)
+     {
+         if(top[n][0] != lastkey) delete tags[top[n][0]];
+     }
+ 
+     // And done.
+     return tags;
+ }
+ }}}
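
The effect of keeping the last key can be tried locally. The sketch below is a condensed restatement of the reduce function for experimentation, not CouchDB's view server. It assumes the rows for one tag (`d`) are split across two reduce calls, which is the case the comments above describe. The `makeRows` helper, the doc ids, and the sample counts are hypothetical.

```javascript
// Condensed restatement of the reduce function above, for local experiments only.
function reduce(keys, values, rereduce) {
    var MAX = 3;
    var tags = {};
    var lastkey = null;
    if (!rereduce) {
        // keys are [emitted_key, doc_id] pairs; sum the counts per tag.
        for (var k = 0; k < keys.length; k++) {
            var tag = keys[k][0];
            tags[tag] = (tags[tag] || 0) + values[k];
        }
        lastkey = keys[keys.length - 1][0];
    } else {
        // values are {tag: count} objects from earlier calls; merge them.
        for (var v = 0; v < values.length; v++) {
            for (var t in values[v]) {
                tags[t] = (tags[t] || 0) + values[v][t];
            }
        }
    }
    // Keep the MAX highest counts (plus lastkey on a non-rereduce call).
    var top = [];
    for (var name in tags) top.push([name, tags[name]]);
    top.sort(function (a, b) { return b[1] - a[1]; });
    for (var n = MAX; n < top.length; n++) {
        if (top[n][0] != lastkey) delete tags[top[n][0]];
    }
    return tags;
}

// Hypothetical helper: build key-ordered [tag, doc_id] keys and matching values.
function makeRows(counts) {
    var keys = [], values = [];
    for (var i = 0; i < counts.length; i++) {
        for (var j = 0; j < counts[i][1]; j++) {
            keys.push([counts[i][0], "doc" + j]);
            values.push(1);
        }
    }
    return {keys: keys, values: values};
}

// Tag "d" is split across the two batches: 1 row in the first, 4 in the second.
var first = makeRows([["a", 4], ["b", 3], ["c", 2], ["d", 1]]);
var second = makeRows([["d", 4], ["e", 1]]);
var partial1 = reduce(first.keys, first.values, false);
var partial2 = reduce(second.keys, second.values, false);
var result = reduce(null, [partial1, partial2], true);

console.log(JSON.stringify(partial1)); // prints {"a":4,"b":3,"c":2,"d":1}
console.log(JSON.stringify(result));   // prints {"a":4,"b":3,"d":5}
```

In this run `d` ranks fourth in the first batch and survives only because it is the last key. Its five rows then make it the top tag after the rereduce.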
+ 
+ There's probably a more efficient method to get the priority queue stuff, but I was going
for simplicity over speed.
+ 
+ When querying this reduce you should not use the `group` or `group_level` query string parameters.
The returned reduce value will be an object with the top `MAX` tag: count pairs.
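
Because the reduce value is an object rather than an ordered list, and a non-rereduce call may keep the last key in addition to the top `MAX` tags, a client may want to sort and trim the result itself. The sketch below shows one way to do that. The `topTags` helper and the sample value are hypothetical and are not part of CouchDB.

```javascript
// Hypothetical helper: turn the reduce value into a sorted [tag, count] list.
function topTags(reduceValue, max) {
    var pairs = [];
    for (var tag in reduceValue) {
        // Skip entries without a numeric count.
        if (typeof reduceValue[tag] === "number") {
            pairs.push([tag, reduceValue[tag]]);
        }
    }
    // Highest count first, then keep at most max entries.
    pairs.sort(function (a, b) { return b[1] - a[1]; });
    return pairs.slice(0, max);
}

// Hypothetical reduce value, shaped like the object the reduce function returns.
var sample = {couchdb: 12, erlang: 7, json: 9, views: 2};
console.log(JSON.stringify(topTags(sample, 3)));
// prints [["couchdb",12],["json",9],["erlang",7]]
```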
+ 
