couchdb-commits mailing list archives

From Apache Wiki <>
Subject [Couchdb Wiki] Update of "View_Snippets" by SebastianCohnen
Date Sun, 02 May 2010 18:07:55 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "View_Snippets" page has been changed by SebastianCohnen.
The comment on this change is: added first level heading; added TOC; removed unnecessary anchors.


+ = View Snippets =
+ <<TableOfContents()>>
  This page collects code snippets to be used in your [[Views]]. They are mainly meant to
help get your head around the map/reduce approach to accessing database content. Keep in mind
that the Futon web client silently adds group=true to your views.
-   * [[#common_mistakes|Common mistakes]]
-   * [[#get_doc_id|Get docs with a particular user id ]]
-   * [[#get_doc_with_attachment|Get all documents which have an attachment ]]
-   * [[#count_doc_with_attachment|Count documents with and without an attachment]]
-   * [[#list_unique_values|Generating a list of unique values]]
-   * [[#top_n_tags|Retrieve the top N tags]]
-   * [[#aggregate_sum|Joining an aggregate sum along with related data ]]
-   * [[#standard_deviation|Computing the standard deviation]]
-   * [[#summary_stats|Computing simple summary statistics (min,max,mean,standard deviation)]]
-   * [[#interactive_couchdb|Interactive CouchDB Tutorial]]
-   * [[#documents_without_a_field|Retrieving documents without a certain field]]
-   * [[#geospatial_indexes|Using views to sort documents geographically]]
- <<Anchor(common_mistakes)>>
  == Common mistakes ==
  When creating a reduce function, a re-reduce should behave in the same way as the regular
reduce. The reason is that CouchDB doesn't necessarily call re-reduce on your map results.
  Think about it this way: If you have a bunch of values V1 V2 V3 for key K, then you can
get the combined result either by calling reduce([K,K,K],[V1,V2,V3],0) or by re-reducing the
individual results: reduce(null,[R1,R2,R3],1). This depends on what your view results look
like internally.
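
As an illustration of the point above, here is a minimal sketch (not taken from the wiki page) of two reduce functions run outside CouchDB. The names `sumReduce` and `countReduce` and the sample values are hypothetical. A sum can ignore the `rereduce` flag because adding partial sums is the same operation as adding raw values; a count cannot, because on re-reduce the values are already partial counts.

```javascript
// Hypothetical reduce functions using the (keys, values, rereduce)
// signature described above. They are called directly here, not by CouchDB.

// Summing works unchanged for both passes: raw map values and
// partial results are combined the same way.
function sumReduce(keys, values, rereduce) {
  return values.reduce(function (acc, v) { return acc + v; }, 0);
}

// Counting must branch: on the first pass we count rows, on re-reduce
// we add up the partial counts instead of counting them.
function countReduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce(function (acc, v) { return acc + v; }, 0);
  }
  return values.length;
}

// One call over all values for key K ...
var directSum = sumReduce(["K", "K", "K"], [1, 2, 3], false);
var directCount = countReduce(["K", "K", "K"], [1, 2, 3], false);

// ... versus reducing two chunks and re-reducing the partial results.
var partialSums = [
  sumReduce(["K", "K"], [1, 2], false),
  sumReduce(["K"], [3], false)
];
var partialCounts = [
  countReduce(["K", "K"], [1, 2], false),
  countReduce(["K"], [3], false)
];
var combinedSum = sumReduce(null, partialSums, true);
var combinedCount = countReduce(null, partialCounts, true);
```

Both paths should give the same answer. If `countReduce` returned `values.length` on re-reduce as well, the combined count would be the number of chunks rather than the number of rows.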
- <<Anchor(get_doc_id)>>
  == Get docs with a particular user id ==
@@ -35, +25 @@

  Then query with key=USER_ID to get all the rows that match that user.
- <<Anchor(get_doc_with_attachment)>>
  == Get all documents which have an attachment ==
  This lists only the documents which have an attachment.
@@ -50, +40 @@

  In SQL this would be something like {{{SELECT id FROM table WHERE attachment IS NOT NULL}}}.
- <<Anchor(count_doc_with_attachment)>>
  == Count documents with and without an attachment ==
  Call this with ''group=true'' or you only get the combined number of documents with and
without attachments.
@@ -80, +70 @@

  In SQL this would be something along the lines of {{{SELECT num_attachments FROM table GROUP
BY num_attachments}}} (but this would give extra output for rows containing more than one
- <<Anchor(list_unique_values)>>
  == Generating a list of unique values ==
  Here we use the fact that the key for a view result can be an array. Suppose you have a
map that generates (key, value) pairs with many duplicates and you want to remove the duplicates.
To do so, use ([key, value], null) as the map output.
@@ -124, +114 @@

  If you then want to know the total count for each parent, you can use the ''group_level''
view parameter:
- <<Anchor(top_n_tags)>>
  == Retrieve the top N tags ==
  This snippet assumes your docs have a top-level tags element that is an array of strings.
In theory it would work with an array of anything, but that has not been tested.
@@ -223, +213 @@

  When querying this reduce you should not use the `group` or `group_level` query string parameters.
The returned reduce value will be an object with the top `MAX` tag: count pairs.
- <<Anchor(aggregate_sum)>>
  == Joining an aggregate sum along with related data ==
  Here is a modified example from the [[View_collation|View collation]] page.  Note that `group_level`
needs to be set to `1` for it to return a meaningful `customer_details`.
@@ -261, +251 @@

- <<Anchor(standard_deviation)>>
  == Computing the standard deviation ==
  This example is from the CouchDB test suite. It is '''much''' easier and less complex than the
following example ([[#summary_stats|Computing simple summary statistics (min,max,mean,standard
deviation)]]), although it does not calculate min, max, and mean (but this should be an easy
@@ -311, +300 @@

- <<Anchor(summary_stats)>>
  == Computing simple summary statistics (min,max,mean,standard deviation)  ==
  This implementation of standard deviation is more complex than the algorithm above, which
Chan, Golub, and Le``Veque call the "textbook one-pass algorithm". While the textbook algorithm
is mathematically equivalent to the standard two-pass computation of standard deviation, it can
be numerically unstable under certain conditions. Specifically, if the square of the sum and the
sum of the squares are both large, then they will be computed with some rounding error. If the
variance of the data set is small, then subtracting those two large numbers (which have been
rounded off slightly) might wipe out the computation of the variance. See,, and the Wikipedia description of Knuth's
@@ -706, +695 @@

  For example: you can now query your view and retrieve all documents that do not contain
the field `role` (view/NAME/?key="role").
- <<Anchor(geospatial_indexes)>>
  == Using views to sort documents geographically ==
  If you use latitude/longitude information in your documents, it's not easy to sort
by proximity to a given point using the normal approach (a key of [<latitude>,
<longitude>]). Latitude and longitude lie on different axes, which doesn't map well
onto CouchDB's index sorting, which is linear. However, using a
[[|geohash]] may solve this, by letting you convert the
coordinates of a location into a string that sorts well (e.g., locations that are close share
a common prefix).
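
To make the prefix property concrete, here is a minimal sketch of the standard geohash encoding (alternating longitude and latitude bits, five bits per base-32 character). It is an illustration only, not code from the wiki page; the function name `geohashEncode` and the sample coordinates are assumptions.

```javascript
// Base-32 alphabet used by geohash (no a, i, l, o).
var BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Hypothetical helper: encode a latitude/longitude pair as a geohash
// string of `precision` characters.
function geohashEncode(lat, lon, precision) {
  var latRange = [-90, 90];
  var lonRange = [-180, 180];
  var hash = "";
  var bits = 0;
  var bitCount = 0;
  var useLon = true; // bits alternate, starting with longitude

  while (hash.length < precision) {
    var range = useLon ? lonRange : latRange;
    var value = useLon ? lon : lat;
    var mid = (range[0] + range[1]) / 2;
    if (value >= mid) {
      bits = (bits << 1) | 1; // upper half of the interval
      range[0] = mid;
    } else {
      bits = bits << 1;       // lower half of the interval
      range[1] = mid;
    }
    useLon = !useLon;
    bitCount += 1;
    if (bitCount === 5) {     // five bits make one base-32 character
      hash += BASE32.charAt(bits);
      bits = 0;
      bitCount = 0;
    }
  }
  return hash;
}

// Two nearby points and one far away.
var here = geohashEncode(57.64911, 10.40744, 8);
var nearby = geohashEncode(57.6495, 10.408, 8);
var farAway = geohashEncode(-33.87, 151.21, 8);
```

Here `here` and `nearby` begin with the same characters (`u4p`), while `farAway` differs from the first character on. A map function could emit such a string as the view key so that a key range over a prefix returns documents in the same cell. Note that two points on opposite sides of a cell boundary can be close together and still have different prefixes.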
