Date: Wed, 06 Jun 2018 03:02:48 +0000
From: kocolosk@apache.org
To: "commits@couchdb.apache.org"
Reply-To: dev@couchdb.apache.org
Subject: [couchdb-documentation] 01/02: Expand the existing docs on builtin reducers

This is an automated email from the ASF dual-hosted git repository.

kocolosk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git

commit b99ccc9076d9ce7d37d6987b96575f744b839b43
Author: Adam Kocoloski
AuthorDate: Mon May 28 14:28:00 2018 -0400

    Expand the existing docs on builtin reducers

    In particular, a number of advanced behaviors for _stats and _sum were
    not documented previously.
---
 src/ddocs/ddocs.rst | 102 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 93 insertions(+), 9 deletions(-)

diff --git a/src/ddocs/ddocs.rst b/src/ddocs/ddocs.rst
index b12a94e..306023d 100644
--- a/src/ddocs/ddocs.rst
+++ b/src/ddocs/ddocs.rst
@@ -116,17 +116,16 @@ single value - which could be an array or similar object.
 Built-in Reduce Functions
 ^^^^^^^^^^^^^^^^^^^^^^^^^

-Additionally, CouchDB has three built-in reduce functions. These are implemented
-in Erlang and run inside CouchDB, so they are much faster than the equivalent
-JavaScript functions: ``_sum``, ``_count`` and ``_stats``. Their equivalents in
-JavaScript:
+Additionally, CouchDB has a set of built-in reduce functions. These are
+implemented in Erlang and run inside CouchDB, so they are much faster than the
+equivalent JavaScript functions.

-.. code-block:: javascript
+.. data:: _count

-    // could be replaced by _sum
-    function(keys, values) {
-        return sum(values);
-    }
+Counts the number of values in the index with a given key. This could be
+implemented in JavaScript as:
+
+.. code-block:: javascript

     // could be replaced by _count
     function(keys, values, rereduce) {
@@ -137,6 +136,16 @@ JavaScript:
         }
     }

+.. data:: _stats
+
+Computes the following quantities for numeric values associated with each key:
+``sum``, ``min``, ``max``, ``count``, and ``sumsqr``. The behavior of the
+``_stats`` function varies depending on the output of the map function. The
+simplest case is when the map phase emits a single numeric value for each key.
+In this case the ``_stats`` function is equivalent to the following JavaScript:
+
+.. code-block:: javascript

     // could be replaced by _stats
     function(keys, values, rereduce) {
         if (rereduce) {
@@ -166,6 +175,81 @@ JavaScript:
         }
     }

+The ``_stats`` function will also work with "pre-aggregated" values from a map
+phase. A map function that emits an object containing ``sum``, ``min``, ``max``,
+``count``, and ``sumsqr`` keys and numeric values for each can use the
+``_stats`` function to combine these results with the data from other documents.
+The emitted object may contain other keys (these are ignored by the reducer),
+and it is also possible to mix raw numeric values and pre-aggregated objects
+in a single view and obtain the correct aggregated statistics.
+
+Finally, ``_stats`` can operate on key-value pairs where each value is an array
+comprised of numbers or pre-aggregated objects. In this case **every** value
+emitted from the map function must be an array, and the arrays must all be the
+same length, as ``_stats`` will compute the statistical quantities above
+*independently* for each element in the array. Users who want to compute
+statistics on multiple values from a single document should either ``emit`` each
+value into the index separately, or compute the statistics for the set of values
+using the JavaScript example above and emit a pre-aggregated object.
+
+.. data:: _sum
+
+In its simplest variation, ``_sum`` sums the numeric values associated with each
+key, as in the following JavaScript:
+
+.. code-block:: javascript
+
+    // could be replaced by _sum
+    function(keys, values) {
+        return sum(values);
+    }
+
+As with ``_stats``, the ``_sum`` function offers a number of extended
+capabilities. The ``_sum`` function requires that map values be numbers, arrays
+of numbers, or objects. When presented with array output from a map function,
+``_sum`` will compute the sum for every element of the array.
+A bare numeric
+value will be treated as an array with a single element, and arrays with fewer
+elements will be treated as if they contained zeroes for every additional
+element in the longest emitted array. As an example, consider the following map
+output:
+
+.. code-block:: javascript
+
+    {"total_rows":5, "offset":0, "rows": [
+      {"id":"id1", "key":"abc", "value": 2},
+      {"id":"id2", "key":"abc", "value": [3,5,7]},
+      {"id":"id2", "key":"def", "value": [0,0,0,42]},
+      {"id":"id2", "key":"ghi", "value": 1},
+      {"id":"id1", "key":"ghi", "value": 3}
+    ]}
+
+The ``_sum`` for this output without any grouping would be:
+
+.. code-block:: javascript
+
+    {"rows": [
+      {"key":null, "value": [9,5,7,42]}
+    ]}
+
+while the grouped output would be
+
+.. code-block:: javascript
+
+    {"rows": [
+      {"key":"abc", "value": [5,5,7]},
+      {"key":"def", "value": [0,0,0,42]},
+      {"key":"ghi", "value": 4}
+    ]}
+
+This is in contrast to the behavior of the ``_stats`` function which requires
+that all emitted values be arrays of identical length if any array is emitted.
+
+It is also possible to have ``_sum`` recursively descend through an emitted
+object and compute the sums for every field in the object. Objects *cannot* be
+mixed with other data structures. Objects can be arbitrarily nested, provided
+that the values for all fields are themselves numbers, arrays of numbers, or
+objects.
+
 .. note::
     **Why don't reduce functions support CommonJS modules?**

-- 
To stop receiving notification emails like this one, please contact
kocolosk@apache.org.
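
As a rough illustration of the ``_sum`` array handling described in the diff above, the following JavaScript sketch treats a bare number as a one-element array and counts missing positions in shorter arrays as zero. It is not CouchDB's Erlang implementation: ``sumLikeBuiltin`` is a hypothetical helper name, object values are not handled, and returning a bare number when every input is a bare number is an assumption inferred from the ``ghi`` row of the grouped output.

```javascript
// Hypothetical sketch of the _sum array semantics; NOT CouchDB's Erlang code.
// Assumes every value is a number or an array of numbers (objects unsupported).
function sumLikeBuiltin(values) {
    var sawArray = false;
    var totals = [];
    values.forEach(function (value) {
        var elements;
        if (Array.isArray(value)) {
            sawArray = true;
            elements = value;
        } else {
            // A bare number is treated as a one-element array.
            elements = [value];
        }
        elements.forEach(function (n, i) {
            // Missing positions in shorter arrays count as zero.
            totals[i] = (totals[i] || 0) + n;
        });
    });
    // Assumption: all-numeric input yields a bare number, as in the "ghi" row.
    if (!sawArray) {
        return totals.length ? totals[0] : 0;
    }
    return totals;
}

// Map values from the example: ungrouped, then two of the per-key groups.
var all = sumLikeBuiltin([2, [3, 5, 7], [0, 0, 0, 42], 1, 3]);
var abc = sumLikeBuiltin([2, [3, 5, 7]]);
var ghi = sumLikeBuiltin([1, 3]);
console.log(JSON.stringify(all), JSON.stringify(abc), ghi);
```

Run against the map output quoted in the diff, the sketch reproduces the ungrouped result and the ``abc`` and ``ghi`` rows of the grouped result.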