From commits-return-100809-archive-asf-public=cust-asf.ponee.io@lucene.apache.org Tue May 8 23:14:56 2018
Return-Path:
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by mx-eu-01.ponee.io (Postfix) with SMTP id 356591807A5
    for ; Tue, 8 May 2018 23:14:54 +0200 (CEST)
Received: (qmail 16756 invoked by uid 500); 8 May 2018 21:14:49 -0000
Mailing-List: contact commits-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@lucene.apache.org
Delivered-To: mailing list commits@lucene.apache.org
Received: (qmail 16096 invoked by uid 99); 8 May 2018 21:14:48 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 08 May 2018 21:14:48 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33)
    id C7961F6C79; Tue, 8 May 2018 21:14:47 +0000 (UTC)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: ab@apache.org
To: commits@lucene.apache.org
Date: Tue, 08 May 2018 21:15:10 -0000
Message-Id:
In-Reply-To: <871f5f239d3a44bfbb1af4abe3af8906@git.apache.org>
References: <871f5f239d3a44bfbb1af4abe3af8906@git.apache.org>
X-Mailer: ASF-Git Admin Mailer
Subject: [24/50] [abbrv] lucene-solr:jira/solr-11779: SOLR-8998: ref guide update.

SOLR-8998: ref guide update.

Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/df713fc7
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/df713fc7
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/df713fc7

Branch: refs/heads/jira/solr-11779
Commit: df713fc70009733afed84484298326b15f963d15
Parents: 46ecb73
Author: Mikhail Khludnev
Authored: Wed May 2 18:23:08 2018 +0300
Committer: Mikhail Khludnev
Committed: Wed May 2 18:23:15 2018 +0300

----------------------------------------------------------------------
 solr/solr-ref-guide/src/blockjoin-faceting.adoc |  2 +
 solr/solr-ref-guide/src/json-facet-api.adoc     | 69 +++++++++++++++++++-
 solr/solr-ref-guide/src/other-parsers.adoc      |  2 +-
 3 files changed, 70 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/blockjoin-faceting.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/blockjoin-faceting.adoc b/solr/solr-ref-guide/src/blockjoin-faceting.adoc
index 41030d5..7e2408b 100644
--- a/solr/solr-ref-guide/src/blockjoin-faceting.adoc
+++ b/solr/solr-ref-guide/src/blockjoin-faceting.adoc
@@ -20,6 +20,8 @@ BlockJoin facets allow you to aggregate children facet counts by their parents.

 It is a common requirement that if a parent document has several children documents, all of them need to increment facet value count only once. This functionality is provided by `BlockJoinDocSetFacetComponent`, and `BlockJoinFacetComponent` just an alias for compatibility.

+CAUTION: This functionality is considered deprecated. Users are encouraged to use `uniqueBlock(\_root_)` aggregation under terms facet in <>.
+
 CAUTION: This component is considered experimental, and must be explicitly enabled for a request handler in `solrconfig.xml`, in the same way as any other <>.

 This example shows how you could add this search components to `solrconfig.xml` and define it in request handler:

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/json-facet-api.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/json-facet-api.adoc b/solr/solr-ref-guide/src/json-facet-api.adoc
index 90e197e..e71c8bd 100644
--- a/solr/solr-ref-guide/src/json-facet-api.adoc
+++ b/solr/solr-ref-guide/src/json-facet-api.adoc
@@ -59,7 +59,7 @@ The response to the facet request above will start with documents matching the r
 [[BucketingFacetExample]]
 === Bucketing Facet Example

-Here's an example of a bucketing facet, that partitions documents into bucket based on the `cat` field (short for category), and returns the top 5 buckets:
+Here's an example of a bucketing facet, that partitions documents into bucket based on the `cat` field (short for category), and returns the top 3 buckets:

 [source,bash]
 ----
@@ -342,7 +342,8 @@ Aggregation functions, also called *facet functions, analytic functions,* or **m
 |avg |avg(popularity) |average of numeric values
 |min |min(salary) |minimum value
 |max |max(mul(price,popularity)) |maximum value
-|unique |unique(author) |number of unique values
+|unique |unique(author) |number of unique values of the given field. Beyond 100 values it yields not exact estimate
+|uniqueBlock |uniqueBlock(\_root_) |same as above with smaller footprint strictly requires <>. The given field is expected to be unique across blocks, now only singlevalued string fields are supported, docValues are recommended.
 |hll |hll(author) |distributed cardinality estimate via hyper-log-log algorithm
 |percentile |percentile(salary,50,75,99,99.9) |Percentile estimates via t-digest algorithm. When sorting by this metric, the first percentile listed is used as the sort value.
 |sumsq |sumsq(rent) |sum of squares of field or function
@@ -449,6 +450,70 @@ And the response will look something like:

 By default "top authors" is defined by simple document count descending, but we could use our aggregation functions to sort by more interesting metrics.

+
+[[BlockJoinFacets]]
+== Block Join Facets
+
+Block Join Facets facets allow bucketing <> as attributes of their parents.
+
+[[Blockjoinfacetexample]]
+=== Block Join Facet example
+
+Suppose we have products with multiple SKUs, and we want to count products for each color.
+
+[source,java]
+----
+{
+    "id": "1", "type": "product", "name": "Solr T-Shirt",
+    "_childDocuments_": [
+      { "id": "11", "type": "SKU", "color": "Red", "size": "L" },
+      { "id": "12", "type": "SKU", "color": "Blue", "size": "L" },
+      { "id": "13", "type": "SKU", "color": "Red", "size": "M" },
+      { "id": "14", "type": "SKU", "color": "Blue", "size": "S" }
+    ]
+  }
+
+----
+
+For *SKU domain* we can request
+
+[source,java]
+----
+  color: {
+    type: terms,
+    field: color,
+    limit: -1,
+    facet: {
+      productsCount: "uniqueBlock(_root_)"
+    }
+  }
+
+
+----
+
+and get
+
+[source,java]
+----
+
+  [...]
+  color:{
+    buckets:[
+      {
+        val:Red, count:2, productsCount:1
+      },
+      {
+        val:Blue, count:2, productsCount:1
+      }
+    ]
+  }
+----
+
+Please notice that `\_root_` is an internal field added by Lucene to each child document to reference on parent one.
+Aggregation `uniqueBlock(\_root_)` is functionally equivalent to `unique(\_root_)`, but is optimized for nested documents block structure.
+It's recommended to define `limit: -1` for `uniqueBlock` calculation, like in above example,
+since default value of `limit` parameter is `10`, while `uniqueBlock` is supposed to be much faster with `-1`.
+
 [[References]]
 == References

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/other-parsers.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc
index 1b82883..bcd28eb 100644
--- a/solr/solr-ref-guide/src/other-parsers.adoc
+++ b/solr/solr-ref-guide/src/other-parsers.adoc
@@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <>.
+There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been <>.

 The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: