lucene-commits mailing list archives

From a.@apache.org
Subject [24/50] [abbrv] lucene-solr:jira/solr-11779: SOLR-8998: ref guide update.
Date Tue, 08 May 2018 21:15:10 GMT
SOLR-8998: ref guide update.


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/df713fc7
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/df713fc7
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/df713fc7

Branch: refs/heads/jira/solr-11779
Commit: df713fc70009733afed84484298326b15f963d15
Parents: 46ecb73
Author: Mikhail Khludnev <mkhl@apache.org>
Authored: Wed May 2 18:23:08 2018 +0300
Committer: Mikhail Khludnev <mkhl@apache.org>
Committed: Wed May 2 18:23:15 2018 +0300

----------------------------------------------------------------------
 solr/solr-ref-guide/src/blockjoin-faceting.adoc |  2 +
 solr/solr-ref-guide/src/json-facet-api.adoc     | 69 +++++++++++++++++++-
 solr/solr-ref-guide/src/other-parsers.adoc      |  2 +-
 3 files changed, 70 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/blockjoin-faceting.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/blockjoin-faceting.adoc b/solr/solr-ref-guide/src/blockjoin-faceting.adoc
index 41030d5..7e2408b 100644
--- a/solr/solr-ref-guide/src/blockjoin-faceting.adoc
+++ b/solr/solr-ref-guide/src/blockjoin-faceting.adoc
@@ -20,6 +20,8 @@ BlockJoin facets allow you to aggregate children facet counts by their parents.
 
 It is a common requirement that if a parent document has several children documents, all
of them need to increment facet value count only once. This functionality is provided by `BlockJoinDocSetFacetComponent`,
and `BlockJoinFacetComponent` just an alias for compatibility.
 
+CAUTION: This functionality is considered deprecated. Users are encouraged to use the `uniqueBlock(\_root_)` aggregation under a terms facet in the <<json-facet-api.adoc#Blockjoinfacetexample,JSON Facet API>>.
+
 CAUTION: This component is considered experimental, and must be explicitly enabled for a
request handler in `solrconfig.xml`, in the same way as any other <<requesthandlers-and-searchcomponents-in-solrconfig.adoc#requesthandlers-and-searchcomponents-in-solrconfig,search
component>>.
 
 This example shows how you could add this search components to `solrconfig.xml` and define
it in request handler:

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/json-facet-api.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/json-facet-api.adoc b/solr/solr-ref-guide/src/json-facet-api.adoc
index 90e197e..e71c8bd 100644
--- a/solr/solr-ref-guide/src/json-facet-api.adoc
+++ b/solr/solr-ref-guide/src/json-facet-api.adoc
@@ -59,7 +59,7 @@ The response to the facet request above will start with documents matching
the r
 [[BucketingFacetExample]]
 === Bucketing Facet Example
 
-Here's an example of a bucketing facet, that partitions documents into bucket based on the
`cat` field (short for category), and returns the top 5 buckets:
+Here's an example of a bucketing facet that partitions documents into buckets based on the `cat` field (short for category) and returns the top 3 buckets:
 
 [source,bash]
 ----
@@ -342,7 +342,8 @@ Aggregation functions, also called *facet functions, analytic functions,*
or **m
 |avg |avg(popularity) |average of numeric values
 |min |min(salary) |minimum value
 |max |max(mul(price,popularity)) |maximum value
-|unique |unique(author) |number of unique values
+|unique |unique(author) |number of unique values of the given field. Beyond 100 values, it yields an estimate rather than an exact count.
+|uniqueBlock |uniqueBlock(\_root_) |same as above with a smaller footprint, but strictly requires a <<uploading-data-with-index-handlers.adoc#nested-child-documents,block index>>. The given field is expected to be unique across blocks. Currently only single-valued string fields are supported; docValues are recommended.
 |hll |hll(author) |distributed cardinality estimate via hyper-log-log algorithm
 |percentile |percentile(salary,50,75,99,99.9) |Percentile estimates via t-digest algorithm.
When sorting by this metric, the first percentile listed is used as the sort value.
 |sumsq |sumsq(rent) |sum of squares of field or function
@@ -449,6 +450,70 @@ And the response will look something like:
 
 By default "top authors" is defined by simple document count descending, but we could use
our aggregation functions to sort by more interesting metrics.
 
+
+[[BlockJoinFacets]]
+== Block Join Facets
+
+Block Join Facets allow bucketing <<uploading-data-with-index-handlers.adoc#nested-child-documents,child documents>> as attributes of their parents.
+
+[[Blockjoinfacetexample]]
+=== Block Join Facet example
+
+Suppose we have products with multiple SKUs, and we want to count products for each color.
+
+[source,java]
+----
+{
+  "id": "1", "type": "product", "name": "Solr T-Shirt",
+  "_childDocuments_": [
+    { "id": "11", "type": "SKU", "color": "Red",  "size": "L" },
+    { "id": "12", "type": "SKU", "color": "Blue", "size": "L" },
+    { "id": "13", "type": "SKU", "color": "Red",  "size": "M" },
+    { "id": "14", "type": "SKU", "color": "Blue", "size": "S" }
+  ]
+}
+----
+
+For the *SKU domain*, we can request:
+
+[source,java]
+----
+  color: {
+    type: terms,
+    field: color,
+    limit: -1,
+    facet: {
+      productsCount: "uniqueBlock(_root_)"
+    }
+  }
+
+
+----
+
+and get
+
+[source,java]
+----
+
+  [...]
+  color:{
+     buckets:[
+        {
+          val:Red, count:2, productsCount:1
+        },
+        {
+          val:Blue, count:2, productsCount:1
+        }
+     ]
+  }
+----
+
+Note that `\_root_` is an internal field that Solr adds to each child document to reference its parent.
+The aggregation `uniqueBlock(\_root_)` is functionally equivalent to `unique(\_root_)`, but is optimized for the block structure of nested documents.
+It's recommended to set `limit: -1` for `uniqueBlock` calculations, as in the example above,
+since the default value of the `limit` parameter is `10`, while `uniqueBlock` is expected to be much faster with `-1`.
+
 [[References]]
 == References
 

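To make the semantics of the Block Join Facet example in the json-facet-api.adoc diff above concrete, here is a minimal Python sketch that emulates a terms facet on `color` with a nested `uniqueBlock(_root_)` aggregation over the four SKU documents. It is not Solr code and says nothing about how Solr implements the aggregation: the function name `terms_facet_with_unique_block` is hypothetical, the `_root_` values are filled in by hand on the assumption that each child carries the id of its parent, and the order of tied buckets is an artifact of the sketch.

```python
from collections import defaultdict

# Child (SKU) documents from the example. The `_root_` values are written out
# by hand here, assuming each child references its parent product "1".
skus = [
    {"id": "11", "_root_": "1", "color": "Red", "size": "L"},
    {"id": "12", "_root_": "1", "color": "Blue", "size": "L"},
    {"id": "13", "_root_": "1", "color": "Red", "size": "M"},
    {"id": "14", "_root_": "1", "color": "Blue", "size": "S"},
]


def terms_facet_with_unique_block(docs, field, block_field="_root_"):
    """Bucket docs by `field`; per bucket, count docs and distinct blocks."""
    doc_counts = defaultdict(int)   # bucket value -> number of child docs
    block_ids = defaultdict(set)    # bucket value -> distinct parent ids
    for doc in docs:
        value = doc[field]
        doc_counts[value] += 1
        block_ids[value].add(doc[block_field])
    buckets = [
        {"val": value, "count": doc_counts[value], "productsCount": len(block_ids[value])}
        for value in doc_counts
    ]
    # Sort by document count, descending. The sort is stable, so tied
    # buckets keep the order in which their values were first seen.
    buckets.sort(key=lambda bucket: -bucket["count"])
    return buckets


buckets = terms_facet_with_unique_block(skus, "color")
print(buckets)
```

Each color matches two SKUs, but both belong to the same product, so `count` is 2 while `productsCount` is 1, which agrees with the response shown in the diff.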
http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/df713fc7/solr/solr-ref-guide/src/other-parsers.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/other-parsers.adoc b/solr/solr-ref-guide/src/other-parsers.adoc
index 1b82883..bcd28eb 100644
--- a/solr/solr-ref-guide/src/other-parsers.adoc
+++ b/solr/solr-ref-guide/src/other-parsers.adoc
@@ -24,7 +24,7 @@ Many of these parsers are expressed the same way as <<local-parameters-in-querie
 
 == Block Join Query Parsers
 
-There are two query parsers that support block joins. These parsers allow indexing and searching
for relational content that has been <<uploading-data-with-index-handlers.adoc#uploading-data-with-index-handlers,indexed
as nested documents>>.
+There are two query parsers that support block joins. These parsers allow indexing and searching
for relational content that has been <<uploading-data-with-index-handlers.adoc#nested-child-documents,
indexed as nested documents>>.
 
 The example usage of the query parsers below assumes these two documents and each of their
child documents have been indexed:
 

