Date: Fri, 15 Jan 2021 21:00:47 +0000
From: ctargett@apache.org
To: "commits@lucene.apache.org"
Reply-To: dev@lucene.apache.org
Subject: [lucene-solr] 02/02: Ref Guide: copy edits for 8.8 release
Message-Id: 
<20210115210047.533F78E7AD@gitbox.apache.org>

This is an automated email from the ASF dual-hosted git repository.

ctargett pushed a commit to branch branch_8x
in repository https://gitbox.apache.org/repos/asf/lucene-solr.git

commit b709448c4f52ab86cf1deaf279c22fa9e40f9cd2
Author: Cassandra Targett
AuthorDate: Fri Jan 15 14:54:25 2021 -0600

    Ref Guide: copy edits for 8.8 release
---
 solr/solr-ref-guide/src/format-of-solr-xml.adoc    |  2 +-
 solr/solr-ref-guide/src/highlighting.adoc          |  2 +-
 solr/solr-ref-guide/src/json-facet-api.adoc        | 44 +++++++++++----------
 solr/solr-ref-guide/src/learning-to-rank.adoc      |  4 +-
 solr/solr-ref-guide/src/metrics-reporting.adoc     |  2 +-
 ...onitoring-solr-with-prometheus-and-grafana.adoc |  8 ++--
 solr/solr-ref-guide/src/pagination-of-results.adoc | 11 ++++--
 solr/solr-ref-guide/src/solr-upgrade-notes.adoc    |  2 +-
 .../src/updating-parts-of-documents.adoc           |  2 +-
 9 files changed, 41 insertions(+), 36 deletions(-)

diff --git a/solr/solr-ref-guide/src/format-of-solr-xml.adoc b/solr/solr-ref-guide/src/format-of-solr-xml.adoc
index 3484dbb..6416c6c 100644
--- a/solr/solr-ref-guide/src/format-of-solr-xml.adoc
+++ b/solr/solr-ref-guide/src/format-of-solr-xml.adoc
@@ -140,7 +140,7 @@ For example, if the Solr node is running behind a proxy or in a cloud environmen
 `hostPort` is the port that the Solr instance wants other nodes to contact it at.
 +
 In the default `solr.xml` file, this is set to `${solr.port.advertise:0}`.
-If no port is passed via the `solr.xml` (i.e. `0`), then Solr will default to the port that jetty is listening on, defined by `${jetty.port}`.
+If no port is passed via the `solr.xml` (i.e., `0`), then Solr will default to the port that jetty is listening on, defined by `${jetty.port}`.

 `leaderVoteWait`::
 When SolrCloud is starting up, how long each Solr node will wait for all known replicas for that shard to be found before assuming that any nodes that haven't reported are down.
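The `hostPort` fallback described in the format-of-solr-xml.adoc hunk above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not Solr's implementation: `resolve_property` and `effective_host_port` are hypothetical helper names, `props` stands in for system properties, and only the `${name:default}` placeholder form quoted in the hunk is handled.

```python
import re

# Matches the `${name:default}` placeholder form quoted in the hunk,
# e.g. `${solr.port.advertise:0}`. Hypothetical helper, not Solr code.
_PLACEHOLDER = re.compile(r"^\$\{([^:}]+)(?::([^}]*))?\}$")


def resolve_property(value, props):
    """Return the property value, its inline default, or the literal text."""
    match = _PLACEHOLDER.match(value)
    if match is None:
        return value  # not a placeholder, use as-is
    name, default = match.group(1), match.group(2)
    if name in props:
        return props[name]
    if default is None:
        raise KeyError(name)  # no value and no inline default
    return default


def effective_host_port(host_port_setting, props):
    """Apply the documented rule: a resolved port of 0 means 'use jetty.port'."""
    port = int(resolve_property(host_port_setting, props))
    if port == 0:
        port = int(resolve_property("${jetty.port}", props))
    return port
```

With `props = {"jetty.port": "8983"}`, the default setting `${solr.port.advertise:0}` resolves to `0` and the sketch falls back to the jetty port; setting `solr.port.advertise` overrides it.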
diff --git a/solr/solr-ref-guide/src/highlighting.adoc b/solr/solr-ref-guide/src/highlighting.adoc
index a1ddd5b..aef915f 100644
--- a/solr/solr-ref-guide/src/highlighting.adoc
+++ b/solr/solr-ref-guide/src/highlighting.adoc
@@ -227,7 +227,7 @@ By default, the Unified Highlighter will usually pick the right offset source (s
 The offset source can be explicitly configured to one of: `ANALYSIS`, `POSTINGS`, `POSTINGS_WITH_TERM_VECTORS`, or `TERM_VECTORS`.

 `hl.fragAlignRatio`::
-This parameter influences where the first match (i.e. highlighted text) in a passage is positioned.
+This parameter influences where the first match (i.e., highlighted text) in a passage is positioned.
 The default value of `0.5` means to align the match to the middle.
 A value of `0.0` means to align the match to the left, while `1.0` to align it to the right.
 This setting is a best-effort hint, as there are a variety of factors.
diff --git a/solr/solr-ref-guide/src/json-facet-api.adoc b/solr/solr-ref-guide/src/json-facet-api.adoc
index 5e83807..e1e21c4e 100644
--- a/solr/solr-ref-guide/src/json-facet-api.adoc
+++ b/solr/solr-ref-guide/src/json-facet-api.adoc
@@ -188,42 +188,42 @@ include::{example-source-dir}JsonRequestApiTest.java[tag=solrj-json-terms-facet-
 [width="100%",cols="20%,90%",options="header",]
 |===
 |Parameter |Description
-|field |The field name to facet over.
-|offset |Used for paging, this skips the first N buckets. Defaults to 0.
-|limit |Limits the number of buckets returned. Defaults to 10.
-|sort |Specifies how to sort the buckets produced.
+|`field` |The field name to facet over.
+|`offset` |Used for paging, this skips the first N buckets. Defaults to 0.
+|`limit` |Limits the number of buckets returned. Defaults to 10.
+|`sort` |Specifies how to sort the buckets produced.

-“count” specifies document count, “index” sorts by the index (natural) order of the bucket value. One can also sort by any <> that occurs in the bucket. The default is “count desc”. 
This parameter may also be specified in JSON like `sort:{count:desc}`. The sort order may either be “asc” or “desc”
-|overrequest a|
+`count` specifies document count, `index` sorts by the index (natural) order of the bucket value. One can also sort by any <> that occurs in the bucket. The default is `count desc`. This parameter may also be specified in JSON like `sort:{count:desc}`. The sort order may either be “asc” or “desc”
+|`overrequest` a|
 Number of buckets beyond the `limit` to internally request from shards during a distributed search.

 Larger values can increase the accuracy of the final "Top Terms" returned when the individual shards have very diff top terms.

 The default of `-1` causes a hueristic to be applied based on the other options specified.
-|refine |If `true`, turns on distributed facet refining. This uses a second phase to retrieve any buckets needed for the final result from shards that did not include those buckets in their initial internal results, so that every shard contributes to every returned bucket in this facet and any sub-facets. This makes counts & stats for returned buckets exact.
-|overrefine a|
+|`refine` |If `true`, turns on distributed facet refining. This uses a second phase to retrieve any buckets needed for the final result from shards that did not include those buckets in their initial internal results, so that every shard contributes to every returned bucket in this facet and any sub-facets. This makes counts & stats for returned buckets exact.
+|`overrefine` a|
 Number of buckets beyond the `limit` to consider internally during a distributed search when determining which buckets to refine.

 Larger values can increase the accuracy of the final "Top Terms" returned when the individual shards have very diff top terms, and the current `sort` option can result in refinement pushing terms lower down the sorted list (ex: `sort:"count asc"`)

 The default of `-1` causes a hueristic to be applied based on other options specified.
-|mincount |Only return buckets with a count of at least this number. Defaults to 1.
-|missing |A boolean that specifies if a special “missing” bucket should be returned that is defined by documents without a value in the field. Defaults to false.
-|numBuckets |A boolean. If true, adds “numBuckets” to the response, an integer representing the number of buckets for the facet (as opposed to the number of buckets returned). Defaults to false.
-|allBuckets |A boolean. If true, adds an “allBuckets” bucket to the response, representing the union of all of the buckets. For multi-valued fields, this is different than a bucket for all of the documents in the domain since a single document can belong to multiple buckets. Defaults to false.
-|prefix |Only produce buckets for terms starting with the specified prefix.
-|facet |Aggregations, metrics or nested facets that will be calculated for every returned bucket
-|method a|
+|`mincount` |Only return buckets with a count of at least this number. Defaults to `1`.
+|`missing` |A boolean that specifies if a special “missing” bucket should be returned that is defined by documents without a value in the field. Defaults to `false`.
+|`numBuckets` |A boolean. If `true`, adds “numBuckets” to the response, an integer representing the number of buckets for the facet (as opposed to the number of buckets returned). Defaults to `false`.
+|`allBuckets` |A boolean. If `true`, adds an “allBuckets” bucket to the response, representing the union of all of the buckets. For multi-valued fields, this is different than a bucket for all of the documents in the domain since a single document can belong to multiple buckets. Defaults to `false`.
+|`prefix` |Only produce buckets for terms starting with the specified prefix.
+|`facet` |Aggregations, metrics or nested facets that will be calculated for every returned bucket
+|`method` a|
 This parameter indicates the facet algorithm to use:

-* "dv" DocValues, collect into ordinal array
-* "uif" UnInvertedField, collect into ordinal array
-* "dvhash" DocValues, collect into hash - improves efficiency over high cardinality fields
-* "enum" TermsEnum then intersect DocSet (stream-able)
-* "stream" Presently equivalent to "enum" - used for indexed, non-point fields with sort 'index asc' and allBuckets, numBuckets, missing disabled.
-* "smart" Pick the best method for the field type (this is the default)
+* `dv` DocValues, collect into ordinal array
+* `uif` UnInvertedField, collect into ordinal array
+* `dvhash` DocValues, collect into hash - improves efficiency over high cardinality fields
+* `enum` TermsEnum then intersect DocSet (stream-able)
+* `stream` Presently equivalent to `enum`. Used for indexed, non-point fields with sort `index asc` and `allBuckets`, `numBuckets`, and `missing` disabled.
+* `smart` Pick the best method for the field type (this is the default)

-|prelim_sort |An optional parameter for specifying an approximation of the final `sort` to use during initial collection of top buckets when the <>.
+|`prelim_sort` |An optional parameter for specifying an approximation of the final `sort` to use during initial collection of top buckets when the <>.
 |===

 === Query Facet
diff --git a/solr/solr-ref-guide/src/learning-to-rank.adoc b/solr/solr-ref-guide/src/learning-to-rank.adoc
index 4edd554..e411651 100644
--- a/solr/solr-ref-guide/src/learning-to-rank.adoc
+++ b/solr/solr-ref-guide/src/learning-to-rank.adoc
@@ -290,7 +290,7 @@ The output will include the model picked for each search result, resembling the
 }}
 ----

-=== Running a Rerank Query Interleaving a model with the original ranking
+=== Running a Rerank Query Interleaving a Model with the Original Ranking
 When approaching Search Quality Evaluation with interleaving it may be useful to compare a model with the original ranking.
 To rerank the results of a query, interleaving a model with the original ranking, add the `rq` parameter to your search, passing the special inbuilt `_OriginalRanking_` model identifier as one model and your comparison model as the other model, for example:

@@ -329,7 +329,7 @@ The output will include the model picked for each search result, resembling the
 }}
 ----

-=== Running a Rerank Query with Interleaving passing a specific algorithm
+=== Running a Rerank Query with Interleaving Passing a Specific Algorithm
 To rerank the results of a query, interleaving two models using a specific algorithm, add the `interleavingAlgorithm` local parameter to the ltr query parser, for example:

 [source,text]
diff --git a/solr/solr-ref-guide/src/metrics-reporting.adoc b/solr/solr-ref-guide/src/metrics-reporting.adoc
index 5782aac..def9a03 100644
--- a/solr/solr-ref-guide/src/metrics-reporting.adoc
+++ b/solr/solr-ref-guide/src/metrics-reporting.adoc
@@ -97,7 +97,7 @@ The metrics available in your system can be customized by modifying the `> for more information about the `solr.xml` file, where to find it, and how to edit it.

-=== Disabling the metrics collection ===
+=== Disabling the Metrics Collection
 The `` element in `solr.xml` supports one attribute `enabled`, which takes a boolean value, for example ``.
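The terms-facet parameters listed in the json-facet-api.adoc table earlier in this message can be combined into a JSON request body along these lines. This is a minimal sketch, not taken from the commit: the helper names `terms_facet` and `facet_request`, the facet label `categories`, and the field name `cat` are assumptions for illustration. It only builds and prints the body; it does not contact a Solr server.

```python
import json

# Parameter names copied from the terms-facet table in the diff.
TERMS_FACET_PARAMS = {
    "field", "offset", "limit", "sort", "overrequest", "refine",
    "overrefine", "mincount", "missing", "numBuckets", "allBuckets",
    "prefix", "facet", "method", "prelim_sort",
}


def terms_facet(field, **options):
    """Build one terms facet, rejecting parameters the table does not list."""
    unknown = set(options) - TERMS_FACET_PARAMS
    if unknown:
        raise ValueError(f"unknown terms facet parameter(s): {sorted(unknown)}")
    return {"type": "terms", "field": field, **options}


def facet_request(query, **facets):
    """Wrap named facets in a JSON request body."""
    return {"query": query, "facet": facets}


# Hypothetical facet label ("categories") and field name ("cat").
body = facet_request(
    "*:*",
    categories=terms_facet(
        "cat", limit=5, sort={"count": "desc"}, mincount=1, refine=True
    ),
)
print(json.dumps(body, indent=2))
```

The `sort` value uses the JSON form that the table itself mentions (`sort:{count:desc}`); the string form `"count desc"` would work in the same position.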
diff --git a/solr/solr-ref-guide/src/monitoring-solr-with-prometheus-and-grafana.adoc b/solr/solr-ref-guide/src/monitoring-solr-with-prometheus-and-grafana.adoc
index 8661fea..05c844e 100644
--- a/solr/solr-ref-guide/src/monitoring-solr-with-prometheus-and-grafana.adoc
+++ b/solr/solr-ref-guide/src/monitoring-solr-with-prometheus-and-grafana.adoc
@@ -118,7 +118,7 @@ The Solr's metrics exposed by `solr-exporter` can be seen at: `\http://localhost

 === Environment Variable Options

-The bin scripts provided with the Prometheus Exporter support the use of custom java options through the following environment variables:
+The `./bin` scripts provided with the Prometheus Exporter support the use of custom java options through the following environment variables:

 `JAVA_HEAP`::
 Sets the initial (`Xms`) and max (`Xmx`) Java heap size. The default is `512m`.
@@ -133,13 +133,13 @@ Custom Java garbage collection settings. The default is `-XX:+UseG1GC`.
 Extra JVM options.

 `ZK_CREDS_AND_ACLS`::
-Credentials for connecting to a ZK Host that is protected with ACLs.
-For more information on what to include in this variable, refer to the <> or the <<#getting-metrics-from-a-secured-solrcloud,example below>>.
+Credentials for connecting to a ZooKeeper host that is protected with ACLs.
+For more information on what to include in this variable, refer to the section <> or the <>.

 `CLASSPATH_PREFIX`::
 Location of extra libraries to load when starting the `solr-exporter`.

-All <<#command-line-parameters,command line parameters>> are able to be provided via environment variables when using the bin scripts.
+All <<#command-line-parameters,command line parameters>> are able to be provided via environment variables when using the `./bin` scripts.

 === Getting Metrics from a Secured SolrCloud

diff --git a/solr/solr-ref-guide/src/pagination-of-results.adoc b/solr/solr-ref-guide/src/pagination-of-results.adoc
index 491f135..2bc420a 100644
--- a/solr/solr-ref-guide/src/pagination-of-results.adoc
+++ b/solr/solr-ref-guide/src/pagination-of-results.adoc
@@ -16,7 +16,6 @@
 // specific language governing permissions and limitations
 // under the License.

-
 In most search applications, the "top" matching results (sorted by score, or some other criteria) are displayed to some human user.

 In many applications the UI for these sorted results are displayed to the user in "pages" containing a fixed number of matching results, and users don't typically look at results past the first few pages worth of results.
@@ -97,9 +96,15 @@ There are a few important constraints to be aware of when using `cursorMark` par

 . `cursorMark` and `start` are mutually exclusive parameters.
 * Your requests must either not include a `start` parameter, or it must be specified with a value of "```0```".
-. When using the <>, partial results may be returned. If time expires before the search is complete - as indicated when the `responseHeader` includes `"partialResults": true`, some matching documents may have been skipped. Additionally, if `cursorMark` matches `nextCursorMark`, you cannot be sure that there are no more results. In these situation, consider increasing `timeAllowed` and reissuing the query. W [...]
+. When using the <> request parameter, partial results may be returned.
+If time expires before the search is complete, as indicated when the `responseHeader` includes `"partialResults": true`, some matching documents may have been skipped.
+Additionally, if `cursorMark` matches `nextCursorMark`, you cannot be sure that there are no more results.
++
+In this situation, consider increasing `timeAllowed` and reissuing the query.
+When the `responseHeader` no longer includes `"partialResults": true`, and `cursorMark` matches `nextCursorMark`, there are no more results.
 . `sort` clauses must include the uniqueKey field (either `asc` or `desc`).
-* If `id` is your uniqueKey field, then sort parameters like `id asc` and `name asc, id desc` would both work fine, but `name asc` by itself would not
++
+If `id` is your uniqueKey field, then sort parameters like `id asc` and `name asc, id desc` would both work fine, but `name asc` by itself would not
 . Sorts including <> based functions that involve calculations relative to `NOW` will cause confusing results, since every document will get a new sort value on every subsequent request. This can easily result in cursors that never end, and constantly return the same documents over and over – even if the documents are never updated.
 +
 In this situation, choose & re-use a fixed value for the <> in all of your cursor requests.
diff --git a/solr/solr-ref-guide/src/solr-upgrade-notes.adoc b/solr/solr-ref-guide/src/solr-upgrade-notes.adoc
index 1730546..fe08bdf 100644
--- a/solr/solr-ref-guide/src/solr-upgrade-notes.adoc
+++ b/solr/solr-ref-guide/src/solr-upgrade-notes.adoc
@@ -46,7 +46,7 @@ If this assumption is false, Solr will do a cheap check that usually detects the
 throw an exception to alert you of the need to specify the Root ID.
 This backwards incompatible change was done to increase performance and robustness.
 *** This feature no longer requires stored=true or docValues=true on the `\_root_` field. You might
-have it for other purposes though (e.g. for `uniqueBlock(...)`)
+have it for other purposes though (e.g., for `uniqueBlock(...)`)
 *** This feature no longer requires the `\_nest_path_` field, although you probably ought to continue to define it
 as it's useful for other things.

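The `cursorMark` stopping rule from the pagination-of-results.adoc hunk earlier in this message can be sketched as a loop. This is a minimal sketch under stated assumptions, not code from the commit: `fetch_page` is a hypothetical caller-supplied function that sends the query for a given cursor and returns the parsed JSON response as a dict, and `max_retries` is an illustrative safety limit. The sketch treats a matching `nextCursorMark` as the end of results only when the response is not flagged as partial.

```python
def fetch_all(fetch_page, max_retries=3):
    """Collect all documents using cursorMark paging.

    fetch_page(cursor_mark, attempt) is a hypothetical callable supplied by
    the caller. `attempt` starts at 0 and grows on each retry of the same
    cursor, so the caller can raise `timeAllowed` before reissuing.
    """
    docs = []
    cursor = "*"  # initial cursorMark value
    attempt = 0
    while True:
        rsp = fetch_page(cursor, attempt)
        header = rsp.get("responseHeader", {})
        if header.get("partialResults", False):
            # Time ran out: documents may have been skipped, and an
            # unchanged nextCursorMark proves nothing. Reissue the query.
            attempt += 1
            if attempt > max_retries:
                raise RuntimeError("results still partial after retries")
            continue
        attempt = 0
        docs.extend(rsp["response"]["docs"])
        next_cursor = rsp["nextCursorMark"]
        if next_cursor == cursor:
            return docs  # complete response and unchanged cursor: done
        cursor = next_cursor
```

Discarding a partial page and retrying is one possible policy; a caller that can tolerate gaps might instead keep the partial documents and continue.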
diff --git a/solr/solr-ref-guide/src/updating-parts-of-documents.adoc b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
index ed71354..cea01be 100644
--- a/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
+++ b/solr/solr-ref-guide/src/updating-parts-of-documents.adoc
@@ -130,7 +130,7 @@ Solr offers two solutions to address this:
 Furthermore, you _should_ (sometimes _must_) specify the Root document's ID in the `\_root_` field of this partial update.
 This is how Solr understands that you are updating a child document, and not a Root document.
 Without it, Solr only guesses that the `\_route_` param is
-equivalent, but it may be absent or not equivalent (e.g. when using the `implicit` router).
+equivalent, but it may be absent or not equivalent (e.g., when using the `implicit` router).
 All of the examples below use `id` prefixes, so no `\_route_` param will be necessary for these examples.
 ====