lucene-commits mailing list archives

From tflo...@apache.org
Subject [09/27] lucene-solr:jira/solr-10233: Ref Guide: add AtomicUpdateRequestProcessorFactory from SOLR-9530
Date Tue, 23 May 2017 00:03:31 GMT
Ref Guide: add AtomicUpdateRequestProcessorFactory from SOLR-9530


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/fa76171a
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/fa76171a
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/fa76171a

Branch: refs/heads/jira/solr-10233
Commit: fa76171a63cca0b42f81213f4cf1284bdaab6b63
Parents: 3392a12
Author: Cassandra Targett <cassandra.targett@lucidworks.com>
Authored: Fri May 19 13:10:24 2017 -0500
Committer: Cassandra Targett <cassandra.targett@lucidworks.com>
Committed: Fri May 19 13:10:51 2017 -0500

----------------------------------------------------------------------
 .../src/update-request-processors.adoc          | 191 +++++++++++--------
 1 file changed, 113 insertions(+), 78 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/fa76171a/solr/solr-ref-guide/src/update-request-processors.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/update-request-processors.adoc b/solr/solr-ref-guide/src/update-request-processors.adoc
index e140116..d61e3d7 100644
--- a/solr/solr-ref-guide/src/update-request-processors.adoc
+++ b/solr/solr-ref-guide/src/update-request-processors.adoc
@@ -2,7 +2,9 @@
 :page-shortname: update-request-processors
 :page-permalink: update-request-processors.html
 
-Every update request received by Solr is run through a chain of plugins known as Update Request
Processors, or __URPs__. This can be useful, for example, to add a field to the document being
indexed; to change the value of a particular field; or to drop an update if the incoming document
doesn't fulfill certain criteria. In fact, a surprisingly large number of features in Solr
are implemented as Update Processors and therefore it is necessary to understand how such
plugins work and where are they configured.
+Every update request received by Solr is run through a chain of plugins known as Update Request
Processors, or _URPs_.
+
+This can be useful, for example, to add a field to the document being indexed; to change
the value of a particular field; or to drop an update if the incoming document doesn't fulfill
certain criteria. In fact, a surprisingly large number of features in Solr are implemented
as Update Processors and therefore it is necessary to understand how such plugins work and
where are they configured.
 
 [[UpdateRequestProcessors-AnatomyandLifecycle]]
 == Anatomy and Lifecycle
@@ -11,19 +13,14 @@ An Update Request Processor is created as part of a {solr-javadocs}/solr-core/or
 
 The easiest way to describe an Update Request Processor is to look at the Javadocs of the
abstract class {solr-javadocs}//solr-core/org/apache/solr/update/processor/UpdateRequestProcessor.html[UpdateRequestProcessor].
Every UpdateRequestProcessor must have a corresponding factory class which extends {solr-javadocs}/solr-core/org/apache/solr/update/processor/UpdateRequestProcessorFactory.html[UpdateRequestProcessorFactory].
This factory class is used by Solr to create a new instance of this plugin. Such a design
provides two benefits:
 
-1.  An update request processor need not be thread safe because it is used by one and only
one request thread and destroyed once the request is complete.
-2.  The factory class can accept configuration parameters and maintain any state that may
be required between requests. The factory class must be thread-safe.
+. An update request processor need not be thread safe because it is used by one and only
one request thread and destroyed once the request is complete.
+. The factory class can accept configuration parameters and maintain any state that may be
required between requests. The factory class must be thread-safe.
 
 Every update request processor chain is constructed during loading of a Solr core and cached
until the core is unloaded. Each `UpdateRequestProcessorFactory` specified in the chain is
also instantiated and initialized with configuration that may have been specified in `solrconfig.xml`.
 
 When an update request is received by Solr, it looks up the update chain to be used for this
request. A new instance of each UpdateRequestProcessor specified in the chain is created using
the corresponding factory. The update request is parsed into corresponding {solr-javadocs}/solr-core/org/apache/solr/update/UpdateCommand.html[UpdateCommand]
objects which are run through the chain. Each UpdateRequestProcessor instance is responsible
for invoking the next plugin in the chain. It can choose to short circuit the chain by not
invoking the next processor and even abort further processing by throwing an exception.
 
-[NOTE]
-====
-
-A single update request may contain a batch of multiple new documents or deletes and therefore
the corresponding processXXX methods of an UpdateRequestProcessor will be invoked multiple
times for every individual update. However, it is guaranteed that a single thread will serially
invoke these methods.
-
-====
+NOTE: A single update request may contain a batch of multiple new documents or deletes and
therefore the corresponding processXXX methods of an UpdateRequestProcessor will be invoked
multiple times for every individual update. However, it is guaranteed that a single thread
will serially invoke these methods.
 
 [[UpdateRequestProcessors-Configuration]]
 == Configuration
@@ -48,7 +45,7 @@ Each of these perform an essential function and as such any custom chain usually
 
 The following example demonstrates how a custom chain can be configured inside `solrconfig.xml`.
 
-.updateRequestProcessorChain
+.Example dedupe updateRequestProcessorChain
 [source,xml]
 ----
 <updateRequestProcessorChain name="dedupe">
@@ -69,9 +66,7 @@ In the above example, a new update processor chain named "dedupe" is created wit
 .RunUpdateProcessorFactory
 [WARNING]
 ====
-
 Do not forget to add `RunUpdateProcessorFactory` at the end of any chains you define in `solrconfig.xml`.
Otherwise update requests processed by that chain will not actually affect the indexed data.
-
 ====
 
 [[UpdateRequestProcessors-ConfiguringIndividualProcessorsasTop-LevelPlugins]]
@@ -113,14 +108,14 @@ In SolrCloud mode, all processors in the chain _before_ the `DistributedUpdatePr
 
 For example, consider the "dedupe" chain which we saw in a section above. Assume that a 3-node
SolrCloud cluster exists where node A hosts the leader of shard1, node B hosts the leader
of shard2 and node C hosts the replica of shard2. Assume that an update request is sent to
node A which forwards the update to node B (because the update belongs to shard2) which then
distributes the update to its replica node C. Let's see what happens at each node:
 
-* **Node A**: Runs the update through the `SignatureUpdateProcessor` (which computes the
signature and puts it in the "id" field), then `LogUpdateProcessor` and then `DistributedUpdateProcessor`.
This processor determines that the update actually belongs to node B and is forwarded to node
B. The update is not processed further. This is required because the next processor, `RunUpdateProcessor`,
will execute the update against the local shard1 index which would lead to duplicate data
on shard1 and shard2.
-* **Node B**: Receives the update and sees that it was forwarded by another node. The update
is directly sent to `DistributedUpdateProcessor` because it has already been through the `SignatureUpdateProcessor`
on node A and doing the same signature computation again would be redundant. The `DistributedUpdateProcessor`
determines that the update indeed belongs to this node, distributes it to its replica on Node
C and then forwards the update further in the chain to `RunUpdateProcessor`.
-* **Node C**: Receives the update and sees that it was distributed by its leader. The update
is directly sent to `DistributedUpdateProcessor` which performs some consistency checks and
forwards the update further in the chain to `RunUpdateProcessor`.
+* *Node A*: Runs the update through the `SignatureUpdateProcessor` (which computes the signature
and puts it in the "id" field), then `LogUpdateProcessor` and then `DistributedUpdateProcessor`.
This processor determines that the update actually belongs to node B and is forwarded to node
B. The update is not processed further. This is required because the next processor, `RunUpdateProcessor`,
will execute the update against the local shard1 index which would lead to duplicate data
on shard1 and shard2.
+* *Node B*: Receives the update and sees that it was forwarded by another node. The update
is directly sent to `DistributedUpdateProcessor` because it has already been through the `SignatureUpdateProcessor`
on node A and doing the same signature computation again would be redundant. The `DistributedUpdateProcessor`
determines that the update indeed belongs to this node, distributes it to its replica on Node
C and then forwards the update further in the chain to `RunUpdateProcessor`.
+* *Node C*: Receives the update and sees that it was distributed by its leader. The update
is directly sent to `DistributedUpdateProcessor` which performs some consistency checks and
forwards the update further in the chain to `RunUpdateProcessor`.
 
 In summary:
 
-1.  All processors before `DistributedUpdateProcessor` are only run on the first node that
receives an update request whether it be a forwarding node (e.g., node A in the above example)
or a leader (e.g., node B). We call these "pre-processors" or just "processors".
-2.  All processors after `DistributedUpdateProcessor` run only on the leader and the replica
nodes. They are not executed on forwarding nodes. Such processors are called "post-processors".
+. All processors before `DistributedUpdateProcessor` are only run on the first node that
receives an update request whether it be a forwarding node (e.g., node A in the above example)
or a leader (e.g., node B). We call these "pre-processors" or just "processors".
+. All processors after `DistributedUpdateProcessor` run only on the leader and the replica
nodes. They are not executed on forwarding nodes. Such processors are called "post-processors".
 
 In the previous section, we saw that the `updateRequestProcessorChain` was configured with
`processor="remove_blanks, signature"`. This means that such processors are of the #1 kind
and are run only on the forwarding nodes. Similarly, we can configure them as the #2 kind
by specifying with the attribute "post-processor" as follows:
 
@@ -134,20 +129,17 @@ In the previous section, we saw that the `updateRequestProcessorChain` was confi
 
 However executing a processor only on the forwarding nodes is a great way of distributing
an expensive computation such as de-duplication across a SolrCloud cluster by sending requests
randomly via a load balancer. Otherwise the expensive computation is repeated on both the
leader and replica nodes.
 
+// TODO 6.6 I think this can be removed after SOLR-9530 -CT
 .Pre-processors and Atomic Updates
 [WARNING]
 ====
-
 Because `DistributedUpdateProcessor` is responsible for processing <<updating-parts-of-documents.adoc#updating-parts-of-documents,Atomic
Updates>> into full documents on the leader node, this means that pre-processors which
are executed only on the forwarding nodes can only operate on the partial document. If you
have a processor which must process a full document then the only choice is to specify it
as a post-processor.
-
 ====
 
 .Custom update chain post-processors may never be invoked on a recovering replica
 [WARNING]
 ====
-
-While a replica is in <<read-and-write-side-fault-tolerance.adoc#ReadandWriteSideFaultTolerance-WriteSideFaultTolerance,recovery>>,
inbound update requests are buffered to the transaction log. After recovery has completed
successfully, those buffered update requests are replayed. As of this writing, however, custom
update chain post-processors are never invoked for buffered update requests. See https://issues.apache.org/jira/browse/SOLR-8030[SOLR-8030].
To work around this problem until SOLR-8030 has been fixed, **avoid specifying post-processors
in custom update chains**.
-
+While a replica is in <<read-and-write-side-fault-tolerance.adoc#ReadandWriteSideFaultTolerance-WriteSideFaultTolerance,recovery>>,
inbound update requests are buffered to the transaction log. After recovery has completed
successfully, those buffered update requests are replayed. As of this writing, however, custom
update chain post-processors are never invoked for buffered update requests. See https://issues.apache.org/jira/browse/SOLR-8030[SOLR-8030].
To work around this problem until SOLR-8030 has been fixed, *avoid specifying post-processors
in custom update chains*.
 ====
 
 [[UpdateRequestProcessors-UsingCustomChains]]
@@ -158,7 +150,7 @@ While a replica is in <<read-and-write-side-fault-tolerance.adoc#ReadandWriteSid
 
 The `update.chain` parameter can be used in any update request to choose a custom chain which
has been configured in `solrconfig.xml`. For example, in order to choose the "dedupe" chain
described in a previous section, one can issue the following request:
 
-.update.chain
+.Using update.chain
 [source,bash]
 ----
 curl "http://localhost:8983/solr/gettingstarted/update/json?update.chain=dedupe&commit=true" -H 'Content-type: application/json' -d '
@@ -182,12 +174,12 @@ The above should dedupe the two identical documents and index only one of them.
 [[UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]]
 === Processor & Post-Processor Request Parameters
 
-We can dynamically construct a custom update request processor chain using the "processor"
and "post-processor" request parameters. Multiple processors can be specified as a comma-separated
value for these two parameters. For example:
+We can dynamically construct a custom update request processor chain using the `processor`
and `post-processor` request parameters. Multiple processors can be specified as a comma-separated
value for these two parameters. For example:
 
-.Constructing a chain at request time
+.Executing processors configured in solrconfig.xml as (pre)-processors
 [source,bash]
 ----
-# Executing processors configured in solrconfig.xml as (pre)-processors
+
 curl "http://localhost:8983/solr/gettingstarted/update/json?processor=remove_blanks,signature&commit=true" -H 'Content-type: application/json' -d '
 [
   {
@@ -202,8 +194,11 @@ curl "http://localhost:8983/solr/gettingstarted/update/json?processor=remove_bla
 
   }
 ]'
- 
-# Executing processors configured in solrconfig.xml as pre- and post-processors
+----
+
+.Executing processors configured in solrconfig.xml as pre- and post-processors
+[source,bash]
+----
 curl "http://localhost:8983/solr/gettingstarted/update/json?processor=remove_blanks&post-processor=signature&commit=true" -H 'Content-type: application/json' -d '
 [
   {
@@ -230,7 +225,7 @@ This can be done by adding either "update.chain" or "processor" and "post-proces
 
 The following is an `initParam` defined in the <<schemaless-mode.adoc#schemaless-mode,schemaless
configuration>> which applies a custom update chain to all request handlers starting
with "/update/".
 
-.InitParams
+.Example initParams
 [source,xml]
 ----
 <initParams path="/update/**">
@@ -242,12 +237,10 @@ The following is an `initParam` defined in the <<schemaless-mode.adoc#schemaless
 
 Alternately, one can achieve a similar effect using the "defaults" as shown in the example
below:
 
-.defaults
+.Example defaults
 [source,xml]
 ----
-<requestHandler name="/update/extract"
-                startup="lazy"
-                class="solr.extraction.ExtractingRequestHandler" >
+<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" >
   <lst name="defaults">
     <str name="update.chain">add-unknown-fields-to-the-schema</str>
   </lst>
@@ -262,69 +255,111 @@ What follows are brief descriptions of the currently available update request pr
 [[UpdateRequestProcessors-GeneralUseUpdateProcessorFactories]]
 === General Use UpdateProcessorFactories
 
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/AddSchemaFieldsUpdateProcessorFactory.html[AddSchemaFieldsUpdateProcessorFactory]:
This processor will dynamically add fields to the schema if an input document contains one
or more fields that don't match any field or dynamic field in the schema.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ClassificationUpdateProcessorFactory.html[ClassificationUpdateProcessorFactory]:
This processor uses Lucene's classification module to provide simple document classification.
See https://wiki.apache.org/solr/SolrClassification for more details on how to use this processor.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html[CloneFieldUpdateProcessorFactory]:
Clones the values found in any matching _source_ field into the configured _dest_ field.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DefaultValueUpdateProcessorFactory.html[DefaultValueUpdateProcessorFactory]:
A simple processor that adds a default value to any document which does not already have a
value in fieldName.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.html[DocBasedVersionConstraintsProcessorFactory]:
This Factory generates an UpdateProcessor that helps to enforce version constraints on documents
based on per-document version numbers using a configured name of a versionField.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.html[DocExpirationUpdateProcessorFactory]:
Update Processor Factory for managing automatic "expiration" of documents.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldNameMutatingUpdateProcessorFactory.html[FieldNameMutatingUpdateProcessorFactory]:
Modifies field names by replacing all matches to the configured `pattern` with the configured
`replacement`.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreCommitOptimizeUpdateProcessorFactory.html[IgnoreCommitOptimizeUpdateProcessorFactory]:
Allows you to ignore commit and/or optimize requests from client applications when running
in SolrCloud mode, for more information, see: Shards and Indexing Data in SolrCloud
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexpBoostProcessorFactory.html[RegexpBoostProcessorFactory]:
A processor which will match content of "inputField" against regular expressions found in
"boostFilename", and if it matches will return the corresponding boost value from the file
and output this to "boostField" as a double value.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/SignatureUpdateProcessorFactory.html[SignatureUpdateProcessorFactory]:
Uses a defined set of fields to generate a hash "signature" for the document. Useful for only
indexing one copy of "similar" documents.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html[StatelessScriptUpdateProcessorFactory]:
An update request processor factory that enables the use of update processors implemented
as scripts.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TimestampUpdateProcessorFactory.html[TimestampUpdateProcessorFactory]:
An update processor that adds a newly generated date value of "NOW" to any document being
added that does not already have a value in the specified field.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/URLClassifyProcessorFactory.html[URLClassifyProcessorFactory]:
Update processor which examines a URL and outputs to various other fields with characteristics
of that URL, including length, number of path levels, whether it is a top level URL (levels==0),
whether it looks like a landing/index page, a canonical representation of the URL (e.g., stripping
index.html), the domain and path parts of the URL, etc.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html[UUIDUpdateProcessorFactory]:
An update processor that adds a newly generated UUID value to any document being added that
does not already have a value in the specified field.
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/AddSchemaFieldsUpdateProcessorFactory.html[AddSchemaFieldsUpdateProcessorFactory]::
This processor will dynamically add fields to the schema if an input document contains one
or more fields that don't match any field or dynamic field in the schema.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/AtomicUpdateRequestProcessorFactory.html[AtomicUpdateProcessorFactory]::
This processor will convert conventional field-value documents to atomic update documents.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ClassificationUpdateProcessorFactory.html[ClassificationUpdateProcessorFactory]::
This processor uses Lucene's classification module to provide simple document classification.
See https://wiki.apache.org/solr/SolrClassification for more details on how to use this processor.
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html[CloneFieldUpdateProcessorFactory]::
Clones the values found in any matching _source_ field into the configured _dest_ field.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/DefaultValueUpdateProcessorFactory.html[DefaultValueUpdateProcessorFactory]::
A simple processor that adds a default value to any document which does not already have a
value in fieldName.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/DocBasedVersionConstraintsProcessorFactory.html[DocBasedVersionConstraintsProcessorFactory]::
This Factory generates an UpdateProcessor that helps to enforce version constraints on documents
based on per-document version numbers using a configured name of a versionField.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/DocExpirationUpdateProcessorFactory.html[DocExpirationUpdateProcessorFactory]::
Update Processor Factory for managing automatic "expiration" of documents.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldNameMutatingUpdateProcessorFactory.html[FieldNameMutatingUpdateProcessorFactory]::
Modifies field names by replacing all matches to the configured `pattern` with the configured
`replacement`.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreCommitOptimizeUpdateProcessorFactory.html[IgnoreCommitOptimizeUpdateProcessorFactory]::
Allows you to ignore commit and/or optimize requests from client applications when running
in SolrCloud mode, for more information, see: Shards and Indexing Data in SolrCloud
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexpBoostProcessorFactory.html[RegexpBoostProcessorFactory]::
A processor which will match content of "inputField" against regular expressions found in
"boostFilename", and if it matches will return the corresponding boost value from the file
and output this to "boostField" as a double value.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/SignatureUpdateProcessorFactory.html[SignatureUpdateProcessorFactory]::
Uses a defined set of fields to generate a hash "signature" for the document. Useful for only
indexing one copy of "similar" documents.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html[StatelessScriptUpdateProcessorFactory]::
An update request processor factory that enables the use of update processors implemented
as scripts.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/TimestampUpdateProcessorFactory.html[TimestampUpdateProcessorFactory]::
An update processor that adds a newly generated date value of "NOW" to any document being
added that does not already have a value in the specified field.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/URLClassifyProcessorFactory.html[URLClassifyProcessorFactory]::
Update processor which examines a URL and outputs to various other fields with characteristics
of that URL, including length, number of path levels, whether it is a top level URL (levels==0),
whether it looks like a landing/index page, a canonical representation of the URL (e.g., stripping
index.html), the domain and path parts of the URL, etc.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html[UUIDUpdateProcessorFactory]::
An update processor that adds a newly generated UUID value to any document being added that
does not already have a value in the specified field.
 
 [[UpdateRequestProcessors-FieldMutatingUpdateProcessorFactoryDerivedFactories]]
 === FieldMutatingUpdateProcessorFactory Derived Factories
 
 These factories all provide functionality to _modify_ fields in a document as they're being
indexed. When using any of these factories, please consult the {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html[FieldMutatingUpdateProcessorFactory
javadocs] for details on the common options they all support for configuring which fields
are modified.
 
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html[ConcatFieldUpdateProcessorFactory]:
Concatenates multiple values for fields matching the specified conditions using a configurable
delimiter.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/CountFieldValuesUpdateProcessorFactory.html[CountFieldValuesUpdateProcessorFactory]:
Replaces any list of values for a field matching the specified conditions with the the count
of the number of values for that field.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html[FieldLengthUpdateProcessorFactory]:
Replaces any CharSequence values found in fields matching the specified conditions with the
lengths of those CharSequences (as an Integer).
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/FirstFieldValueUpdateProcessorFactory.html[FirstFieldValueUpdateProcessorFactory]:
Keeps only the first value of fields matching the specified conditions.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html[HTMLStripFieldUpdateProcessorFactory]:
Strips all HTML Markup in any CharSequence values found in fields matching the specified conditions.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html[IgnoreFieldUpdateProcessorFactory]:
Ignores and removes fields matching the specified conditions from any document being added
to the index.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/LastFieldValueUpdateProcessorFactory.html[LastFieldValueUpdateProcessorFactory]:
Keeps only the last value of fields matching the specified conditions.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/MaxFieldValueUpdateProcessorFactory.html[MaxFieldValueUpdateProcessorFactory]:
An update processor that keeps only the the maximum value from any selected fields where multiple
values are found.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/MinFieldValueUpdateProcessorFactory.html[MinFieldValueUpdateProcessorFactory]:
An update processor that keeps only the the minimum value from any selected fields where multiple
values are found.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseBooleanFieldUpdateProcessorFactory.html[ParseBooleanFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Boolean values.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html[ParseDateFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Solr date
values.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseNumericFieldUpdateProcessorFactory.html[ParseNumericFieldUpdateProcessorFactory]
derived classes:
-** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDoubleFieldUpdateProcessorFactory.html[ParseDoubleFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Double values.
-** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseFloatFieldUpdateProcessorFactory.html[ParseFloatFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Float values.
-** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseIntFieldUpdateProcessorFactory.html[ParseIntFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Integer values.
-** {solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseLongFieldUpdateProcessorFactory.html[ParseLongFieldUpdateProcessorFactory]:
Attempts to mutate selected fields that have only CharSequence-typed values into Long values.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/PreAnalyzedUpdateProcessorFactory.html[PreAnalyzedUpdateProcessorFactory]:
An update processor that parses configured fields of any document being added using _PreAnalyzedField_
with the configured format parser.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html[RegexReplaceProcessorFactory]:
An updated processor that applies a configured regex to any CharSequence values found in the
selected fields, and replaces any matches with the configured replacement string.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html[RemoveBlankFieldUpdateProcessorFactory]:
Removes any values found which are CharSequence with a length of 0. (ie: empty strings).
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TrimFieldUpdateProcessorFactory.html[TrimFieldUpdateProcessorFactory]:
Trims leading and trailing whitespace from any CharSequence values found in fields matching
the specified conditions.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/TruncateFieldUpdateProcessorFactory.html[TruncateFieldUpdateProcessorFactory]:
Truncates any CharSequence values found in fields matching the specified conditions to a maximum
character length.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/UniqFieldsUpdateProcessorFactory.html[UniqFieldsUpdateProcessorFactory]:
Removes duplicate values found in fields matching the specified conditions.
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html[ConcatFieldUpdateProcessorFactory]::
Concatenates multiple values for fields matching the specified conditions using a configurable
delimiter.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/CountFieldValuesUpdateProcessorFactory.html[CountFieldValuesUpdateProcessorFactory]::
Replaces any list of values for a field matching the specified conditions with the the count
of the number of values for that field.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/FieldLengthUpdateProcessorFactory.html[FieldLengthUpdateProcessorFactory]::
Replaces any CharSequence values found in fields matching the specified conditions with the
lengths of those CharSequences (as an Integer).
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/FirstFieldValueUpdateProcessorFactory.html[FirstFieldValueUpdateProcessorFactory]::
Keeps only the first value of fields matching the specified conditions.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/HTMLStripFieldUpdateProcessorFactory.html[HTMLStripFieldUpdateProcessorFactory]::
Strips all HTML Markup in any CharSequence values found in fields matching the specified conditions.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/IgnoreFieldUpdateProcessorFactory.html[IgnoreFieldUpdateProcessorFactory]::
Ignores and removes fields matching the specified conditions from any document being added
to the index.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/LastFieldValueUpdateProcessorFactory.html[LastFieldValueUpdateProcessorFactory]::
Keeps only the last value of fields matching the specified conditions.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/MaxFieldValueUpdateProcessorFactory.html[MaxFieldValueUpdateProcessorFactory]::
An update processor that keeps only the maximum value from any selected fields where multiple
values are found.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/MinFieldValueUpdateProcessorFactory.html[MinFieldValueUpdateProcessorFactory]::
An update processor that keeps only the minimum value from any selected fields where multiple
values are found.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseBooleanFieldUpdateProcessorFactory.html[ParseBooleanFieldUpdateProcessorFactory]::
Attempts to mutate selected fields that have only CharSequence-typed values into Boolean values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html[ParseDateFieldUpdateProcessorFactory]::
Attempts to mutate selected fields that have only CharSequence-typed values into Solr date
values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseNumericFieldUpdateProcessorFactory.html[ParseNumericFieldUpdateProcessorFactory]
derived classes::
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseDoubleFieldUpdateProcessorFactory.html[ParseDoubleFieldUpdateProcessorFactory]:::
Attempts to mutate selected fields that have only CharSequence-typed values into Double values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseFloatFieldUpdateProcessorFactory.html[ParseFloatFieldUpdateProcessorFactory]:::
Attempts to mutate selected fields that have only CharSequence-typed values into Float values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseIntFieldUpdateProcessorFactory.html[ParseIntFieldUpdateProcessorFactory]:::
Attempts to mutate selected fields that have only CharSequence-typed values into Integer values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/ParseLongFieldUpdateProcessorFactory.html[ParseLongFieldUpdateProcessorFactory]:::
Attempts to mutate selected fields that have only CharSequence-typed values into Long values.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/PreAnalyzedUpdateProcessorFactory.html[PreAnalyzedUpdateProcessorFactory]::
An update processor that parses configured fields of any document being added using _PreAnalyzedField_
with the configured format parser.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html[RegexReplaceProcessorFactory]::
An update processor that applies a configured regex to any CharSequence values found in the
selected fields, and replaces any matches with the configured replacement string.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/RemoveBlankFieldUpdateProcessorFactory.html[RemoveBlankFieldUpdateProcessorFactory]::
Removes any values found which are CharSequence with a length of 0 (i.e., empty strings).
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/TrimFieldUpdateProcessorFactory.html[TrimFieldUpdateProcessorFactory]::
Trims leading and trailing whitespace from any CharSequence values found in fields matching
the specified conditions.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/TruncateFieldUpdateProcessorFactory.html[TruncateFieldUpdateProcessorFactory]::
Truncates any CharSequence values found in fields matching the specified conditions to a maximum
character length.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/UniqFieldsUpdateProcessorFactory.html[UniqFieldsUpdateProcessorFactory]::
Removes duplicate values found in fields matching the specified conditions.
 
 [[UpdateRequestProcessors-UpdateProcessorFactoriesThatCanBeLoadedasPlugins]]
 === Update Processor Factories That Can Be Loaded as Plugins
 
 These processors are included in Solr releases as "contribs", and require additional jars
loaded at runtime. See the README files associated with each contrib for details:
 
-* The {solr-javadocs}/solr-langid/index.html[`langid`] contrib provides:
-** {solr-javadocs}/solr-langid/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactory.html[LangDetectLanguageIdentifierUpdateProcessorFactory]:
Identifies the language of a set of input fields using http://code.google.com/p/language-detection.
-** {solr-javadocs}/solr-langid/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactory.html[TikaLanguageIdentifierUpdateProcessorFactory]:
Identifies the language of a set of input fields using Tika's LanguageIdentifier.
-* The {solr-javadocs}/solr-uima/index.html[`uima`] contrib provides:
-** {solr-javadocs}/solr-uima/org/apache/solr/uima/processor/UIMAUpdateRequestProcessorFactory.html[UIMAUpdateRequestProcessorFactory]:
Update document(s) to be indexed with UIMA extracted information.
+The {solr-javadocs}/solr-langid/index.html[`langid`] contrib provides::
+
+{solr-javadocs}/solr-langid/org/apache/solr/update/processor/LangDetectLanguageIdentifierUpdateProcessorFactory.html[LangDetectLanguageIdentifierUpdateProcessorFactory]:::
Identifies the language of a set of input fields using http://code.google.com/p/language-detection.
+
+{solr-javadocs}/solr-langid/org/apache/solr/update/processor/TikaLanguageIdentifierUpdateProcessorFactory.html[TikaLanguageIdentifierUpdateProcessorFactory]:::
Identifies the language of a set of input fields using Tika's LanguageIdentifier.
+
+The {solr-javadocs}/solr-uima/index.html[`uima`] contrib provides::
+
+{solr-javadocs}/solr-uima/org/apache/solr/uima/processor/UIMAUpdateRequestProcessorFactory.html[UIMAUpdateRequestProcessorFactory]:::
Updates document(s) to be indexed with UIMA-extracted information.
 
 [[UpdateRequestProcessors-UpdateProcessorFactoriesYouShouldNotModifyorRemove]]
 === Update Processor Factories You Should _Not_ Modify or Remove
 
 These are listed for completeness, but are part of the Solr infrastructure, particularly
SolrCloud. Other than ensuring you do _not_ remove them when modifying the update request
handlers (or any copies you make), you will rarely, if ever, need to change these.
 
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/DistributedUpdateProcessorFactory.html[DistributedUpdateProcessorFactory]:
Used to distribute updates to all necessary nodes.
-** {solr-javadocs}/solr-core/org/apache/solr/update/processor/NoOpDistributingUpdateProcessorFactory.html[NoOpDistributingUpdateProcessorFactory]:
An alternative No-Op implementation of `DistributingUpdateProcessorFactory` that always returns
null. Designed for experts who want to bypass distributed updates and use their own custom
update logic.
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/LogUpdateProcessorFactory.html[LogUpdateProcessorFactory]:
A logging processor. This keeps track of all commands that have passed through the chain and
prints them on finish().
-* {solr-javadocs}/solr-core/org/apache/solr/update/processor/RunUpdateProcessorFactory.html[RunUpdateProcessorFactory]:
Executes the update commands using the underlying UpdateHandler. Almost all processor chains
should end with an instance of `RunUpdateProcessorFactory` unless the user is explicitly executing
the update commands in an alternative custom `UpdateRequestProcessorFactory`.
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/DistributedUpdateProcessorFactory.html[DistributedUpdateProcessorFactory]::
Used to distribute updates to all necessary nodes.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/NoOpDistributingUpdateProcessorFactory.html[NoOpDistributingUpdateProcessorFactory]:::
An alternative No-Op implementation of `DistributingUpdateProcessorFactory` that always returns
null. Designed for experts who want to bypass distributed updates and use their own custom
update logic.
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/LogUpdateProcessorFactory.html[LogUpdateProcessorFactory]::
A logging processor. This keeps track of all commands that have passed through the chain and
prints them on finish().
+
+{solr-javadocs}/solr-core/org/apache/solr/update/processor/RunUpdateProcessorFactory.html[RunUpdateProcessorFactory]::
Executes the update commands using the underlying UpdateHandler. Almost all processor chains
should end with an instance of `RunUpdateProcessorFactory` unless the user is explicitly executing
the update commands in an alternative custom `UpdateRequestProcessorFactory`.
 
 [[UpdateRequestProcessors-UpdateProcessorsThatCanBeUsedatRuntime]]
 === Update Processors That Can Be Used at Runtime

