lucene-commits mailing list archives

From "Hoss Man (Confluence)" <conflue...@apache.org>
Subject [CONF] Apache Solr Reference Guide > Updating Parts of Documents
Date Wed, 18 Sep 2013 18:48:00 GMT
Space: Apache Solr Reference Guide (https://cwiki.apache.org/confluence/display/solr)
Page: Updating Parts of Documents (https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents)

Change Comment:
---------------------------------------------------------------------
note about combining atomic updates with optimistic concurrency

Edited by Hoss Man:
---------------------------------------------------------------------
Once you have indexed the content you need in your Solr index, you will want to start thinking
about your strategy for dealing with changes to those documents. Solr supports two approaches
to updating documents that have only partially changed. 

The first is _atomic updates_. This approach allows changing one or more fields of a
document without having to re-index the entire document.

The second approach is known as _optimistic concurrency_ or _optimistic locking_. It is a
feature of many NoSQL databases, and allows conditionally updating a document based on its
version. This approach includes semantics and rules for how to deal with version matches or
mismatches.

Atomic Updates and Optimistic Concurrency may be used as independent strategies for managing
changes to documents, or they may be combined: you can use optimistic concurrency to conditionally
apply an atomic update.


h2. Atomic Updates

Solr supports several modifiers that atomically update values of a document. This allows updating
only specific fields, which can help speed indexing processes in an environment where speed
of index additions is critical to the application.

To use atomic updates, add a modifier to the field that needs to be updated. The content can
be replaced, added to, or, if the field is numeric, incremented.

|| Modifier || Usage ||
| set | Set or replace a particular value, or remove the value if 'null' is specified as the new value. |
| add | Adds an additional value to a list. |
| inc | Increments a numeric value by a specific amount. |

{note}All original source fields must be stored for field modifiers to work correctly, which
is the Solr default.{note}

For example:

{code:language=none|borderStyle=solid|borderColor=#666666}
{"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}}
{code}

This example results in field {{f1}} being set to "10", and field {{f2}} having an additional
value of "20" added. All other existing fields from the original document remain unchanged.
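
The effect of the modifiers can be illustrated with a small local simulation. The following Python sketch is not Solr code: {{apply_atomic_update}} is a hypothetical helper that mimics the documented behavior of {{set}}, {{add}}, and {{inc}} on a plain dictionary, and the stored document (including the {{title}} field and the assumption that {{f2}} is multi-valued) is invented for illustration.

```python
import json


def apply_atomic_update(stored, update):
    """Hypothetical local model of the set/add/inc modifiers (not Solr's implementation)."""
    result = dict(stored)  # fields not named in the update are carried over unchanged
    for field, value in update.items():
        if not isinstance(value, dict):
            result[field] = value  # plain value, such as the "id" key
            continue
        for modifier, arg in value.items():
            if modifier == "set":
                if arg is None:
                    result.pop(field, None)  # 'null' removes the value
                else:
                    result[field] = arg
            elif modifier == "add":
                current = result.get(field, [])
                if not isinstance(current, list):
                    current = [current]
                result[field] = current + [arg]  # append to the list of values
            elif modifier == "inc":
                result[field] = result.get(field, 0) + arg
            else:
                raise ValueError("unsupported modifier: " + modifier)
    return result


# Assumed stored document; only "id", "f1", and "f2" come from the example above.
stored = {"id": "mydoc", "f1": 5, "f2": [10], "title": "unchanged"}
update = json.loads('{"id":"mydoc", "f1":{"set":10}, "f2":{"add":20}}')
updated = apply_atomic_update(stored, update)
print(updated)
```

In this model {{f1}} is replaced, {{f2}} gains one more value, and {{title}} is left as it was, which mirrors the behavior described above.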

h2. Optimistic Concurrency

Optimistic concurrency support allows for versioning of documents. By default, Solr's {{schema.xml}}
includes a {{\_version_}} field, and this field is added to each document. In general, using
optimistic concurrency involves the following workflow:

# A client reads a document. In Solr, one might retrieve the document with the {{/get}} handler
to be sure to have the latest version.
# A client changes the document locally.
# The client resubmits the changed document to Solr, for example with the {{/update}}
handler.
# If there is a version conflict (HTTP error code 409), the client starts the process over.
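
The workflow above can be sketched as a retry loop. This is a minimal illustration under stated assumptions, not a Solr client: {{FakeIndex}}, {{VersionConflict}}, and {{update_with_retry}} are hypothetical names, the in-memory index stands in for the {{/get}} and {{/update}} handlers, the exception stands in for an HTTP 409 response, and only the exact-match version case is modeled.

```python
import itertools


class VersionConflict(Exception):
    """Stands in for an HTTP 409 response."""


class FakeIndex:
    """Hypothetical in-memory stand-in for a Solr index; not a Solr client."""

    def __init__(self):
        self._docs = {}
        self._versions = itertools.count(100)  # arbitrary increasing version numbers

    def get(self, doc_id):
        doc = self._docs.get(doc_id)
        return dict(doc) if doc is not None else None

    def update(self, doc):
        current = self._docs.get(doc["id"])
        expected = doc.get("_version_")
        # Only the "version must match" case is modeled here.
        if expected is not None and expected > 1:
            if current is None or current["_version_"] != expected:
                raise VersionConflict(doc["id"])
        stored = dict(doc)
        stored["_version_"] = next(self._versions)
        self._docs[doc["id"]] = stored
        return stored["_version_"]


def update_with_retry(index, doc_id, change, max_attempts=5):
    for _ in range(max_attempts):
        doc = index.get(doc_id)       # step 1: read the latest version
        if doc is None:
            raise KeyError(doc_id)
        change(doc)                   # step 2: change the document locally
        try:
            return index.update(doc)  # step 3: resubmit, including _version_
        except VersionConflict:
            continue                  # step 4: conflict, start over
    raise RuntimeError("gave up after %d attempts" % max_attempts)


def add_view(doc):
    doc["views"] += 1


index = FakeIndex()
index.update({"id": "mydoc", "views": 0})
update_with_retry(index, "mydoc", add_view)
print(index.get("mydoc")["views"])
```

Capping the number of attempts is a design choice of this sketch; a real client would also need to decide how long to wait between retries.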

When the client resubmits a changed document to Solr, the {{\_version_}} can be included with
the update to invoke optimistic concurrency control. Specific semantics are used to define
when the document should be updated or when to report a conflict. 

* If the content in the {{\_version_}} field is greater than '1' (e.g., '12345'), then the
{{\_version_}} in the document must match the {{\_version_}} in the index.
* If the content in the {{\_version_}} field is equal to '1', then the document must simply
exist. In this case, no version matching occurs, but if the document does not exist, the update
is rejected.
* If the content in the {{\_version_}} field is less than '0' (e.g., '-1'), then the document
must *not* exist. In this case, no version matching occurs, but if the document exists, the
update is rejected.
* If the content in the {{\_version_}} field is equal to '0', then it doesn't matter if the
versions match or if the document exists or not. If it exists, it will be overwritten; if
it does not exist, it will be added.
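
The four rules above can be written as a single decision function. The following Python sketch only restates the rules as listed here; {{version_check_passes}} is a hypothetical name, and using {{None}} to mean "the document is not in the index" is an assumption of this example.

```python
def version_check_passes(requested_version, indexed_version):
    """Return True if an update carrying requested_version would be accepted.

    indexed_version is the _version_ currently in the index, or None if the
    document does not exist (an assumption of this sketch).
    """
    exists = indexed_version is not None
    if requested_version > 1:
        # The versions must match exactly.
        return exists and requested_version == indexed_version
    if requested_version == 1:
        # The document must simply exist; no version matching.
        return exists
    if requested_version < 0:
        # The document must not exist.
        return not exists
    # requested_version == 0: no constraint at all.
    return True


print(version_check_passes(12345, 12345))
print(version_check_passes(12345, 67890))
print(version_check_passes(-1, None))
```

A return value of {{False}} corresponds to the case where the update is rejected with a version conflict.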

If the document being updated does not include the {{\_version_}} field, and atomic updates
are not being used, the document will be treated by normal Solr rules, which usually means
the previous version of the document is discarded and replaced.

For more information, please also see [Yonik Seeley's presentation on NoSQL features in Solr
4|https://www.youtube.com/watch?v=WYVM6Wz-XTw] from Apache Lucene EuroCon 2012. 

{scrollbar}

