lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yonik Seeley (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (SOLR-3178) Versioning - optimistic locking
Date Thu, 19 Apr 2012 19:14:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257682#comment-13257682
] 

Yonik Seeley edited comment on SOLR-3178 at 4/19/12 7:14 PM:
-------------------------------------------------------------

bq. it is hard to push too many small partly-done features.

As committers it's the opposite - it's sometimes hard to consume patches that do a lot of
different things.

bq. It works with multi-document update request, providing fully detailed feedback to the
client about which document-updates failed (and why) and which succeeded.

That's great!  But it's also a separate feature that we've needed for a while (and I think
there has been a JIRA issue open for it for a while).

bq. Looking shortly at you patch, I belive, that if two threads in the server JVM overlaps
in a way that is unfortunate enough, then it is possible that data will not be stored or will
be overwritten without an exception being thrown to indicate that to the client.

I'm confident in the synchronization/concurrency part - it's just reusing what was put in
place for SolrCloud to handle reordering of updates to replicas and is very well tested (see
TestRealTimeGet).  But please let us know if you see a problem with it, as that would mean
a problem with SolrCloud today (even without this patch).

bq. Feedback to client is even possible if using ConcurrentSolrServer, because it has been
changed to provide a Future from "request" - a Future that will eventually provide the calling
client with the result from the request.

This sounds like another great feature!  It really deserves it's own issue so people can more
easily provide review/feedback and not have it coupled to this issue.

Some other additions that can be handled later that I see:
- SolrJ support for easier passing of _version_ on a delete, constants for _version_, etc
- Use of "1" as a generic "exists" version (i.e. update document only if it already exists)
- If one document in a batch fails, don't automatically abort, and provide info back about
which docs succeeded and which failed (your first point).

That last one in particular needs some good discussion and design work and really deserves
it's own issue.

Down to the specifics of this patch... it's very non-invasive to core, consisting of the following
code block (once for add, once for delete):
{code}
            if (versionOnUpdate != 0) {
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              long foundVersion = lastVersion == null ? -1 : lastVersion;
              if ( versionOnUpdate == foundVersion || (versionOnUpdate < 0 && foundVersion
< 0) ) {
                // we're ok if versions match, or if both are negative (all missing docs are
equal)
              } else {
                throw new SolrException(ErrorCode.CONFLICT, "version conflict for " + cmd.getPrintableId()
+ " expected=" + versionOnUpdate + " actual=" + foundVersion);
              }
            }
{code}

Having this current improvement committed in no way blocks any future improvements you may
come up with (including deleting the code quoted above and using whatever method you have
come up with for checking the versions), and it even uses the same API (or a subset of it)
via _version_ and 409.  Progress, not perfection!

                
      was (Author: yseeley@gmail.com):
    bq. it is hard to push too many small partly-done features.

As committers it's the opposite - it's sometimes hard to consume patches that do a lot of
different things.

bq. It works with multi-document update request, providing fully detailed feedback to the
client about which document-updates failed (and why) and which succeeded.

That's great!  But it's also a separate feature that we've needed for a while (and I think
there has been a JIRA issue open for it for a while).

bq. Looking shortly at you patch, I belive, that if two threads in the server JVM overlaps
in a way that is unfortunate enough, then it is possible that data will not be stored or will
be overwritten without an exception being thrown to indicate that to the client.

I'm confident in the synchronization/concurrency part - it's just reusing what was put in
place for SolrCloud to handle reordering of updates to replicas and is very well tested (see
TestRealTimeGet).  But please let us know if you see a problem with it, as that would mean
a problem with SolrCloud today (even without this patch).

Some other additions that can be handled later that I see:
- SolrJ support for easier passing of _version_ on a delete, constants for _version_, etc
- Use of "1" as a generic "exists" version (i.e. update document only if it already exists)
- If one document in a batch fails, don't automatically abort, and provide info back about
which docs succeeded and which failed (your first point).

That last one in particular needs some good discussion and design work and really deserves
it's own issue.

Down to the specifics of this patch... it's very non-invasive to core, consisting of the following
code block (once for add, once for delete):
{code}
            if (versionOnUpdate != 0) {
              Long lastVersion = vinfo.lookupVersion(cmd.getIndexedId());
              long foundVersion = lastVersion == null ? -1 : lastVersion;
              if ( versionOnUpdate == foundVersion || (versionOnUpdate < 0 && foundVersion
< 0) ) {
                // we're ok if versions match, or if both are negative (all missing docs are
equal)
              } else {
                throw new SolrException(ErrorCode.CONFLICT, "version conflict for " + cmd.getPrintableId()
+ " expected=" + versionOnUpdate + " actual=" + foundVersion);
              }
            }
{code}

Having this current improvement committed in no way blocks any future improvements you may
come up with (including deleting the code quoted above and using whatever method you have
come up with for checking the versions), and it even uses the same API (or a subset of it)
via _version_ and 409.  Progress, not perfection!


                  
> Versioning - optimistic locking
> -------------------------------
>
>                 Key: SOLR-3178
>                 URL: https://issues.apache.org/jira/browse/SOLR-3178
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, locking, nosql, optimistic, uniqueKey, update, versioning
>             Fix For: 4.0
>
>         Attachments: SOLR-3178.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order increase the ability of Solr to be used as a NoSql database (lots of concurrent
inserts, updates, deletes and queries in the entire lifetime of the index) instead of just
a search index (first: everything indexed (in one thread), after: only queries), I would like
Solr to support versioning to be used for optimistic locking.
> When my intent (see SOLR-3173) is to update an existing document, I will need to provide
a version-number equal to "the version number I got when I fetched the existing document for
update" plus one. If this provided version-number does not correspond to "the newest version-number
of that document at the time of update" plus one, I will get a VersionConflict error. If it
does correspond the document will be updated with the new one, so that "the newest version-number
of that document" is NOW one higher than before the update. Correct but efficient concurrency
handling.
> When my intent (see SOLR-3173) is to insert a new document, the version number provided
will not be used - instead a version-number 0 will be used. According to SOLR-3173 insert
will only succeed if a document with the same value on uniqueKey-field does not already exist.
> In general when talking about different versions of "the same document", of course we
need to be able to identify when a document "is the same" - that, per definition, is when
the values of the uniqueKey-fields are equal. 
> The functionality provided by this issue is only really meaningfull when you run with
"updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3173, and only one single
SVN patch might be given to cover both issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message