lucene-dev mailing list archives

From "Per Steffensen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3428) SolrCmdDistributor flushAdds/flushDeletes problems
Date Thu, 03 May 2012 22:00:49 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267858#comment-13267858 ]

Per Steffensen commented on SOLR-3428:
--------------------------------------

Well, first of all, ClassicConsistencyHybridUpdateSemanticsSolrCloudTest (in the patch) is a
test that protects the fix from being broken, because it depends on the issues mentioned in
the description above being fixed. So if you commit the entire patch attached to SOLR-3178
you will have the fix for the issues AND a good test protecting it from being broken again
by accident in the future (even though its core focus is not the issues in this SOLR-3428).

One reason you probably never found these issues before now is that it is really hard to
detect whether requests are forwarded (from non-leader to leader, or from leader to replica)
multiple times or not at all, as long as every request results only in idempotent operations
that cannot fail on the server side. That has always been the case for update requests,
until I introduced versioning and unique-key-constraint errors. With those features, update
requests are no longer idempotent and fail-free - an update that succeeds nicely the first
time it is carried out will not succeed if it is carried out a second time (a
DocumentAlreadyExists, DocumentDoesNotExist or VersionConflict error will occur). Because
updates used to be idempotent and fail-free, and because we had no fine-grained (per
document), typed error propagation (before my patch), it would be rather hard to construct a
test (one that is not too artificial, with a lot of mocking to simulate unrealistic
behaviour) demonstrating the issues mentioned in the description above (and protecting the
fix from being broken in the future). So I guess I would rather not try.
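To illustrate the point about detectability: a minimal, hypothetical sketch (these classes and method names are invented for illustration, not Solr's actual code) of why a duplicated forward goes unnoticed under classic overwrite semantics but surfaces immediately once a unique-key constraint makes the update non-idempotent:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Solr's real update path: compares idempotent
// overwrite semantics with a unique-key-constrained add.
class UpdateSketch {
    static final Map<String, String> index = new HashMap<>();

    // Classic semantics: overwrite unconditionally. Delivering the same
    // request twice leaves the index in the same state, so a duplicated
    // leader->replica forward is invisible.
    static void addOverwrite(String id, String doc) {
        index.put(id, doc);
    }

    // With a unique-key constraint, the request is no longer idempotent:
    // the second delivery fails with a DocumentAlreadyExists-style error.
    static void addUnlessExists(String id, String doc) {
        if (index.containsKey(id)) {
            throw new IllegalStateException("DocumentAlreadyExists: " + id);
        }
        index.put(id, doc);
    }

    public static void main(String[] args) {
        addOverwrite("1", "v1");
        addOverwrite("1", "v1");           // duplicate delivery: harmless
        addUnlessExists("2", "v1");
        try {
            addUnlessExists("2", "v1");    // duplicate delivery: now detectable
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```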

Instead, I really encourage you to have that patch committed. It provides well-made features
and fixes. I know it is a big patch covering many issues, and I have learned for the future
to try not to submit such big patches, but basically all the issues are closely related -
versioning and the ability to fail on unique-key-constraint violations (SOLR-3173 and
SOLR-3178) are really not worth much without the ability to report the typed errors back to
the client per document (SOLR-3382), and the bugs in this SOLR-3428 are fairly harmless, and
therefore hard to provoke in a test, as long as everything behind an update request is
idempotent and fail-free (which it is before SOLR-3173, SOLR-3178 and SOLR-3382). I really
think it is worth it for one or two of you committers to spend a day or two understanding
the features and fixes in the patch - it will save you much more than those one or two days
later, if you otherwise have to build the features and fix the bugs yourselves.

So basically I think it would be a waste of (a lot of) time trying to create a poor test
(poor because it would have to be very artificial and unrealistic) for the issues in this
SOLR-3428. IMHO you should put your effort into doing whatever it takes to get the big patch
committed.

Regards, Per Steffensen 
                
> SolrCmdDistributor flushAdds/flushDeletes problems
> --------------------------------------------------
>
>                 Key: SOLR-3428
>                 URL: https://issues.apache.org/jira/browse/SOLR-3428
>             Project: Solr
>          Issue Type: Bug
>          Components: replication (java), SolrCloud, update
>    Affects Versions: 4.0
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: add, delete, replica, solrcloud, update
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A few problems with SolrCmdDistributor.flushAdds/flushDeletes:
> * If the number of AddRequests/DeleteRequests in alist/dlist is below the limit for a
> specific node, the method returns immediately and doesn't flush for subsequent nodes
> * When returning immediately because a given node has fewer requests than the limit,
> previous nodes that have already been flushed/submitted are not removed from the adds/deletes
> maps (causing them to be flushed/submitted again the next time flushAdds/flushDeletes is executed)
> * The idea of just combining params does not work for the SEEN_LEADER param (and probably
> others as well). Since SEEN_LEADER cannot be expressed for individual operations in the
> request (unlike commitWithin and overwrite), you need to send two separate submits: one
> containing requests with SEEN_LEADER=true and one with SEEN_LEADER=false.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

