lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3884) possible bug in how commits are handled during "recovery" mode on startup?
Date Mon, 24 Sep 2012 23:54:07 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462271#comment-13462271 ]

Yonik Seeley commented on SOLR-3884:
------------------------------------

bq. So the problem may just be that we accept updates before we are ready?

Yes, sounds like it from the diagnosis that Hoss gave.

bq. I think this is somewhat related to SOLR-3861 in that the only reason we see these problems
during tlog REPLAY is because there was no hard commit on shutdown of the first instance.

I was just going to ask this (why we are seeing a tlog recovery in the first place).  We always
used to do a commit on shutdown, and I had to add an explicit test hook to disable this for
TestRecovery (DirectUpdateHandler2.commitOnClose = false;)

bq. but independent of that we still need to think about how to better deal with documents
coming in during RECOVERY

We buffer them to the tlog.  They will get added eventually.
What we should really think about is documents coming in during recovery that aren't from
a leader and we aren't in cloud mode (hence we won't forward to a leader).  Perhaps we should
fail the update?
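
As a rough illustration of the decision described above, here is a minimal sketch. It is not Solr's actual update handler or tlog code: the class RecoveryBufferSketch, the Result enum, and the fromLeader / cloudMode parameters are all hypothetical names, and an in-memory list stands in for the tlog. It only models what could happen to an update that arrives while the core is still recovering, including the "fail the update" option raised as a question above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Solr's actual update handler or tlog implementation.
public class RecoveryBufferSketch {
    enum Result { APPLIED, BUFFERED, FORWARDED, REJECTED }

    private boolean recovering = true;                      // a freshly started core replaying its log
    private final List<String> buffer = new ArrayList<>();  // stands in for the tlog buffer
    private final List<String> index = new ArrayList<>();   // stands in for the index

    /** Decide what to do with one incoming document. */
    public synchronized Result add(String doc, boolean fromLeader, boolean cloudMode) {
        if (!recovering) {
            index.add(doc);          // normal operation: apply directly
            return Result.APPLIED;
        }
        if (fromLeader) {
            buffer.add(doc);         // buffered now, added once recovery finishes
            return Result.BUFFERED;
        }
        if (cloudMode) {
            // A real node would send the update to the leader here;
            // this sketch only reports that decision.
            return Result.FORWARDED;
        }
        // Not from a leader and no leader to forward to:
        // one possible policy (an assumption) is to fail the update.
        return Result.REJECTED;
    }

    /** Apply everything that was buffered and leave recovery. */
    public synchronized void finishRecovery() {
        index.addAll(buffer);
        buffer.clear();
        recovering = false;
    }

    public synchronized List<String> indexed() {
        return new ArrayList<>(index);
    }

    public static void main(String[] args) {
        RecoveryBufferSketch handler = new RecoveryBufferSketch();
        System.out.println(handler.add("doc1", true, true));    // update from a leader
        System.out.println(handler.add("doc2", false, false));  // standalone client update
        handler.finishRecovery();
        System.out.println(handler.indexed());
    }
}
```

Whether rejecting is the right policy for the last case is exactly the open question; the sketch just makes the four cases explicit.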

bq. On IRC miller suggested that perhaps Solr should block and not accept new updates until
REPLAY is done (ideally by not listening on the socket i would think)

Hmmm, yeah, I guess that would work too, but only if we don't advertise the core being in recovery
mode to the cluster.  We *don't* want the leader to be forwarding updates if we're going to
block.
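
A minimal sketch of the blocking alternative, assuming a simple latch that incoming updates wait on until replay finishes. ReplayGateSketch and its methods are hypothetical names, not existing Solr classes, and the sketch does not model the socket-level approach of not listening at all.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not an existing Solr class.
public class ReplayGateSketch {
    private final CountDownLatch replayDone = new CountDownLatch(1);

    /** Called once by whatever thread performs the tlog replay. */
    public void markReplayDone() {
        replayDone.countDown();
    }

    /**
     * Called at the start of update handling. Blocks until replay is done
     * or the timeout expires; returns true if the update may proceed.
     * While callers can block here, the core should not be advertised to the
     * cluster as recovering, or a leader would keep forwarding updates into
     * the blocked node.
     */
    public boolean awaitReplay(long timeoutMillis) {
        try {
            return replayDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ReplayGateSketch gate = new ReplayGateSketch();
        System.out.println(gate.awaitReplay(50)); // replay not finished yet
        gate.markReplayDone();
        System.out.println(gate.awaitReplay(50)); // replay finished
    }
}
```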

                
> possible bug in how commits are handled during "recovery" mode on startup?
> --------------------------------------------------------------------------
>
>                 Key: SOLR-3884
>                 URL: https://issues.apache.org/jira/browse/SOLR-3884
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> while testing out 4.0-rc0, sarowe noted that he was seeing the smoke tester script fail
> while sanity checking the solr example.
> https://mail-archives.apache.org/mod_mbox/lucene-dev/201209.mbox/%3C6C78E97C707B5B4C8CC61D44F87545863ED73E@SUEX10-mbx-03.ad.syr.edu%3E
> I'm not certain, but looking at his logs, i think this suggests a bug in how commits
> are handled when a newly started server is in "recovery" mode

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

