Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@lucene.apache.org
Date: Fri, 6 Jan 2017 04:16:58 +0000 (UTC)
From: "Cao Manh Dat (JIRA)" <jira@apache.org>
To: dev@lucene.apache.org
Message-ID: <JIRA.13031898.1483520000000.661012.1483676218356@Atlassian.JIRA>
In-Reply-To: <JIRA.13031898.1483520000000@Atlassian.JIRA>
References: <JIRA.13031898.1483520000000@Atlassian.JIRA> <JIRA.13031898.1483520000285@arcas>
Subject: [jira] [Comment Edited] (SOLR-9922) Write buffering updates to
 another tlog
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 06 Jan 2017 04:17:01 -0000


    [ https://issues.apache.org/jira/browse/SOLR-9922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803485#comment-15803485 ] 

Cao Manh Dat edited comment on SOLR-9922 at 1/6/17 4:16 AM:
------------------------------------------------------------

In the current code, FLAG_GAP is used in RecoveryStrategy: we first check whether lastOperation has FLAG_GAP. If it does, we know the buffered updates were not applied (because the node failed during buffering), so we skip peersync and go directly to replication.

In my patch, I detect this situation by checking whether any old buffer log exists. So I'm worried about the case where lastOperation has FLAG_GAP when users restart the whole cluster with the new code. Instead of going to replication, the new code will go to peersync.
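
To make the upgrade concern concrete, here is a minimal sketch of the two decision rules as described in this comment. It is not the actual RecoveryStrategy code: the class, enum, method, and parameter names are all hypothetical, and the two booleans stand in for "lastOperation has FLAG_GAP" and "an old buffer log exists".

```java
// Hypothetical model of the recovery decision; not Solr's real API.
final class RecoveryDecisionSketch {
    enum Action { PEER_SYNC, REPLICATION }

    // Current code, as described above: FLAG_GAP on the last operation
    // means buffered updates were never applied, so skip peersync.
    static Action currentCode(boolean lastOperationHasFlagGap) {
        return lastOperationHasFlagGap ? Action.REPLICATION : Action.PEER_SYNC;
    }

    // Patched code, as described above: only a leftover buffer log
    // signals that the node failed during buffering.
    static Action patchedCode(boolean oldBufferLogExists) {
        return oldBufferLogExists ? Action.REPLICATION : Action.PEER_SYNC;
    }

    public static void main(String[] args) {
        // Upgrade scenario: the tlog was written by the old code and ends
        // with FLAG_GAP, but no separate buffer log was ever created.
        System.out.println(currentCode(true));   // prints REPLICATION
        System.out.println(patchedCode(false));  // prints PEER_SYNC
    }
}
```

In this model the same on-disk state leads to replication under the current code and to peersync under the patched code, which is the mismatch described above.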


was (Author: caomanhdat):
In the current code, FLAG_GAP is used in RecoveryStrategy: we first check whether lastOperation has FLAG_GAP. If it does, we know the buffered updates were not applied (because the node failed during buffering), so we skip peersync and go directly to replication.

In my patch, I detect this situation by checking whether any old buffer log exists. So I'm worried about the case where lastOperation has FLAG_GAP when users restart the whole cluster with the new code. That is the reason why I said that "all nodes should be in ACTIVE state".

> Write buffering updates to another tlog
> ---------------------------------------
>
>                 Key: SOLR-9922
>                 URL: https://issues.apache.org/jira/browse/SOLR-9922
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>         Attachments: SOLR-9922.patch, SOLR-9922.patch, SOLR-9922.patch
>
>
> Currently, we write buffered updates to the current tlog without applying them to the index, and rely on log replay to apply them later. At the same time, other updates are written to the same tlog and are applied to the index.
> For example, if new updates arrive at a replica during peersync, we end up with this tlog:
> tlog : old1, new1, new2, old2, new3, old3
> The old updates come from peersync and are applied to the index.
> The new updates are buffered updates and are not applied to the index.
> But writing all the updates to the same tlog makes the code base very complex. We should write buffered updates to a separate tlog file.
> This would make our code base simpler. It would also make replica recovery for SOLR-9835 easier, because after peersync succeeds we can copy the new updates from the temporary file to the current tlog. For example:
> tlog : old1, old2, old3
> temporary tlog : new1, new2, new3
> -->
> tlog : old1, old2, old3, new1, new2, new3
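
The proposal in the description can be sketched as a small in-memory model. This is only an illustration of the log ordering, under the assumption that updates can be represented as strings and tlogs as lists; the class and method names are hypothetical and do not correspond to Solr's UpdateLog API. Applying updates to the index is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: one main tlog plus a temporary tlog for buffered updates.
final class BufferTlogSketch {
    private final List<String> tlog = new ArrayList<>();
    private final List<String> temporaryTlog = new ArrayList<>();

    // Updates fetched by peersync go to the main tlog.
    void addFromPeerSync(String update) {
        tlog.add(update);
    }

    // Updates arriving while buffering go to the temporary tlog only.
    void addBuffered(String update) {
        temporaryTlog.add(update);
    }

    // After peersync succeeds, append the buffered updates to the main tlog.
    void copyBufferedToTlog() {
        tlog.addAll(temporaryTlog);
        temporaryTlog.clear();
    }

    // Returns a copy so callers cannot mutate the model's state.
    List<String> tlog() {
        return new ArrayList<>(tlog);
    }

    public static void main(String[] args) {
        BufferTlogSketch log = new BufferTlogSketch();
        // Same arrival order as the interleaved example in the description.
        log.addFromPeerSync("old1");
        log.addBuffered("new1");
        log.addBuffered("new2");
        log.addFromPeerSync("old2");
        log.addBuffered("new3");
        log.addFromPeerSync("old3");
        log.copyBufferedToTlog();
        System.out.println(log.tlog()); // prints [old1, old2, old3, new1, new2, new3]
    }
}
```

Even though the updates arrive interleaved, the main tlog never mixes applied and unapplied entries, which is the simplification the issue is asking for.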

