Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Sat, 15 Jun 2013 04:59:20 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12652565.1371101563923.119685.1371272360215@arcas>
In-Reply-To: <JIRA.12652565.1371101563923@arcas>
References: <JIRA.12652565.1371101563923@arcas>
Subject: [jira] [Commented] (HBASE-8741) Mutations on Regions in recovery
 mode might have same sequenceIDs
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684074#comment-13684074 ] 

stack commented on HBASE-8741:
------------------------------

Who would read the new info that is in the tail of the WAL files?

Do we need to read the tail of the second-to-last file if we are reading the last, unfinished file anyway, since it will have an edit with a sequence ID in excess of anything in the second-to-last file?

Will we read this last file twice?  Once when figuring out the last sequence ID and then again when replaying the edits?

In the multi-WAL case we may have to read more than one file: all files that are still open.

How do we transfer the sequence number to the regionserver that is opening the region?

Good stuff.

One thought I had yesterday while talking w/ Himanshu: if we did the above stuff he suggests, couldn't we open a region for writes in the current distributed log system?  That could help while we are still working on the distributed replay stuff.
                
> Mutations on Regions in recovery mode might have same sequenceIDs
> -----------------------------------------------------------------
>
>                 Key: HBASE-8741
>                 URL: https://issues.apache.org/jira/browse/HBASE-8741
>             Project: HBase
>          Issue Type: Bug
>          Components: MTTR
>    Affects Versions: 0.95.1
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>
> Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits.
> With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId > maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds.
> If we store region-level information in the WALTrailer, then this scenario could be avoided by:
> a) reading the trailer of the "last completed" file, i.e., the last WAL file that has a trailer, and
> b) completely reading the last WAL file (this file would not have a trailer, so it needs to be read completely).
> In the future, if we switch to multiple WAL files, we could read the trailers of all completed WAL files and completely read the remaining incomplete files.
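
Steps (a) and (b) in the description above could be sketched roughly as follows. This is only an illustration, not HBase code: WalFile, its fields, and nextSafeSequenceId are hypothetical names, each file is reduced to the edits of a single region, and the sketch assumes sequence IDs only grow from older WAL files to newer ones.

```java
import java.util.List;
import java.util.OptionalLong;

// Illustrative sketch only: WalFile and nextSafeSequenceId are hypothetical
// names, not HBase classes or APIs.
class RecoverySequenceIdSketch {

    /** Stand-in for one WAL file of the dead regionserver, reduced to one region. */
    static final class WalFile {
        // Max sequence ID recorded in the trailer; empty if the file was never closed.
        final OptionalLong trailerMaxSeqId;
        // Sequence IDs of the region's edits, as a full read of the file would see them.
        final long[] editSeqIds;

        WalFile(OptionalLong trailerMaxSeqId, long... editSeqIds) {
            this.trailerMaxSeqId = trailerMaxSeqId;
            this.editSeqIds = editSeqIds;
        }
    }

    /**
     * Returns a sequence ID strictly greater than anything found in the HFiles
     * or in the old regionserver's WAL files for this region.
     * Assumes sequence IDs only grow from older WAL files to newer ones.
     */
    static long nextSafeSequenceId(long maxSeqIdFromHFiles, List<WalFile> walsOldestFirst) {
        long max = maxSeqIdFromHFiles;
        // Walk from the newest file backwards.
        for (int i = walsOldestFirst.size() - 1; i >= 0; i--) {
            WalFile wal = walsOldestFirst.get(i);
            if (wal.trailerMaxSeqId.isPresent()) {
                // (a) Last completed file: the trailer is enough, and under the
                // stated assumption older files cannot hold a larger ID, so stop.
                max = Math.max(max, wal.trailerMaxSeqId.getAsLong());
                break;
            }
            // (b) No trailer: the file was still open, so read every edit.
            for (long seqId : wal.editSeqIds) {
                max = Math.max(max, seqId);
            }
        }
        return max + 1;
    }
}
```

Because the loop fully reads every trailerless file it meets before reaching the first file with a trailer, the same walk also covers the case where more than one file was still open. The trailer of the last completed file still matters when the open file holds no edits for the region.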

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira