X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2D24610A46 for ; Fri, 7 Jun 2013 23:00:22 +0000 (UTC)
Received: (qmail 73909 invoked by uid 500); 7 Jun 2013 23:00:22 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 73871 invoked by uid 500); 7 Jun 2013 23:00:21 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 73862 invoked by uid 99); 7 Jun 2013 23:00:21 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 07 Jun 2013 23:00:21 +0000
Date: Fri, 7 Jun 2013 23:00:21 +0000 (UTC)
From: "Elliott Clark (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8701) distributedLogReplay need to apply wal edits in the receiving order of those edits
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-8701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678551#comment-13678551 ]

Elliott Clark commented on HBASE-8701:
--------------------------------------

bq. It seems to me that we have edits with the same timestamps in different WAL files this can only happen when the client explicitly set the timestamps.

Until we have a full Multi-Wal implementation, which is something that's definitely planned.

> distributedLogReplay need to apply wal edits in the receiving order of those edits
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8701
>                 URL: https://issues.apache.org/jira/browse/HBASE-8701
>             Project: HBase
>          Issue Type: Bug
>          Components: MTTR
>            Reporter: Jeffrey Zhong
>            Assignee: Jeffrey Zhong
>             Fix For: 0.98.0, 0.95.2
>
>
> This issue happens in distributedLogReplay mode when recovering multiple puts of the same key + version (timestamp). After replay, the value of the key is nondeterministic.
> h5. The original concern raised by [~eclark]:
> For all edits the rowkey is the same.
> There's a log with: [ A (ts = 0), B (ts = 0) ]
> Replay the first half of the log.
> A user puts in C (ts = 0)
> Memstore has to flush
> A new Hfile will be created with [ C, A ] and MaxSequenceId = C's seqid.
> Replay the rest of the Log.
> Flush
> The issue will happen in similar situations, such as Put(key, t=T) in WAL1 and Put(key, t=T) in WAL2.
> h5. Below is the option I'd like to use:
> a) During replay, we pass the wal file name hash in each replay batch, and the original wal sequence id of each edit, to the receiving RS
> b) Once a wal is recovered, the replaying RS sends a signal to the receiving RS so the receiving RS can flush
> c) In the receiving RS, edits from different WAL files of a region go to different memstores. (We can visualize this at a high level as sending changes to a new region object with name (origin region name + wal name hash) and using the original sequence Ids.)
> d) Writes from normal traffic (writes are allowed during recovery) are put in normal memstores as of today and flushed normally with new sequenceIds.
> h5. The other alternative options are listed below for reference:
> Option one
> a) disallow writes during recovery
> b) during replay, we pass original wal sequence ids
> c) hold flush till all wals of a recovering region are replayed. Memstore should hold because we only recover unflushed wal edits.
> For edits with the same key + version, whichever has the larger sequence Id wins.
> Option two
> a) During replay, we pass original wal sequence ids
> b) For each wal edit, we store the edit's original sequence id along with its key.
> c) During scanning, we use the original sequence id if it's present; otherwise we use its store file sequence Id
> d) Compaction can keep just the put with the max sequence id
> Please let me know if you have better ideas.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
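
As a rough illustration of the tie-breaking rule in Option two (c) of the quoted description, the sketch below picks a winner among edits that share a key and timestamp. It is not HBase code: the class `ReplayOrderSketch`, the `Edit` type, its fields, and all sequence id numbers are hypothetical, chosen to mirror the A/B/C scenario in the description. In particular, the sequence id of the second flushed file (11) is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

public class ReplayOrderSketch {

    /** Hypothetical stand-in for one version of a cell; not an HBase class. */
    public static final class Edit {
        public final String value;
        public final long timestamp;
        // Present only for edits replayed from a WAL (assumption for this sketch).
        public final OptionalLong originalSeqId;
        // Max sequence id of the store file that holds the edit.
        public final long storeFileSeqId;

        public Edit(String value, long timestamp,
                    OptionalLong originalSeqId, long storeFileSeqId) {
            this.value = value;
            this.timestamp = timestamp;
            this.originalSeqId = originalSeqId;
            this.storeFileSeqId = storeFileSeqId;
        }

        /** Prefer the original WAL sequence id; fall back to the store file's. */
        public long effectiveSeqId() {
            return originalSeqId.orElse(storeFileSeqId);
        }
    }

    /** Among edits with the same key and timestamp, the largest effective id wins. */
    public static Edit resolve(List<Edit> sameKeyAndTimestamp) {
        Edit winner = null;
        for (Edit e : sameKeyAndTimestamp) {
            if (winner == null || e.effectiveSeqId() > winner.effectiveSeqId()) {
                winner = e;
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        // A and B come from the old WAL (original seq ids 1 and 2).
        // C is a live write during recovery and gets a new, larger seq id (10).
        Edit a = new Edit("A", 0L, OptionalLong.of(1L), 10L);   // flushed with C
        Edit b = new Edit("B", 0L, OptionalLong.of(2L), 11L);   // flushed later
        Edit c = new Edit("C", 0L, OptionalLong.empty(), 10L);
        System.out.println(resolve(Arrays.asList(c, a, b)).value);
    }
}
```

With these illustrative numbers, comparing store file sequence ids alone would let B (11) shadow C (10), even though C was received last. Using the original WAL sequence ids for A and B keeps C as the winner.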