Date: Wed, 18 Feb 2015 21:10:13 +0000 (UTC)
From: "Jonathan Hsieh (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-11906) Meta data loss with distributed log replay

     [ https://issues.apache.org/jira/browse/HBASE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-11906:
-----------------------------------
    Assignee: Jeffrey Zhong  (was: Jonathan Hsieh)

> Meta data loss with distributed log replay
> ------------------------------------------
>
>                 Key: HBASE-11906
>                 URL: https://issues.apache.org/jira/browse/HBASE-11906
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.99.0, 2.0.0
>            Reporter: Jimmy Xiang
>            Assignee: Jeffrey Zhong
>             Fix For: 2.0.0, 0.99.1
>
>         Attachments: HBASE-11906.patch, debugging.patch, hbase-11906-v2.patch, meta-data-loss-2.log, meta-data-loss-with-dlr.log
>
>
> The attached log shows that, before log replay, the
> region is open on e1205:
> {noformat}
> A3. 2014-09-05 16:38:46,705 INFO [B.defaultRpcServer.handler=5,queue=2,port=20020] master.RegionStateStore: Updating row IntegrationTestBigLinkedList,\x90Jy\x04\xA7\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7. with state=OPEN&openSeqNum=40118237&server=e1205.halxg.cloudera.com,20020,1409960280431
> {noformat}
> After the log replay, meta reports the region as open on e1209:
> {noformat}
> A4. 2014-09-05 16:41:12,257 INFO [ActiveMasterManager] master.AssignmentManager: Loading from meta: {cbb0d736ebfabcf4a07e5a7b395fcdf7 state=OPEN, ts=1409960472257, server=e1209.halxg.cloudera.com,20020,1409959391651}
> {noformat}
> The replayed edits show that the log does contain the expected edit:
> {noformat}
> 2014-09-05 16:41:11,862 INFO [B.defaultRpcServer.handler=18,queue=0,port=20020] regionserver.RSRpcServices: Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"e1205.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x01HH.\\x81o","qualifier":"serverstartcode","vlen":8},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x00\\x00\\x02d'\\xDD","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409960326706,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
> Why did we pick up a wrong value with an older timestamp?
> {noformat}
> 2014-09-05 16:41:11,063 INFO [B.defaultRpcServer.handler=9,queue=0,port=20020] regionserver.RSRpcServices: Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"e1209.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x01HH \\xF1\\xA3","qualifier":"serverstartcode","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x00\\x00\\x00\\x01\\xB7\\xAB","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
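
The escaped byte strings in the two "Meta replay edit" lines can be decoded to cross-check them against the master log lines A3 and A4. The following is a minimal sketch, not HBase code. It assumes the 8-byte values are big-endian longs printed with `\xNN` escapes and printable bytes shown literally, and that the doubled backslashes from the JSON output have already been unescaped. The helper names `parse_string_binary`, `to_long`, and `decode_edit` are hypothetical. Reading the `3:` tag as carrying a sequence number is also an assumption.

```python
import re

# One escaped byte in the log's notation: a backslash, "x", two hex digits.
_ESCAPED_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def parse_string_binary(text: str) -> bytes:
    """Turn a string such as r"\x00\x01HH" into raw bytes (hypothetical helper)."""
    out = bytearray()
    i = 0
    while i < len(text):
        match = _ESCAPED_BYTE.match(text, i)
        if match:
            out.append(int(match.group(1), 16))
            i = match.end()
        else:
            # Printable bytes appear literally in the log output.
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def to_long(text: str) -> int:
    """Decode an escaped 8-byte value as a signed big-endian long (assumed layout)."""
    raw = parse_string_binary(text)
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return int.from_bytes(raw, byteorder="big", signed=True)


# Values copied from the two replayed edits, with JSON backslashes unescaped.
EDIT_E1205 = {
    "timestamp": 1409960326705,
    "serverstartcode": r"\x00\x00\x01HH.\x81o",
    "seqnumDuringOpen": r"\x00\x00\x00\x00\x02d'\xDD",
    "tag": r"\x00\x00\x00\x00\x02bad",
}
EDIT_E1209 = {
    "timestamp": 1409959994634,
    "serverstartcode": r"\x00\x00\x01HH \xF1\xA3",
    "seqnumDuringOpen": r"\x00\x00\x00\x00\x00\x01\xB7\xAB",
    "tag": r"\x00\x00\x00\x00\x00\x00\x09\x99",
}


def decode_edit(edit: dict) -> dict:
    """Return the edit with its escaped byte fields decoded to integers."""
    decoded = {"timestamp": edit["timestamp"]}
    for field in ("serverstartcode", "seqnumDuringOpen", "tag"):
        decoded[field] = to_long(edit[field])
    return decoded


if __name__ == "__main__":
    for name, edit in (("e1205", EDIT_E1205), ("e1209", EDIT_E1209)):
        print(name, decode_edit(edit))
```

Under these assumptions, the e1205 edit decodes to start code 1409960280431 and seqnumDuringOpen 40118237, which match line A3. The e1209 edit decodes to start code 1409959391651, which matches the server reported in line A4, with seqnumDuringOpen 112555. The e1205 edit carries the newer cell timestamp (1409960326705 versus 1409959994634), which is what makes the value loaded from meta surprising.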