hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jeffrey Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11906) Meta data loss with distributed log replay
Date Tue, 09 Sep 2014 19:47:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127454#comment-14127454
] 

Jeffrey Zhong commented on HBASE-11906:
---------------------------------------

[~jxiang] Where your second log line is printed? I'm assuming your first log line "Log replay
mvccNum = (seq id)40002911" is from the following line in HRegion#doMiniBatchMutation

{code}
mvccNum = batchOp.getReplaySequenceId();
{code}

The basic idea is that we use original SeqId for replaying while the replayed edit will be
assigned a new SeqId and is written with the new seqId along with original seqId stored in
HLogKey#origLogSeqNum into replaying RS's WAL. During replay, region read point is bigger
than current replay edit's mvcc value. Since we don't allow read during recovery, so it is
fine here. 

If you open a scanner before recovery process(with smaller read point),  the scan will only
see updates till its read point not the newest.

> Meta data loss with distributed log replay
> ------------------------------------------
>
>                 Key: HBASE-11906
>                 URL: https://issues.apache.org/jira/browse/HBASE-11906
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Jimmy Xiang
>         Attachments: meta-data-loss-2.log, meta-data-loss-with-dlr.log
>
>
> In the attached log, you can see, before log replaying, the region is open on e1205:
> {noformat}
> A3. 2014-09-05 16:38:46,705 INFO  [B.defaultRpcServer.handler=5,queue=2,port=20020] master.RegionStateStore:
Updating row IntegrationTestBigLinkedList,\x90Jy\x04\xA7\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7.
with state=OPEN&openSeqNum=40118237&server=e1205.halxg.cloudera.com,20020,1409960280431
> {noformat}
> After the log replay, we got from meta the region is open on e1209
> {noformat}
> A4. 2014-09-05 16:41:12,257 INFO  [ActiveMasterManager] master.AssignmentManager: Loading
from meta: {cbb0d736ebfabcf4a07e5a7b395fcdf7 state=OPEN, ts=1409960472257, server=e1209.halxg.cloudera.com,20020,1409959391651}
> {noformat}
> The replayed edits show the log does have the edit expected:
> {noformat}
> 2014-09-05 16:41:11,862 INFO  [B.defaultRpcServer.handler=18,queue=0,port=20020] regionserver.RSRpcServices:
Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"e1205.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x01HH.\\x81o","qualifier":"serverstartcode","vlen":8},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x00\\x00\\x02d'\\xDD","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409960326706,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
> Why we picked up a wrong value with an older time stamp?
> {noformat}
> 2014-09-05 16:41:11,063 INFO  [B.defaultRpcServer.handler=9,queue=0,port=20020] regionserver.RSRpcServices:
Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"e1209.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x01HH
\\xF1\\xA3","qualifier":"serverstartcode","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x00\\x00\\x00\\x01\\xB7\\xAB","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message