hbase-issues mailing list archives

From "Jeffrey Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-11906) Meta data loss with distributed log replay
Date Mon, 08 Sep 2014 23:19:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126282#comment-14126282
] 

Jeffrey Zhong edited comment on HBASE-11906 at 9/8/14 11:18 PM:
----------------------------------------------------------------

In both cases, it seems the last edit was replayed, as shown below, yet for some reason that
edit could not be read back afterwards. [~jxiang], do you have some simple repro steps? In
addition, could you run a raw scan on the metadata row? Thanks.

{noformat}
2014-09-08 10:56:34,193 INFO  [B.defaultRpcServer.handler=24,queue=0,port=20020] regionserver.RSRpcServices:
Meta replay seq id=40001551, edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1410198840046,"tag":["3:\\x00\\x00\\x00\\x00\\x02b`\\x0F"],"value":"e1206.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1410198840046,"tag":["3:\\x00\\x00\\x00\\x00\\x02b`\\x0F"],"value":"\\x00\\x00\\x01HVf\\x05)","qualifier":"serverstartcode","vlen":8},{"timestamp":1410198840046,"tag":["3:\\x00\\x00\\x00\\x00\\x02b`\\x0F"],"value":"\\x00\\x00\\x00\\x00\\x04\\xC6~\\x7F","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1410198840046,"tag":["3:\\x00\\x00\\x00\\x00\\x02b`\\x0F"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,k\\x0D\\xF6\\xB0\\xDFk\\x0D\\xF0,1410197955915.cf591043d1fc374b8891599f6f133b17."}
{noformat}
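For anyone reading the replay line above by hand: the cell values and tags are printed with `\xNN` escapes for non-printable bytes and literal characters for printable ones (the log shows the backslashes doubled because the mutation is rendered as JSON). The following is a minimal Python sketch for turning those strings back into numbers. The helper names `parse_escaped` and `to_long` are made up for illustration and are not HBase APIs, and treating each value as a big-endian integer is an assumption. Under that assumption the tag value decodes to the same number the log prints as `seq id`.

```python
def parse_escaped(text: str) -> bytes:
    """Turn a string such as r"\x02b`\x0F" back into raw bytes.

    A backslash-x followed by two hex digits becomes one byte; every other
    character is taken as its ASCII code.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text.startswith("\\x", i) and i + 3 < len(text):
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def to_long(text: str) -> int:
    """Interpret the decoded bytes as a big-endian integer (assumption)."""
    return int.from_bytes(parse_escaped(text), "big")


# Values copied from the replay line above, with single backslashes.
tag_value = r"\x00\x00\x00\x00\x02b`\x0F"
seqnum_during_open = r"\x00\x00\x00\x00\x04\xC6~\x7F"
server_start_code = r"\x00\x00\x01HVf\x05)"

print(to_long(tag_value))            # 40001551, matches "seq id=40001551"
print(to_long(seqnum_during_open))
print(to_long(server_start_code))
```

The match between the decoded tag and the logged sequence id suggests the tag carries the replay sequence id, but that reading is inferred from this one log line only.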



> Meta data loss with distributed log replay
> ------------------------------------------
>
>                 Key: HBASE-11906
>                 URL: https://issues.apache.org/jira/browse/HBASE-11906
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jimmy Xiang
>         Attachments: meta-data-loss-2.log, meta-data-loss-with-dlr.log
>
>
> In the attached log, you can see that, before log replay, the region is open on e1205:
> {noformat}
> A3. 2014-09-05 16:38:46,705 INFO  [B.defaultRpcServer.handler=5,queue=2,port=20020] master.RegionStateStore:
Updating row IntegrationTestBigLinkedList,\x90Jy\x04\xA7\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7.
with state=OPEN&openSeqNum=40118237&server=e1205.halxg.cloudera.com,20020,1409960280431
> {noformat}
> After the log replay, what we load from meta says the region is open on e1209:
> {noformat}
> A4. 2014-09-05 16:41:12,257 INFO  [ActiveMasterManager] master.AssignmentManager: Loading
from meta: {cbb0d736ebfabcf4a07e5a7b395fcdf7 state=OPEN, ts=1409960472257, server=e1209.halxg.cloudera.com,20020,1409959391651}
> {noformat}
> The replayed edits show that the log does contain the expected edit:
> {noformat}
> 2014-09-05 16:41:11,862 INFO  [B.defaultRpcServer.handler=18,queue=0,port=20020] regionserver.RSRpcServices:
Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"e1205.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x01HH.\\x81o","qualifier":"serverstartcode","vlen":8},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x00\\x00\\x02d'\\xDD","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409960326706,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
> Why did we pick up a wrong value with an older timestamp?
> {noformat}
> 2014-09-05 16:41:11,063 INFO  [B.defaultRpcServer.handler=9,queue=0,port=20020] regionserver.RSRpcServices:
Meta replay edit type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"e1209.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x01HH \\xF1\\xA3","qualifier":"serverstartcode","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x00\\x00\\x00\\x01\\xB7\\xAB","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
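To put the two replayed edits from the description side by side, here is a small Python sketch that decodes the escaped values and compares them. The numbers are copied from the two log lines quoted above; the helper `to_long` and the dictionary layout are illustrative only, and reading each value as a big-endian integer is an assumption. The thread does not say which field the replay path uses to order edits, so the sketch only reports the values and does not model HBase's behavior.

```python
def to_long(escaped: str) -> int:
    """Decode \\xNN escapes plus literal ASCII into a big-endian integer."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped.startswith("\\x", i) and i + 3 < len(escaped):
            raw.append(int(escaped[i + 2:i + 4], 16))
            i += 4
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return int.from_bytes(bytes(raw), "big")


# The "server" cell of each replayed edit, plus the tag and seqnumDuringOpen
# values from the same mutation (single backslashes instead of the doubled
# ones shown in the JSON rendering).
edits = [
    {"server": "e1205.halxg.cloudera.com:20020",
     "timestamp": 1409960326705,
     "tag": r"\x00\x00\x00\x00\x02bad",
     "seqnum": r"\x00\x00\x00\x00\x02d'\xDD"},
    {"server": "e1209.halxg.cloudera.com:20020",
     "timestamp": 1409959994634,
     "tag": r"\x00\x00\x00\x00\x00\x00\x09\x99",
     "seqnum": r"\x00\x00\x00\x00\x00\x01\xB7\xAB"},
]

for edit in edits:
    edit["tag_id"] = to_long(edit["tag"])
    edit["open_seqnum"] = to_long(edit["seqnum"])
    print(edit["server"], edit["timestamp"], edit["tag_id"], edit["open_seqnum"])

# The edit a reader would expect to win if cells were ordered by timestamp.
newest = max(edits, key=lambda e: e["timestamp"])
print("newest by timestamp:", newest["server"])
```

The e1205 edit decodes to a `seqnumDuringOpen` of 40118237, which agrees with the `openSeqNum=40118237` in the master log line A3, and it is larger than the e1209 edit on every field compared here. That is consistent with the description's point that the value loaded from meta came from the older edit.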



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
