hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HBASE-2727) Splits writing one file only is untenable; need dir of recovered edits ordered by sequenceid.
Date Fri, 16 Jul 2010 22:29:50 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-2727.
--------------------------

    Hadoop Flags: [Reviewed]
      Resolution: Fixed

committed.

> Splits writing one file only is untenable; need dir of recovered edits ordered by sequenceid.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2727
>                 URL: https://issues.apache.org/jira/browse/HBASE-2727
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: 2727-v2.txt, 2727-v4.txt, 2727-v6.txt, 2727-v7.txt
>
>
> This issue comes of tlipcon doing a bit of human unit testing.  His speculation is:
> Let a region X deploy to server A.  Server A opens the region, then closes it.
> Let region X now deploy to server B.  Server B now crashes.
> Both server A and server B now have edits for region X in their WALs.
> The processing of server crashes is currently sequential. 
> If server A crashes before server B, the split of server A's WAL will write out a file of recovered edits for region X, but region X was no longer deployed on server A, so the file will just sit there unused. The processing of server B's crash will overwrite the recovered edits file written by the split of server A's WAL. This is OK.
> But if somehow server B's crash is processed before server A's, then interesting issues will likely arise; in the main, there is a danger that server B's recovered edits could be overwritten.
> Another issue comes up in the review of HBASE-1025. During the replay of edits on region deploy, if the hosting regionserver crashes before we have processed all of the recovered edits, we could lose some (the recovery of the regionserver that is replaying the edits could overwrite the log of edits only partially replayed).
> Discussing up on IRC, what's needed is a directory of edits to replay, ordered by sequenceid. On recovery, we play the oldest through to the newest, removing the edits only on successful replay.
> Making blocker on 0.21 since this is a correctness issue.
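
The directory-of-recovered-edits idea in the description above can be sketched roughly as follows. This is a minimal illustration on a local filesystem, not the code committed for this issue: the class name RecoveredEditsSketch, the zero-padded file naming, and the one-edit-per-line text format are all assumptions made for the example. It also assumes that re-applying an edit is harmless, since a crash partway through a file means that file is replayed again from its start.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of a per-region directory of recovered edits; not HBase code. */
final class RecoveredEditsSketch {

    /** Zero-pad the sequence id so names also sort correctly as plain strings. */
    static String fileNameFor(long sequenceId) {
        return String.format("%019d", sequenceId);
    }

    /**
     * Called by a log split: writes one new file named after the first sequence id
     * it holds. CREATE_NEW makes the write fail rather than overwrite a file that
     * an earlier split already produced.
     */
    static Path writeRecoveredEdits(Path dir, long firstSequenceId, List<String> edits)
            throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(fileNameFor(firstSequenceId));
        return Files.write(target, edits, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE_NEW);
    }

    /**
     * Called on region deploy: replays files oldest to newest and deletes each file
     * only after every edit in it has been applied. If applyEdit throws, the current
     * file and all later ones stay on disk for the next attempt.
     * Returns the number of files replayed.
     */
    static int replay(Path dir, Consumer<String> applyEdit) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        // Order by the sequence id encoded in the file name.
        files.sort(Comparator.comparingLong(
                (Path p) -> Long.parseLong(p.getFileName().toString())));

        int replayed = 0;
        for (Path file : files) {
            for (String edit : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                applyEdit.accept(edit);
            }
            Files.delete(file); // remove only after a successful replay
            replayed++;
        }
        return replayed;
    }
}
```

With this layout, the order in which the two server crashes are processed no longer matters: each split adds its own file, and the region replays whatever is present in sequence-id order.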

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

