hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-2079) [hbase] HLog generates incorrect file name when splitting a log, race condition also contributes
Date Sat, 20 Oct 2007 14:17:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HADOOP-2079:

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

This issue was mostly about the incorrect HLog name generation and the race condition in the
master in splitting the HLog when a region server dies. That part has been fixed. Resolving
this issue.

> [hbase] HLog generates incorrect file name when splitting a log, race condition also contributes
> -------------------------------------------------------------------------------------------------
>                 Key: HADOOP-2079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2079
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.16.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.16.0
>         Attachments: patch.txt
> In Hadoop-Nightly #277 TestRegionServerExit failed with a timeout.
> The cause was a race in the Master: checkAssigned (run from either the root or the
> meta scanner) immediately tries to split the log and then assign a region that has
> invalid server info.
> The scenario went something like this:
> 1. region server aborted
> 2. root region was written on optional cache flush
> lease timed out on the aborted server, which removes it from serversToServerInfo and
> queues a PendingServerShutdown operation
> 3. root scanner runs and finds the server info incorrect (it is in the root region but
> the server is not in serversToServerInfo)
> 4. checkAssigned starts splitting the log, but because the log name is incorrect it can't
> 5. PendingServerShutdown fires and really gums up the works.
> So there are two problems:
> 1. HLog.splitLog needs to generate the correct log file name.
> 2. PendingServerShutdown and/or leaseExpired need to cooperate with checkAssigned so
> that there are not two concurrent attempts to recover the log.
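
The second problem above comes down to making sure that only one code path recovers a
dead server's log. The following is a minimal sketch of that idea, not the code in
patch.txt: the class LogRecoveryGuard and its methods tryBeginRecovery and
abandonRecovery are hypothetical names introduced here for illustration. It assumes
that both checkAssigned and the PendingServerShutdown path can reach a shared object
in the master and that the dead server is identified by a string.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical guard (an assumption, not part of the HADOOP-2079 patch):
 * records which dead servers already have a log recovery claimed.
 */
final class LogRecoveryGuard {
    // Names of servers whose log has been claimed for splitting.
    private final Set<String> claimed = ConcurrentHashMap.newKeySet();

    /** Returns true only for the first caller that claims the given server. */
    boolean tryBeginRecovery(String serverName) {
        // Set.add is atomic on this set and reports whether the element was new.
        return claimed.add(serverName);
    }

    /** Releases the claim, for example after a failed split that should be retried. */
    void abandonRecovery(String serverName) {
        claimed.remove(serverName);
    }
}
```

Under these assumptions, each path would call tryBeginRecovery before splitting the log
and skip the split when it returns false, so a scanner and a shutdown operation racing
on the same server cannot both attempt the recovery.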

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
