hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1937) [hbase] when the master times out a region server's lease, it is too aggressive in reclaiming the server's log
Date Tue, 02 Oct 2007 22:25:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531933 ]

Jim Kellerman commented on HADOOP-1937:

Revised strategy:

With HADOOP-1960, if the region server cannot talk to the master
before its lease expires, it shuts itself down. Thus the likelihood of
a region server checking in after its lease has expired is low. In the
event this does happen, however, the master will tell the region
server to restart; that is, to close all open regions and flush its log.
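
To make the two sides of that lease rule concrete, here is a minimal
sketch. It is not the HBase code: the class, the Reply enum, and the
method names are all hypothetical, and time is passed in explicitly so
the logic is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the lease rule; none of these names come from HBase.
final class LeaseCheckSketch {
    enum Reply { OK, RESTART }

    private final long leasePeriodMillis;
    // Last time each server reported in, keyed by an assumed server name.
    private final Map<String, Long> lastReport = new HashMap<>();

    LeaseCheckSketch(long leasePeriodMillis) {
        this.leasePeriodMillis = leasePeriodMillis;
    }

    /** Master side: a report that arrives after the lease expired gets RESTART. */
    Reply onReport(String server, long nowMillis) {
        Long previous = lastReport.get(server);
        if (previous != null && nowMillis - previous > leasePeriodMillis) {
            // Forget the expired lease; the server must close regions and flush its log.
            lastReport.remove(server);
            return Reply.RESTART;
        }
        lastReport.put(server, nowMillis);
        return Reply.OK;
    }

    /** Region server side: shut down once a full lease period passes without contact. */
    static boolean shouldShutDown(long lastContactMillis, long nowMillis, long leasePeriodMillis) {
        return nowMillis - lastContactMillis >= leasePeriodMillis;
    }
}
```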

However, the master should defer processing the server's log and
reassigning its regions, because the server may still be in the process
of shutting down. Consequently, all PendingServerShutdowns will be
placed in a delay queue for half a lease period to ensure the region
server has shut down.
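
One way the delay could be expressed is with java.util.concurrent.DelayQueue.
The sketch below is only an illustration under that assumption: the
PendingShutdown class here is a hypothetical stand-in for
PendingServerShutdown, and the lease period value is made up.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the actual master code.
final class ShutdownDelaySketch {
    /** Stand-in for a pending server shutdown that matures after half a lease period. */
    static final class PendingShutdown implements Delayed {
        final String serverName;
        private final long readyAtNanos;

        PendingShutdown(String serverName, long leasePeriodMillis) {
            this.serverName = serverName;
            this.readyAtNanos = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(leasePeriodMillis / 2);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            // Remaining wait; zero or negative means the entry may be taken.
            return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<PendingShutdown> queue = new DelayQueue<>();
        long leasePeriodMillis = 200L; // illustrative value only
        queue.add(new PendingShutdown("regionserver-1", leasePeriodMillis));

        // take() blocks until the half-lease delay has elapsed, so the log is
        // not touched while the server may still be shutting down.
        PendingShutdown ready = queue.take();
        System.out.println("now safe to process log for " + ready.serverName);
    }
}
```

A non-blocking poll() on the same queue returns null until the delay has
elapsed, which would let the master keep doing other work in the meantime.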

Finally, we will add the server start code to the log file name, so
that if the region server restarts before the master processes the old
log file, the new log file will not be included.
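
As a rough illustration of why the start code helps, the sketch below
builds log names that embed it and then selects only the logs written
by the old server instance. The naming pattern and both method names
are assumptions made for this example, not the format HBase uses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the "log_host_port_startcode" pattern is an assumption.
final class LogNameSketch {
    /** Build a log name that is unique per server instance via its start code. */
    static String logName(String host, int port, long startCode) {
        return "log_" + host + "_" + port + "_" + startCode;
    }

    /** Keep only the logs belonging to the instance identified by startCode. */
    static List<String> logsToSplit(List<String> candidates, String host, int port, long startCode) {
        String wanted = logName(host, port, startCode);
        List<String> result = new ArrayList<>();
        for (String name : candidates) {
            if (name.equals(wanted)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With names like these, a region server that restarts on the same host
and port gets a new start code, so its fresh log never matches the name
the master is looking for when it processes the old instance's log.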

> [hbase] when the master times out a region server's lease, it is too aggressive in reclaiming the server's log
> --------------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-1937
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1937
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.15.0
> When a region server's lease times out, the master immediately begins trying to split
> the server's log file. There have been cases where a region server was just a little late
> reporting to the master and the master had already started trying to reclaim the server's
> log, even though the server was still writing to it.
> There needs to be some kind of "grace period" in which, if the region server reports
> in, the master re-instates the server. If the "grace period" expires, then the master should
> start processing the server's log.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
