hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2173) [hbase] When the master times out a region servers lease, the region server may not restart
Date Tue, 27 Nov 2007 07:14:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545737 ]

Jim Kellerman commented on HADOOP-2173:

This should be fixed with the commit for HADOOP-2276. Leaving open in case other circumstances
also exhibit this bug.

> [hbase] When the master times out a region servers lease, the region server may not restart
> -------------------------------------------------------------------------------------------
>                 Key: HADOOP-2173
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2173
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
> Hadoop-Nightly 297 failed because:
>     * The region server's lease expired (Why? Was the heartbeat thread starved?)
>     * The region server gets a call startup message
>     * The master splits the region server's log and deletes it.
> I think that when the region server called log.closeAndDelete(), it got an exception
> (because the file no longer existed). At that point it said "error restarting server" and quit.
> From there on, the master just loops because there is no region server to talk to.
> We should probably just log an error for log.closeAndDelete() and proceed with the region
> server restart.
> Also, for that test, we should probably increase the lease timeout and make the lease
> timeout check happen less frequently accordingly.
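
The "log an error and proceed" suggestion in the description above could look roughly like the following. This is only a minimal sketch and not the actual HBase code or the HADOOP-2276 change: RegionLog, RestartSketch, prepareForRestart and the errors list are hypothetical names made up for illustration, and only closeAndDelete() comes from the issue text. It assumes closeAndDelete() reports the missing file as an IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the region server's log; not the real HBase class. */
interface RegionLog {
    void closeAndDelete() throws IOException;
}

class RestartSketch {
    /** Errors are collected in a list instead of a real logger to keep the sketch self-contained. */
    static final List<String> errors = new ArrayList<>();

    /**
     * Cleans up the old log before a restart.
     * A failure in closeAndDelete() is recorded but is not treated as fatal,
     * so this method always returns true (restart may proceed).
     */
    static boolean prepareForRestart(RegionLog log) {
        try {
            log.closeAndDelete();
        } catch (IOException e) {
            // Assumption: the master may already have split and deleted the log,
            // so the file can be gone by the time the region server gets here.
            errors.add("closeAndDelete failed, continuing restart: " + e.getMessage());
        }
        return true;
    }

    public static void main(String[] args) {
        // Simulate the case from the report: the log file no longer exists.
        RegionLog missing = () -> { throw new IOException("log file no longer exists"); };
        System.out.println(prepareForRestart(missing));
        System.out.println(errors);
    }
}
```

The point of the sketch is only that the exception is caught and recorded at the cleanup step, so the restart path that follows still runs instead of the server quitting with "error restarting server".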

