hadoop-common-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2173) [hbase] When the master times out a region servers lease, the region server may not restart
Date Thu, 08 Nov 2007 17:47:50 GMT
[hbase] When the master times out a region servers lease, the region server may not restart
-------------------------------------------------------------------------------------------

                 Key: HADOOP-2173
                 URL: https://issues.apache.org/jira/browse/HADOOP-2173
             Project: Hadoop
          Issue Type: Bug
          Components: contrib/hbase
            Reporter: Jim Kellerman


Hadoop-Nightly 297 failed because:

    * The region server's lease expired (Why? Was the heartbeat thread starved?)
    * The region server gets a "call startup" message from the master
    * The master splits the region server's log and deletes it.

I think that when the region server called log.closeAndDelete(), it got an exception (because
the file no longer existed). At that point it said "error restarting server" and quit. From
there on, the master is just looping because there is no region server to talk to.

We should probably just log an error for log.closeAndDelete() and proceed with region server
restart.
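
A minimal sketch of the proposed handling, assuming the failure surfaces as an IOException.
ServerLog, restart() and the errors list are hypothetical stand-ins for illustration and are
not the actual HBase classes; only the closeAndDelete() name comes from the report above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RestartSketch {
    /** Hypothetical stand-in for the region server's log (not the real HBase type). */
    interface ServerLog {
        void closeAndDelete() throws IOException;
    }

    /** Collected error messages; a real server would use its logger instead. */
    static final List<String> errors = new ArrayList<>();

    /**
     * Restarts the (hypothetical) server. A failure in closeAndDelete() is
     * recorded but does not abort the restart, because the master may already
     * have split and deleted the log after the lease expired.
     */
    static boolean restart(ServerLog log) {
        try {
            log.closeAndDelete();
        } catch (IOException e) {
            // Record the problem and keep going instead of quitting.
            errors.add("closeAndDelete failed: " + e.getMessage());
        }
        // ... the rest of the restart (new log, report to master) would go here ...
        return true;
    }

    public static void main(String[] args) {
        // Simulate the case where the master has already deleted the log file.
        ServerLog missing = () -> { throw new IOException("log file no longer exists"); };
        System.out.println("restarted: " + restart(missing));
        System.out.println("errors: " + errors);
    }
}
```

The point of the sketch is only that the exception is caught locally, so the restart path
continues and the master gets a region server to talk to again.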

Also, for that test, we should probably increase the lease timeout and make the lease timeout
check happen correspondingly less frequently.
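
As a rough illustration of changing the two settings together, the sketch below scales a lease
timeout and its check interval by the same factor. The class name, field names and the
millisecond values are hypothetical and are not taken from the HBase configuration.

```java
public class LeaseSettingsSketch {
    final long timeoutMs;        // how long a lease may go unrenewed
    final long checkIntervalMs;  // how often expired leases are looked for

    LeaseSettingsSketch(long timeoutMs, long checkIntervalMs) {
        this.timeoutMs = timeoutMs;
        this.checkIntervalMs = checkIntervalMs;
    }

    /** Scales both values by the same factor, so a longer timeout is checked less often. */
    LeaseSettingsSketch scaledBy(int factor) {
        return new LeaseSettingsSketch(timeoutMs * factor, checkIntervalMs * factor);
    }

    public static void main(String[] args) {
        // Illustrative starting values only.
        LeaseSettingsSketch base = new LeaseSettingsSketch(30_000L, 15_000L);
        LeaseSettingsSketch forTest = base.scaledBy(4);
        System.out.println(forTest.timeoutMs + " ms timeout, checked every "
                + forTest.checkIntervalMs + " ms");
    }
}
```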

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

