hadoop-common-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2662) master stopping before regions finish stopping
Date Sat, 19 Jan 2008 00:09:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560613#action_12560613 ]

stack commented on HADOOP-2662:
-------------------------------

What do you see in the master log? Is it waiting and waiting, and then giving up after the
lease timeout because it says the regionservers are taking too long to report in?

The code doesn't currently accommodate a regionserver that is taking its time going down because
it's busy running flushes and other outstanding work.  It probably should (for example, the
regionserver could report to the master every so often that it's still working on the close).
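
To make the idea concrete, here is a minimal sketch of how the master side could track this.
It is not the actual HBase master code: the class ShutdownLeaseTracker, its methods, and the
lease length are all hypothetical names made up for illustration. Each "still closing" report
renews the regionserver's lease, and the master only stops once every regionserver has either
sent its final report or let its lease lapse. Times are passed in explicitly so the logic
can be exercised without sleeping.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not HBase's real master code.
 * Tracks regionservers that are shutting down and lets each one
 * extend its lease by reporting that it is still closing regions.
 */
class ShutdownLeaseTracker {
    private final long leaseMillis;
    // regionserver name -> time (ms) at which its lease expires
    private final Map<String, Long> leaseExpiry = new HashMap<>();

    ShutdownLeaseTracker(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Register a regionserver that has been told to shut down. */
    synchronized void startShutdown(String server, long nowMillis) {
        leaseExpiry.put(server, nowMillis + leaseMillis);
    }

    /**
     * The regionserver reports that it is still working on the close.
     * Renews the lease and returns true if the lease had not yet lapsed.
     */
    synchronized boolean reportStillClosing(String server, long nowMillis) {
        Long expiry = leaseExpiry.get(server);
        if (expiry == null || nowMillis > expiry) {
            return false;
        }
        leaseExpiry.put(server, nowMillis + leaseMillis);
        return true;
    }

    /** The regionserver's final "exiting" report. */
    synchronized void reportExited(String server) {
        leaseExpiry.remove(server);
    }

    /**
     * The master may stop once every tracked regionserver has either
     * sent its final report or let its lease lapse.
     */
    synchronized boolean masterMayStop(long nowMillis) {
        leaseExpiry.values().removeIf(expiry -> nowMillis > expiry);
        return leaseExpiry.isEmpty();
    }
}
```

With a scheme like this, a regionserver that needs several lease periods to finish its flushes
keeps the master alive as long as it keeps reporting, while a regionserver that has really
died still times out after one lease period.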

There should be no data loss in this case (the regionservers are going down properly -- it's just
the final report to the master saying they are down that is failing).

> master stopping before regions finish stopping
> ----------------------------------------------
>
>                 Key: HADOOP-2662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2662
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: Billy Pearson
>            Priority: Minor
>             Fix For: 0.16.1
>
>
> I sometimes get this in my region server logs when I shut down the cluster. It repeats several times, with the region server trying to find the master.
> I think we need to look at the master and make sure we do not stop the master on exit before all region servers report down.
> I am not sure whether there could be data loss, but we should not leave region servers looking for the master unless it has failed on its own.
> {code}
> 2008-01-18 17:40:42,009 WARN org.apache.hadoop.hbase.HRegionServer: Failed to send exiting message to master:
> java.net.ConnectException: Connection refused
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>         at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>         at java.net.Socket.connect(Socket.java:519)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:159)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:575)
>         at org.apache.hadoop.ipc.Client.call(Client.java:498)
>         at org.apache.hadoop.hbase.ipc.HbaseRPC$Invoker.invoke(HbaseRPC.java:210)
>         at $Proxy0.regionServerReport(Unknown Source)
>         at org.apache.hadoop.hbase.HRegionServer.run(HRegionServer.java:898)
>         at java.lang.Thread.run(Thread.java:595)
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

