hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3344) Master aborts after RPC to server that was shutting down
Date Fri, 31 Dec 2010 04:46:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976186#action_12976186 ]

stack commented on HBASE-3344:
------------------------------

Odd.  For sure you have RC2 loaded?

sendRegionClose is on line #1091 in the tip of 0.90 TRUNK (this is not RC2, but I don't think
this has changed since RC'ing).

Also, there is this catch clause around the sendRegionClose:

{code}
1115     } catch (EOFException e) {
1116       LOG.info("Server " + server + " returned " + e.getMessage() + " for " +
1117         region.getEncodedName());
1118       // Presume retry or server will expire.
{code}

I wonder why it's not triggering.  Maybe it's wrapped in a RemoteException?
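
One way the clause could be skipped, going by the trace in the issue description: the top-level exception there is a plain java.io.IOException with the EOFException only as its cause, and a catch (EOFException e) does not match on a cause. The following is a minimal, standalone sketch of that behaviour, not HBase code. The class WrappedEofDemo and the methods fakeCloseRegion, hasEofCause and attempt are hypothetical names made up for illustration, and walking the cause chain is only one possible way to handle it.

```java
import java.io.EOFException;
import java.io.IOException;

class WrappedEofDemo {

    // Hypothetical stand-in for the RPC call: throws an IOException whose
    // cause is an EOFException, the same shape as the trace in the report.
    static void fakeCloseRegion() throws IOException {
        IOException wrapped = new IOException(
            "Call to server failed on local exception: java.io.EOFException");
        wrapped.initCause(new EOFException());
        throw wrapped;
    }

    // Walks the cause chain and reports whether any link is an EOFException.
    static boolean hasEofCause(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof EOFException) {
                return true;
            }
        }
        return false;
    }

    // Returns a label describing which catch clause handled the failure.
    static String attempt() {
        try {
            fakeCloseRegion();
            return "ok";
        } catch (EOFException e) {
            // Not reached here: the thrown object is an IOException,
            // the EOFException is only its cause.
            return "caught EOFException directly";
        } catch (IOException e) {
            return hasEofCause(e)
                ? "caught wrapped EOFException"
                : "other IOException";
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt()); // prints "caught wrapped EOFException"
    }
}
```

If the wrapping is what is happening here, the existing clause would need to inspect the cause (or the wrapper would need to be unwrapped before the catch) for the "presume retry or server will expire" path to run.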

Good on you Todd.

> Master aborts after RPC to server that was shutting down
> --------------------------------------------------------
>
>                 Key: HBASE-3344
>                 URL: https://issues.apache.org/jira/browse/HBASE-3344
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>
> I was doing a rolling restart while a bunch of splits were happening, and the master aborted with the following:
> 2010-12-13 12:24:55,536 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region usertable,user1590589031,1291843166306.dbcbe21b3447c78560802962b87fd34f. (offlining)
> 2010-12-13 12:24:55,537 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
> java.io.IOException: Call to haus03.sf.cloudera.com/172.29.5.34:60020 failed on local exception: java.io.EOFException
>         at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>         at $Proxy6.closeRegion(Unknown Source)
>         at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1085)
>         at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1032)
>         at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1791)
>         at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:688)
>         at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:579)
>         at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>         at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
> 2010-12-13 12:24:55,541 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

