hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Master crash during assignment.
Date Thu, 12 May 2011 18:02:48 GMT
The issue says that it was applied to the branch for 0.90.2.  That's
a misstatement.  The patch was not applied.  Will apply to the branch
now.
St.Ack

On Thu, May 12, 2011 at 10:59 AM, Stack <stack@duboce.net> wrote:
> Vidhya:
>
> So it's failing to send close to an explicit server -- see the IP in
> the log below -- and the other server is closing down the request
> prematurely, so we get the EOFE.  Can you see anything in the logs on
> that machine?
>
> Regarding the EOFE crashing the Master, you might want to pick up a
> TRUNK change.  See http://hbase.apache.org/xref/org/apache/hadoop/hbase/master/AssignmentManager.html#1261
> (This is how TRUNK looks).  Notice that it's more generic than what you
> currently have.  Alternatively, add a catch for the EOFE.
>
> The patch is actually kinda small and targeted explicitly to fix the
> likes of what you are seeing:
>
> +   HBASE-3617  NoRouteToHostException during balancing will cause Master abort
> +               (Ted Yu via Stack)
>
> Let me know if it works for you.  If so, I'll backport it to the branch.
>
> St.Ack
>
>
>
> On Wed, May 11, 2011 at 2:32 PM, Vidhyashankar Venkataraman
> <vidhyash@yahoo-inc.com> wrote:
>> The master of my Hbase instance (0.90.x) crashes each time it is
>> restarted, with the exceptions shown below. Can you let me know what
>> this is usually due to? (I also saw these exceptions in a JIRA but
>> they were about uncaught EOF exception). Only the master dies while
>> the region servers wait for a master to wake back up.
>>
>> Thank you
>> Vidhya
>>
>> The master log:
>>
>> 2011-05-11 21:19:04,259 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
>> java.io.IOException: Call to /67.195.47.230:44420 failed on local exception: java.io.EOFException
>>        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>        at $Proxy7.closeRegion(Unknown Source)
>>        at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1092)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1039)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1808)
>>        at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:691)
>>        at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:582)
>>        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> Caused by: java.io.EOFException
>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=WCC.davesch2,r:at#start#www!/Gateway2000!http,1302916227366.b7d206f663282e2a37adb24ba7e4c0de., src=b3110318.yst.yahoo.net,44420,1305073517470, dest=b3110175.yst.yahoo.net,44420,1305073507459
>> 2011-05-11 21:19:04,260 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region WCC.davesch2,r:at#start#www!/Gateway2000!http,1302916227366.b7d206f663282e2a37adb24ba7e4c0de. (offlining)
>> 2011-05-11 21:19:04,260 FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
>> java.io.IOException: Call to /67.195.47.230:44420 failed on local exception: java.io.EOFException
>>        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:788)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:757)
>>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>>        at $Proxy7.closeRegion(Unknown Source)
>>        at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:589)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1092)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1039)
>>        at org.apache.hadoop.hbase.master.AssignmentManager.balance(AssignmentManager.java:1808)
>>        at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:691)
>>        at org.apache.hadoop.hbase.master.HMaster$1.chore(HMaster.java:582)
>>        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
>> Caused by: java.io.EOFException
>>        at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:521)
>>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:459)
>> 2011-05-11 21:19:04,260 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
>> 2011-05-11 21:19:04,260 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
>>
>
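
For anyone wanting to see the shape of the "more generic" catch mentioned
in the quoted reply, here is a minimal sketch.  It is not the TRUNK code
and not the HBASE-3617 patch.  The class UnassignSketch, the interface
RegionCloser and the method tryClose are hypothetical names made up for
illustration.  The only fact it relies on is that EOFException,
ConnectException and NoRouteToHostException are all subclasses of
java.io.IOException, so one catch of IOException covers all of them and
the caller can carry on instead of aborting.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.NoRouteToHostException;

public class UnassignSketch {

    /** Hypothetical stand-in for the RPC that asks a region server to close a region. */
    public interface RegionCloser {
        void closeRegion(String regionName) throws IOException;
    }

    /**
     * Sends the close request. Returns true if the call went through and
     * false if it failed with any IOException, so that the caller can
     * log and retry later rather than treat the failure as fatal.
     */
    public static boolean tryClose(RegionCloser closer, String regionName) {
        try {
            closer.closeRegion(regionName);
            return true;
        } catch (IOException e) {
            // Generic catch: covers EOFException, ConnectException,
            // NoRouteToHostException and other IOException subclasses.
            System.err.println("Close of " + regionName + " failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulate a server that drops the connection mid-response.
        boolean eof = tryClose(r -> { throw new EOFException(); }, "region-a");
        // Simulate an unreachable server.
        boolean noRoute = tryClose(r -> { throw new NoRouteToHostException(); }, "region-b");
        // Simulate a successful close.
        boolean ok = tryClose(r -> { }, "region-c");
        System.out.println(eof + " " + noRoute + " " + ok);
    }
}
```

Whether a failed close should be retried, or the server treated as dead,
is a separate decision that this sketch leaves to the caller.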
