hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase Master dies with an unexpected exception
Date Tue, 15 Nov 2011 20:16:29 GMT
Fixed in https://issues.apache.org/jira/browse/HBASE-3617; upgrade to 0.90.4.

J-D

On Tue, Nov 8, 2011 at 6:08 PM, Amit Phadke <aphadke@yahoo-inc.com> wrote:
> Adding right address.
>
> On Nov 7, 2011, at 2:45 PM, Amit Phadke wrote:
>
> Hey Guys,
>
> We are seeing an issue where the Master dies with something like the following.
> Any idea why the master dies? Ideally, if a RS isn't behaving well, shouldn't that RS be blacklisted and ignored, or something of that sort?
>
> This is on a cluster with Hadoop 205 and HBase 0.90.3
>
> Thanks
> Amit
>
> 2011-11-07 02:38:00,252 nng2.coke.ac4.yahoo.com:60000.timeoutMonitor INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out:  items,023b3bba-5282-3edc-a984-dbed11d1cc51,1320395576309.bf3cd2b2cc06f8708050ce725cf1fa7d. state=PENDING_CLOSE, ts=1320631670889
> 2011-11-07 02:38:00,252 nng2.coke.ac4.yahoo.com:60000.timeoutMonitor INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=items,023b3bba-5282-3edc-a984-dbed11d1cc51,1320395576309.bf3cd2b2cc06f8708050ce725cf1fa7d.
> 2011-11-07 02:38:51,501 nng2.coke.ac4.yahoo.com:60000.timeoutMonitor FATAL org.apache.hadoop.hbase.master.HMaster: Remote unexpected exception
> java.io.IOException: Call to /216.109.127.135:60020 failed on local exception: java.io.IOException: Connection reset by peer
>        at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:806)
>        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>        at $Proxy7.closeRegion(Unknown Source)
>        at org.apache.hadoop.hbase.master.ServerManager.sendRegionClose(ServerManager.java:601)
>        at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1126)
>        at org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor.chore(AssignmentManager.java:1788)
>        at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> Caused by: java.io.IOException: Connection reset by peer
>        at sun.nio.ch.FileDispatcher.read0(Native Method)
>        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
>        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:198)
>        at sun.nio.ch.IOUtil.read(IOUtil.java:171)
>        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
>        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
>        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>        at java.io.FilterInputStream.read(FilterInputStream.java:116)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:299)
>        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>        at java.io.DataInputStream.readInt(DataInputStream.java:370)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
>        at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
> 2011-11-07 02:38:51,502 nng2.coke.ac4.yahoo.com:60000.timeoutMonitor INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2011-11-07 02:38:51,502 nng2.coke.ac4.yahoo.com:60000.timeoutMonitor INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: nng2.coke.ac4.yahoo.com:60000.timeoutMonitor exiting
>
>
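
For illustration only, the behavior asked about in the quoted message (a failed close RPC to one region server should not take the master down) could be sketched roughly as below. This is not the HBASE-3617 patch and does not use HBase classes: RegionCloser, retryTimedOutRegions, and the region names are hypothetical names made up for the sketch, and it assumes that an IOException from the RPC can be treated as retryable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TolerantUnassignSketch {

    /** Hypothetical stand-in for the RPC that asks a region server to close a region. */
    interface RegionCloser {
        void closeRegion(String regionName) throws IOException;
    }

    /**
     * Sends a close request for each timed-out region. An IOException from one
     * call is recorded and the loop moves on; it is never escalated to an abort.
     * Returns the regions whose request failed so a later pass can retry them.
     */
    static List<String> retryTimedOutRegions(List<String> regions, RegionCloser closer) {
        List<String> failed = new ArrayList<>();
        for (String region : regions) {
            try {
                closer.closeRegion(region);
            } catch (IOException e) {
                // Assumption: a failed RPC is retryable, not fatal to the caller.
                failed.add(region);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated region server that resets the connection for some regions.
        RegionCloser flaky = region -> {
            if (region.startsWith("bad")) {
                throw new IOException("Connection reset by peer");
            }
        };
        List<String> failed =
            retryTimedOutRegions(Arrays.asList("good-1", "bad-1", "good-2"), flaky);
        System.out.println("Failed, will retry later: " + failed);
    }
}
```

The point of the sketch is only where the exception is handled: inside the periodic loop, per region, so that one unreachable server leaves the remaining regions and the calling process unaffected.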
