hbase-user mailing list archives

From "聪聪" <175998...@qq.com>
Subject Re: Re: Hbase cluster is suddenly unable to respond
Date Mon, 26 Oct 2015 13:28:58 GMT
Thank you very much!


------------------ Original Message ------------------
From: "Ted Yu";<yuzhihong@gmail.com>;
Sent: Monday, October 26, 2015, 8:28 PM
To: "user"<user@hbase.apache.org>;

Subject: Re: Re: Hbase cluster is suddenly unable to respond



The fix from HBASE-11277 may solve your problem. If you collect a stack trace during the hang,
we would have more clues.

I suggest upgrading to a newer release such as 1.1.2 or 0.98.15.

Cheers
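
A stack trace of the kind requested above is a dump of every thread in the JVM, showing where each one is blocked. For a region server that is already running, the JDK's `jstack <pid>` tool is the usual way to get one. The sketch below is only an illustration of what such a dump contains: it uses the standard `ThreadMXBean` API to dump the threads of the JVM it runs in, and the class name `ThreadDumpSketch` is hypothetical, not part of HBase.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of HBase: prints a thread dump of the
// JVM it runs in, similar in spirit to the output of `jstack <pid>`.
public class ThreadDumpSketch {

    // Builds a text dump with one block per live thread:
    // name, state, then the full stack of that thread.
    static String dumpAllThreads() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Taking several dumps a few seconds apart during the hang makes it easier to tell threads that are stuck from threads that are merely busy.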

> On Oct 26, 2015, at 12:42 AM, 聪聪 <175998806@qq.com> wrote:
> 
> Hi Ted,
> 
> 
> I am using HBase version hbase-0.96.0.
> Around 17:33, the same warning also appeared in the logs of the other region servers. I don't know
> whether that is normal. At that time the web UI could not be opened. After I restarted the region
> server, HBase went back to normal. Could this be the bug HBASE-11277?
> 
> 
> The region server logs contain almost nothing but this warning.
> The master log is as follows:
> 2015-10-21 22:15:43,575 INFO  [CatalogJanitor-l-namenode2:60000] master.CatalogJanitor: Scanned 672 catalog row(s), gc'd 0 unreferenced merged region(s) and 1 unreferenced parent region(s)
> 2015-10-23 17:47:25,617 INFO  [RpcServer.handler=28,port=60000] master.HMaster: Client=hbase//192.168.39.19 set balanceSwitch=false
> 2015-10-23 17:49:45,513 WARN  [RpcServer.handler=24,port=60000] ipc.RpcServer: (responseTooSlow): {"processingtimems":70266,"call":"ListTableDescriptorsByNamespace(org.apache.hadoop.hbase.protobuf.generated.MasterProtos$ListTableDescriptorsByNamespaceRequest)","client":"192.168.39.22:60292","starttimems":1445593715207,"queuetimems":0,"class":"HMaster","responsesize":704,"method":"ListTableDescriptorsByNamespace"}
> 2015-10-23 17:49:45,513 WARN  [RpcServer.handler=6,port=60000] ipc.RpcServer: (responseTooSlow): {"processingtimems":130525,"call":"ListTableDescriptorsByNamespace(org.apache.hadoop.hbase.protobuf.generated.MasterProtos$ListTableDescriptorsByNamespaceRequest)","client":"192.168.39.22:60286","starttimems":1445593654945,"queuetimems":0,"class":"HMaster","responsesize":704,"method":"ListTableDescriptorsByNamespace"}
> 2015-10-23 17:49:45,513 WARN  [RpcServer.handler=24,port=60000] ipc.RpcServer: RpcServer.respondercallId: 130953 service: MasterService methodName: ListTableDescriptorsByNamespace size: 48 connection: 192.168.39.22:60292: output error
> 2015-10-23 17:49:45,513 WARN  [RpcServer.handler=6,port=60000] ipc.RpcServer: RpcServer.respondercallId: 130945 service: MasterService methodName: ListTableDescriptorsByNamespace size: 48 connection: 192.168.39.22:60286: output error
> 2015-10-23 17:49:45,513 WARN  [RpcServer.handler=6,port=60000] ipc.RpcServer: RpcServer.handler=6,port=60000: caught a ClosedChannelException, this means that the server was processing a request but the client went away. The error message was: null
> 
> ------------------ Original Message ------------------
> From: "Ted Yu";<yuzhihong@gmail.com>;
> Sent: Friday, October 23, 2015, 11:39 PM
> To: "user@hbase.apache.org"<user@hbase.apache.org>;
> 
> Subject: Re: Hbase cluster is suddenly unable to respond
> 
> 
> 
> Were other region servers functioning normally around 17:33?
> 
> Which HBase release are you using?
> 
> Can you pastebin more of the region server log?
> 
> Thanks
> 
>> On Fri, Oct 23, 2015 at 8:28 AM, 聪聪 <175998806@qq.com> wrote:
>> 
>> Hi all,
>> 
>> 
>> This afternoon the whole HBase cluster suddenly became unable to respond. After
>> I restarted a region server, the cluster recovered. I don't know the
>> cause of the trouble. I hope I can get help from you.
>> 
>> 
>> The region server log is as follows:
>> 2015-10-23 17:28:49,335 INFO  [regionserver60020.logRoller] wal.FSHLog: moving old hlog file /hbase/WALs/l-hbase30.data.cn8.qunar.com,60020,1442810406218/l-hbase30.data.cn8.qunar.com%2C60020%2C1442810406218.1445580462689 whose highest sequenceid is 9071525521 to /hbase/oldWALs/l-hbase30.data.cn8.qunar.com%2C60020%2C1442810406218.1445580462689
>> 2015-10-23 17:33:31,375 WARN  [RpcServer.reader=8,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: count of bytes read: 0
>> java.io.IOException: Connection reset by peer
>>        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>>        at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2368)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1403)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:770)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:563)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:538)
>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:744)
>> 2015-10-23 17:33:31,779 WARN  [RpcServer.reader=2,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: count of bytes read: 0
>> java.io.IOException: Connection reset by peer
>>        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>>        at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2368)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1403)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:770)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:563)
>>        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:538)
>>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>        at java.lang.Thread.run(Thread.java:744)
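
The master log quoted earlier in this thread contains `(responseTooSlow)` warnings whose JSON payload carries the call duration (`processingtimems`) and the RPC name (`method`). When a log holds many such entries, pulling out just those two fields makes the slowest calls easy to spot. The following is a minimal sketch, assuming each log entry sits on a single line as it does in the log file itself. The class name `SlowRpcSketch` and the method `summarize` are hypothetical, and the regular expressions rely only on the field names visible in the quoted log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: extracts the RPC name and
// duration from a "(responseTooSlow)" log line.
public class SlowRpcSketch {

    // Field names are taken from the log lines quoted in the thread.
    private static final Pattern TIME =
        Pattern.compile("\"processingtimems\":(\\d+)");
    private static final Pattern METHOD =
        Pattern.compile("\"method\":\"([^\"]+)\"");

    // Returns "<method> took <n> ms", or null if the line is not a
    // responseTooSlow entry or lacks one of the two fields.
    static String summarize(String line) {
        if (!line.contains("(responseTooSlow)")) {
            return null;
        }
        Matcher time = TIME.matcher(line);
        Matcher method = METHOD.matcher(line);
        if (!time.find() || !method.find()) {
            return null;
        }
        long millis = Long.parseLong(time.group(1));
        return method.group(1) + " took " + millis + " ms";
    }

    public static void main(String[] args) {
        // Shortened copy of one entry from the quoted master log.
        String line = "2015-10-23 17:49:45,513 WARN  [RpcServer.handler=24,port=60000] "
            + "ipc.RpcServer: (responseTooSlow): {\"processingtimems\":70266,"
            + "\"queuetimems\":0,\"class\":\"HMaster\",\"responsesize\":704,"
            + "\"method\":\"ListTableDescriptorsByNamespace\"}";
        System.out.println(summarize(line));
    }
}
```

For the two entries in the quoted log this gives durations of 70266 ms and 130525 ms for `ListTableDescriptorsByNamespace`, that is, roughly 70 and 130 seconds spent inside the master for a call whose `queuetimems` is 0.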