hbase-user mailing list archives

From "sunweiwei" <su...@asiainfo-linkage.com>
Subject Re: meta server hungs ?
Date Mon, 05 May 2014 09:39:04 GMT
And this is the client log.

2014-04-29 13:53:57,271 WARN [main] org.apache.hadoop.hbase.client.ScannerCallable: Ignore,
probably already closed
java.net.SocketTimeoutException: Call to hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException:
60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected
local=/192.168.1.102:56473 remote=hadoop77/192.168.1.87:60020]
	at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1475)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1450)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1650)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1708)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:27332)
	at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:284)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:152)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:57)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94)
	at org.apache.hadoop.hbase.client.ClientScanner.close(ClientScanner.java:462)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:187)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:1095)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1155)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1047)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1004)
	at org.apache.hadoop.hbase.client.AsyncProcess.findDestLocation(AsyncProcess.java:330)
	at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:281)
	at org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:917)
	at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:901)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:863)

-----Original Message-----
From: sunweiwei [mailto:sunww@asiainfo-linkage.com]
Sent: May 5, 2014 17:23
To: user@hbase.apache.org
Subject: Re: meta server hungs ?

Thank you for the reply.
I found these logs on hadoop77/192.168.1.87. It seems the regionserver hosting meta received
the HMaster's message and shut itself down.
2014-04-29 15:32:28,364 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region
server hadoop77,60020,1396606457005: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT
rejected; currently processing hadoop77,60020,1396606457005 as dead server
        at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:339)


And this is the GC log:
2014-04-29T15:32:27.159+0800: 2150297.866: [GC 2150297.866: [ParNew: 449091K->52416K(471872K),
0.0411300 secs] 11582287K->11199419K(16724800K), 0.0413430 secs] [Times: user=0.00 sys=0.00,
real=0.04 secs]
2014-04-29T15:32:28.160+0800: 2150298.867: [GC 2150298.867: [ParNew: 471859K->19313K(471872K),
0.0222250 secs] 11618863K->11175232K(16724800K), 0.0224050 secs] [Times: user=0.00 sys=0.00,
real=0.02 secs]
2014-04-29T15:32:29.063+0800: 2150299.769: [GC 2150299.769: [ParNew: 438769K->38887K(471872K),
0.0242330 secs] 11594688K->11194807K(16724800K), 0.0243580 secs] [Times: user=0.00 sys=0.00,
real=0.03 secs]
2014-04-29T15:32:29.861+0800: 2150300.568: [GC 2150300.568: [ParNew: 458343K->18757K(471872K),
0.0242790 secs] 11614263K->11180844K(16724800K), 0.0244340 secs] [Times: user=0.00 sys=0.00,
real=0.03 secs]
2014-04-29T15:32:31.608+0800: 2150302.314: [GC 2150302.314: [ParNew: 438213K->4874K(471872K),
0.0221520 secs] 11600300K->11166960K(16724800K), 0.0222970 secs] [Times: user=0.00 sys=0.00,
real=0.02 secs]
Heap
 par new generation   total 471872K, used 335578K [0x00000003fae00000, 0x000000041ae00000,
0x000000041ae00000)
  eden space 419456K,  78% used [0x00000003fae00000, 0x000000040f0f41c8, 0x00000004147a0000)
  from space 52416K,   9% used [0x0000000417ad0000, 0x0000000417f928e0, 0x000000041ae00000)
  to   space 52416K,   0% used [0x00000004147a0000, 0x00000004147a0000, 0x0000000417ad0000)
 concurrent mark-sweep generation total 16252928K, used 11162086K [0x000000041ae00000, 0x00000007fae00000,
0x00000007fae00000)
 concurrent-mark-sweep perm gen total 81072K, used 48660K [0x00000007fae00000, 0x00000007ffd2c000,
0x0000000800000000)
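
One way to check whether GC pauses could account for a 60-second RPC timeout is to scan the GC log for long collections. The following is a minimal Python sketch, assuming each ParNew entry sits on a single line in the format shown above (the entries in this message are wrapped by the mail client). The names `parse_parnew` and `long_pauses` and the 1-second default threshold are hypothetical, chosen only for illustration.

```python
import re

# Matches one ParNew (young generation) collection line in the format shown
# in the GC log above. Assumption: the whole entry is on a single line.
PARNEW_RE = re.compile(
    r"^(?P<timestamp>\S+): [\d.]+: \[GC [\d.]+: \[ParNew: "
    r"\d+K->\d+K\(\d+K\), [\d.]+ secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\((?P<heap_total>\d+)K\), "
    r"(?P<pause>[\d.]+) secs\]"
)


def parse_parnew(line):
    """Return a dict describing one ParNew event, or None if the line differs."""
    match = PARNEW_RE.match(line)
    if match is None:
        return None
    return {
        "timestamp": match.group("timestamp"),
        "heap_after_kb": int(match.group("heap_after")),
        "heap_total_kb": int(match.group("heap_total")),
        "pause_secs": float(match.group("pause")),
    }


def long_pauses(lines, threshold_secs=1.0):
    """Return the parsed events whose total pause is at least threshold_secs."""
    events = (parse_parnew(line) for line in lines)
    return [e for e in events if e is not None and e["pause_secs"] >= threshold_secs]


# Two entries copied from the log above, joined back onto single lines.
SAMPLE = [
    "2014-04-29T15:32:27.159+0800: 2150297.866: [GC 2150297.866: "
    "[ParNew: 449091K->52416K(471872K), 0.0411300 secs] "
    "11582287K->11199419K(16724800K), 0.0413430 secs] "
    "[Times: user=0.00 sys=0.00, real=0.04 secs]",
    "2014-04-29T15:32:28.160+0800: 2150298.867: [GC 2150298.867: "
    "[ParNew: 471859K->19313K(471872K), 0.0222250 secs] "
    "11618863K->11175232K(16724800K), 0.0224050 secs] "
    "[Times: user=0.00 sys=0.00, real=0.02 secs]",
]

if __name__ == "__main__":
    for event in long_pauses(SAMPLE):
        print(event["timestamp"], event["pause_secs"])
```

With the two sample entries, no pause reaches the 1-second threshold, so nothing is printed. This excerpt covers only a few seconds around 15:32, so the same scan would need to be run over the log around 13:53, when the client timeouts occurred, and it does not cover CMS or full collections.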



-----Original Message-----
From: Samir Ahmic [mailto:ahmic.samir@gmail.com]
Sent: May 5, 2014 16:50
To: user@hbase.apache.org
Cc: sunweiwei
Subject: Re: meta server hungs ?

Hi,
This exception:
****
exception=java.net.SocketTimeoutException: Call to
hadoop77/192.168.1.87:60020 failed because java.net.SocketTimeoutException:
60000 millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/192.168.1.123:33117
remote=hadoop77/192.168.1.87:60020]
*****
shows that there is a connection timeout between the master server and the
regionserver (hadoop77/192.168.1.87:60020) that is hosting the 'meta' table.
The real question is what is causing this timeout. In my experience, a few
things can cause this type of timeout. I would suggest that you check garbage
collection, memory, network, CPU, and disks on
hadoop77/192.168.1.87 <http://192.168.1.87:60020/>, and I'm sure you will find
the cause of the timeout. You can use diagnostic tools like vmstat, sar, and
iostat to check your system, and jstat to check GC and other JVM statistics.

Regards
Samir




On Mon, May 5, 2014 at 10:14 AM, sunweiwei <sunww@asiainfo-linkage.com> wrote:

> Hi
>
> I'm using HBase 0.96.0.
>
> I found that the client suddenly couldn't put data and the HMaster hung. Then
> I shut down the HMaster and started a new one, and the client went back to normal.
>
>
>
> I found these logs in the new HMaster. It seems the meta server hung and
> the HMaster stopped the meta server.
>
> 2014-04-29 15:32:21,530 INFO  [master:hadoop1:60000]
> catalog.CatalogTracker:
> Failed verification of hbase:meta,,1 at
> address=hadoop77,60020,1396606457005,
> exception=java.net.SocketTimeoutException: Call to
> hadoop77/192.168.1.87:60020 failed because
> java.net.SocketTimeoutException:
> 60000 millis timeout while waiting for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.1.123:33117
> remote=hadoop77/192.168.1.87:60020]
>
> 2014-04-29 15:32:21,532 INFO  [master:hadoop1:60000] master.HMaster:
> Forcing
> expire of hadoop77,60020,1396606457005
>
>
>
> I can't find out why the meta server hung. I found this in the meta server log:
>
> 2014-04-29 13:53:55,637 INFO  [regionserver60020.leaseChecker]
> regionserver.HRegionServer: Scanner 8206938292079629452 lease expired on
> region hbase:meta,,1.1588230740
>
> 2014-04-29 13:53:56,632 INFO  [regionserver60020.leaseChecker]
> regionserver.HRegionServer: Scanner 1111451530521284267 lease expired on
> region hbase:meta,,1.1588230740
>
> 2014-04-29 13:53:56,733 INFO  [regionserver60020.leaseChecker]
> regionserver.HRegionServer: Scanner 516152687416913803 lease expired on
> region hbase:meta,,1.1588230740
>
> 2014-04-29 13:53:56,733 INFO  [regionserver60020.leaseChecker]
> regionserver.HRegionServer: Scanner -2651411216936596082 lease expired on
> region hbase:meta,,1.1588230740
>
>
>
>
>
> Any suggestions will be appreciated. Thanks.
>
>

