hbase-user mailing list archives

From mail list <louis.hust...@gmail.com>
Subject Re: Slow waitForAckedSeqno took too long time
Date Fri, 05 Dec 2014 09:31:00 GMT
I also captured the RegionServer thread stacks on the region server, as below:

"RS_OPEN_META-l-hbase3:60020-0-WAL.AsyncNotifier" prio=10 tid=0x00007f7e7c259000 nid=0x3d1 in Object.wait() [0x00007f7e5eb90000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000d59e4808> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncNotifier.run(FSHLog.java:1338)
        - locked <0x00000000d59e4808> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:744)

"RS_OPEN_META-l-hbase3:60020-0-WAL.AsyncSyncer4" prio=10 tid=0x00007f7e7c257000 nid=0x3d0 in Object.wait() [0x00007f7e5ec91000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000d59e4498> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1209)
        - locked <0x00000000d59e4498> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:744)

"RS_OPEN_META-l-hbase3:60020-0-WAL.AsyncSyncer3" prio=10 tid=0x00007f7e7c255000 nid=0x3cf in Object.wait() [0x00007f7e5ed92000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000d5990570> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$AsyncSyncer.run(FSHLog.java:1209)
        - locked <0x00000000d5990570> (a java.lang.Object)
        at java.lang.Thread.run(Thread.java:744)


On Dec 5, 2014, at 13:01, mail list <louis.hust.ml@gmail.com> wrote:

> Hi all,
> 
> I deployed HBase 0.98.6-cdh5.2.0 on 3 machines:
> 
> l-hbase1.dev.dba.cn0(hadoop namenode active, HMaster active)
> l-hbase2.dev.dba.cn0(hadoop namenode standby, HMaster standby, hadoop datanode)
> l-hbase3.dev.dba.cn0(regionserver, hadoop datanode)
> 
> Then I shut down l-hbase1.dev.dba.cn0, but HBase did not work again until about 15 minutes later.
> I checked the region server’s log and found the following entries:
> 
> 2014-12-05 12:03:19,169 WARN  [regionserver60020-WAL.AsyncSyncer0] hdfs.DFSClient: Slow waitForAckedSeqno took 927762ms (threshold=30000ms)
> 2014-12-05 12:03:19,186 INFO  [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Slow sync cost: 927779 ms, current pipeline: [10.86.36.219:50010]
> 2014-12-05 12:03:19,186 DEBUG [regionserver60020.logRoller] regionserver.LogRoller: HLog roll requested
> 2014-12-05 12:03:19,236 WARN  [regionserver60020-WAL.AsyncSyncer1] hdfs.DFSClient: Slow waitForAckedSeqno took 867706ms (threshold=30000ms)
> 
> It seems the WAL async sync took too long while the region server was recovering? I don’t know whether these log entries matter.
> Can anybody explain the reason, and how to reduce the recovery time?
> 
> 
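For reference, the durations in the quoted warnings can be compared with the roughly 15-minute outage by converting them to minutes. The following is only a minimal sketch: the regular expression and the helper name `slow_ack_durations` are illustrative assumptions based on the two WARN lines quoted above, not part of HBase or HDFS. With these inputs the waits come out at about 15.5 and 14.5 minutes, which is consistent with the outage window.

```python
import re

# The two "Slow waitForAckedSeqno" warnings from the quoted log,
# written here as single lines.
LOG_LINES = [
    "2014-12-05 12:03:19,169 WARN  [regionserver60020-WAL.AsyncSyncer0] "
    "hdfs.DFSClient: Slow waitForAckedSeqno took 927762ms (threshold=30000ms)",
    "2014-12-05 12:03:19,236 WARN  [regionserver60020-WAL.AsyncSyncer1] "
    "hdfs.DFSClient: Slow waitForAckedSeqno took 867706ms (threshold=30000ms)",
]

# Hypothetical pattern, derived only from the message format seen above.
SLOW_ACK = re.compile(r"Slow waitForAckedSeqno took (\d+)ms \(threshold=(\d+)ms\)")


def slow_ack_durations(lines):
    """Return a (took_ms, threshold_ms) pair for each matching warning line."""
    durations = []
    for line in lines:
        match = SLOW_ACK.search(line)
        if match:
            durations.append((int(match.group(1)), int(match.group(2))))
    return durations


if __name__ == "__main__":
    for took_ms, threshold_ms in slow_ack_durations(LOG_LINES):
        minutes = took_ms / 60000  # milliseconds to minutes
        print(f"{took_ms} ms = {minutes:.1f} min "
              f"(threshold {threshold_ms // 1000} s)")
```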

