hbase-user mailing list archives

From Julian Zhou <julian.z...@me.com>
Subject Long waiting loop for "Waiting for region servers count to settle" when doing HMaster failover
Date Fri, 02 Aug 2013 14:20:30 GMT
Hi Community,

While testing, I ran into this issue on HBase 0.94.3.

The cluster has 1 active HMaster, 1 backup HMaster, and 4 region servers.
I run a YCSB workload on it to load data. While the workload is running,
I manually kill -9 the active HMaster. The backup master seems to take
over the active role quickly, but then it loops on

"
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for xxx ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 0, slept for xxx ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
...
...
...
<for about 5 - 7 mins looping on this log message>
...

INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 1, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.

INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 2, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 3, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.
INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region
servers count to settle; currently checked in 4, slept for 0 ms,
expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms,
interval of 1500 ms.

"
There is always a loop of about 5 - 7 minutes on the above message while
the new active master waits for region servers to check in. Then, after
this long wait, all 4 region servers suddenly check in successfully.
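
To make the parameters in the log message concrete, here is a minimal
sketch of the kind of settle condition they suggest. It is an assumption
inferred only from the fields printed above (minimum, maximum, timeout,
interval), not the actual ServerManager code in 0.94.3. The class name
SettleCheck and the method keepWaiting are hypothetical.

```java
// Hypothetical model of a "wait for region servers to settle" check.
// Assumption: the master keeps waiting while the maximum has not been
// reached AND any one of these still holds:
//   - the checked-in count changed less than one interval ago,
//   - the total time slept is still below the timeout,
//   - fewer than the minimum number of servers have checked in.
class SettleCheck {

    static boolean keepWaiting(int checkedIn, long sleptMs, long sinceLastChangeMs,
                               int minToStart, int maxToStart,
                               long timeoutMs, long intervalMs) {
        boolean countStillChanging = sinceLastChangeMs < intervalMs;
        boolean timeoutNotReached = sleptMs < timeoutMs;
        boolean belowMinimum = checkedIn < minToStart;
        return checkedIn < maxToStart
                && (countStillChanging || timeoutNotReached || belowMinimum);
    }

    public static void main(String[] args) {
        int min = 1;
        int max = Integer.MAX_VALUE; // 2147483647 in the log
        long timeout = 4500L;
        long interval = 1500L;

        // Zero servers checked in after several minutes: below the minimum.
        System.out.println(keepWaiting(0, 300_000L, 300_000L, min, max, timeout, interval));

        // Four servers checked in, count stable, timeout elapsed.
        System.out.println(keepWaiting(4, 4500L, 1500L, min, max, timeout, interval));
    }
}
```

Under this model, a minimum of 1 means the loop cannot end while the
count is 0, however long it has slept. That would match the messages
above, but it does not explain why the region servers take 5 - 7 minutes
to check in.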

Any idea what causes this waiting loop? Thanks a lot for any advice.


-- Best Regards, Julian
