hadoop-hdfs-user mailing list archives

From mail list <louis.hust...@gmail.com>
Subject Question about the QJM HA namenode
Date Wed, 03 Dec 2014 06:48:20 GMT
Hi all,

I deployed Hadoop on 3 machines:

l-hbase1.dba.dev.cn0 (namenode active and QJM)
l-hbase2.dba.dev.cn0 (namenode standby and datanode and QJM)
l-hbase3.dba.dev.cn0 (datanode and QJM)

On top of Hadoop, I deployed HBase:
l-hbase1.dba.dev.cn0 (HMaster active)
l-hbase2.dba.dev.cn0 (HMaster standby)
l-hbase3.dba.dev.cn0 (RegionServer)

I wrote a program that puts one row into HBase every second in a loop.
Then I used iptables to simulate l-hbase1.dba.dev.cn0 going offline. After that, the program
hung and could not write to HBase. After about 15 minutes, the program could write again.

A 15-minute HA failover is too long for me,
and I have no idea what causes it.

Then I checked the namenode logs on l-hbase2.dba.dev.cn0 and found many retries like the one below:
2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: l-hbase1.dba.dev.cn0/
Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10,
sleepTime=1000 MILLISECONDS) 
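
To make the retry policy named in that log line concrete, here is a minimal Python sketch of what a fixed-sleep retry loop amounts to. It only illustrates the policy's name and parameters and is not Hadoop's actual implementation. The function names, the assumption of one initial attempt plus maxRetries retries, and the `connect_timeout_seconds` parameter are all hypothetical.

```python
import time


def retry_with_fixed_sleep(attempt, max_retries, sleep_seconds, sleep=time.sleep):
    """Call attempt() until it succeeds or max_retries retries are used up.

    Assumption: one initial attempt, then up to max_retries retries,
    with the same fixed sleep before each retry.
    """
    retries = 0
    while True:
        try:
            return attempt()
        except ConnectionError:
            if retries >= max_retries:
                raise  # retries exhausted, give up
            retries += 1
            sleep(sleep_seconds)


def worst_case_seconds(max_retries, sleep_seconds, connect_timeout_seconds):
    """Upper bound for one round of the loop above when every attempt fails.

    connect_timeout_seconds is a hypothetical per-attempt blocking time,
    for example when a firewall silently drops packets.
    """
    attempts = max_retries + 1
    return attempts * connect_timeout_seconds + max_retries * sleep_seconds
```

With maxRetries=10 and sleepTime=1000 ms, the sleeps alone add up to only 10 seconds per round, so one round of this policy would not explain 15 minutes unless each attempt also blocks for a long time or the round is repeated many times.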

I also run a QJM on l-hbase1.dba.dev.cn0. Does that matter?

I am a newbie, so any ideas would be appreciated!