hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Question about the QJM HA namenode
Date Wed, 03 Dec 2014 08:31:26 GMT
What is your Hadoop version?

On Wed, Dec 3, 2014 at 12:55 PM, mail list <louis.hust.ml@gmail.com> wrote:
> Hi all,
>
> Attaching the log again.
>
> The failover happened at about 2014-12-03 12:01.
>
>
>
>
>
> On Dec 3, 2014, at 14:55, mail list <louis.hust.ml@gmail.com> wrote:
>
>> Sorry, I forgot the log. The failover happened at about 2014-12-03 12:01:
>>
>> <hadoop-hadoop-namenode-l-hbase2.dba.dev.cn0.log.tar.gz>
>> On Dec 3, 2014, at 14:48, mail list <louis.hust.ml@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I deployed Hadoop on 3 machines:
>>>
>>> l-hbase1.dba.dev.cn0 (namenode active and QJM)
>>> l-hbase2.dba.dev.cn0 (namenode standby and datanode and QJM)
>>> l-hbase3.dba.dev.cn0 (datanode and QJM)
>>>
>>> On top of Hadoop, I deployed HBase:
>>> l-hbase1.dba.dev.cn0 (HMaster active)
>>> l-hbase2.dba.dev.cn0 (HMaster standby)
>>> l-hbase3.dba.dev.cn0 (RegionServer)
>>>
>>>
>>> I wrote a program that puts one row into HBase every second in a loop.
>>> Then I used iptables to simulate l-hbase1.dba.dev.cn0 going offline. After that, the program hung and could not
>>> write to HBase. After about 15 minutes, the program could write again.
>>>
>>> A 15-minute HA failover is too long for me,
>>> and I have no idea what causes it.
>>>
>>> Then I checked the NameNode logs on l-hbase2.dba.dev.cn0 and found many retries like the one below:
>>> {code}
>>> 2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: l-hbase1.dba.dev.cn0/10.86.36.217:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>>> {code}
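
A rough way to read the retry policy in that log line, as a minimal sketch: it pulls maxRetries and sleepTime out of the quoted message and multiplies them to get the time spent sleeping in one connect cycle. The helper name retry_sleep_budget_seconds is hypothetical, and the figure deliberately ignores per-attempt connect timeouts and any repeated cycles, neither of which the log line shows.

```python
import re

# Log line copied from the quoted message, joined onto one logical line.
LOG_LINE = (
    "2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying connect to "
    "server: l-hbase1.dba.dev.cn0/10.86.36.217:8485. Already tried 1 time(s); "
    "retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, "
    "sleepTime=1000 MILLISECONDS)"
)

# Matches the two parameters printed by the retry policy in the log line.
POLICY_RE = re.compile(r"maxRetries=(\d+),\s*sleepTime=(\d+) MILLISECONDS")


def retry_sleep_budget_seconds(line):
    """Return maxRetries * sleepTime in seconds (sleep time only, no timeouts)."""
    match = POLICY_RE.search(line)
    if match is None:
        raise ValueError("no retry policy found in line")
    max_retries = int(match.group(1))
    sleep_ms = int(match.group(2))
    return max_retries * sleep_ms / 1000.0


if __name__ == "__main__":
    print(retry_sleep_budget_seconds(LOG_LINE))
```

For the line above this gives 10 seconds of sleeping per cycle, so the sleep interval alone is far shorter than the 15 minutes reported; the rest would have to come from elsewhere, such as connect timeouts or repeated cycles.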
>>>
>>> I also run a QJM on l-hbase1.dba.dev.cn0. Does that matter?
>>>
>>> I am a newbie, so any ideas would be appreciated!
>>
>
>



-- 
Harsh J
