From: mail list
Subject: Question about the QJM HA namenode
Message-Id: <1B699136-87C4-4E37-B752-9F707A9422E8@gmail.com>
Date: Wed, 3 Dec 2014 14:48:20 +0800
To: user@hadoop.apache.org

Hi all,

I deployed Hadoop on 3 machines:

l-hbase1.dba.dev.cn0 (active NameNode and QJM)
l-hbase2.dba.dev.cn0 (standby NameNode, DataNode, and QJM)
l-hbase3.dba.dev.cn0 (DataNode and QJM)

On top of Hadoop, I deployed HBase:

l-hbase1.dba.dev.cn0 (active HMaster)
l-hbase2.dba.dev.cn0 (standby HMaster)
l-hbase3.dba.dev.cn0 (RegionServer)

I wrote a program that puts one row per second into HBase in a loop.
Then I used iptables to simulate l-hbase1.dba.dev.cn0 going offline. After that, the program hung and could not write to HBase. After about 15 minutes, the program could write again.

15 minutes is far too long for an HA failover, and I have no idea what causes it.
I then checked the NameNode logs on l-hbase2.dba.dev.cn0 and found many retries like the one below:

{code}
2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: l-hbase1.dba.dev.cn0/10.86.36.217:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
{code}

One of the QJM nodes runs on l-hbase1.dba.dev.cn0, the machine I took offline. Does that matter?

I am a newbie, so any ideas will be appreciated!
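
As a rough check on the retry policy quoted in the log line above, the sketch below extracts maxRetries and sleepTime and multiplies them. It assumes the policy sleeps a fixed interval between attempts, as its name suggests, and it ignores any per-attempt connect timeout, which the log line does not show. The function name sleep_budget_seconds is made up for this illustration.

```python
import re

# Log line as quoted in the message above (quoted-printable decoded).
LOG_LINE = (
    "2014-12-03 12:13:35,165 INFO org.apache.hadoop.ipc.Client: Retrying "
    "connect to server: l-hbase1.dba.dev.cn0/10.86.36.217:8485. Already tried "
    "1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep("
    "maxRetries=10, sleepTime=1000 MILLISECONDS)"
)


def sleep_budget_seconds(line):
    """Return maxRetries * sleepTime in seconds, or None if no policy is found.

    Hypothetical helper: it only sums the sleeps between attempts and does
    not model how long each connect attempt itself takes.
    """
    match = re.search(r"maxRetries=(\d+), sleepTime=(\d+) MILLISECONDS", line)
    if match is None:
        return None
    max_retries = int(match.group(1))
    sleep_ms = int(match.group(2))
    return max_retries * sleep_ms / 1000.0


if __name__ == "__main__":
    print(sleep_budget_seconds(LOG_LINE))
```

Under these assumptions the sleeps add up to about 10 seconds per retry cycle, so this policy alone does not account for a 15-minute stall. The extra time would have to come from somewhere else, such as per-attempt connect timeouts or repeated retry cycles, and the log line by itself does not say which.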