From: S Naik
Date: Wed, 23 Jan 2013 17:35:43 -0800
Subject: cdh4 HA fencing fails when the other node is down
To: user@hadoop.apache.org

Hi,

I am trying to set up an HA NameNode using CDH4 and ZKFC. Failover works fine when I kill -9 the active NameNode. But if I reboot or shut down the host running the active NameNode, failover fails. The ZKFC complains that fencing was not successful, and it reports a "no route to host" exception. Is this expected?

I searched the mailing list. It seems that the suggested fix is to move away from ZKFC and use quorum-based automatic failover. But this should be a pretty common requirement, and I would expect there to be a solution for this scenario with ZKFC.

Please point me to a solution.

-Sagar