Date: Wed, 20 Feb 2013 01:24:12 +0000 (UTC)
From: "Suresh Srinivas (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-4222) NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues

     [ https://issues.apache.org/jira/browse/HDFS-4222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-4222:
----------------------------------

    Affects Version/s: 1.0.0
                       2.0.0-alpha
               Status: Patch Available  (was: Open)

> NN is unresponsive and lose heartbeats of DNs when Hadoop is configured to use LDAP and LDAP has issues
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4222
>                 URL: https://issues.apache.org/jira/browse/HDFS-4222
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.0.0-alpha, 0.23.3, 1.0.0
>            Reporter: Xiaobo Peng
>            Assignee: Xiaobo Peng
>            Priority: Minor
>         Attachments: hdfs-4222-branch-0.23.3.patch, HDFS-4222.patch, hdfs-4222-release-1.0.3.patch
>
>
> For Hadoop clusters configured to access directory information through LDAP, the FSNamesystem calls made on behalf of DFS clients might hang due to LDAP issues (including LDAP access issues caused by networking issues) while holding the single FSNamesystem lock. The NN then becomes unresponsive and loses heartbeats from DNs.
> FSNamesystem calls access LDAP when they instantiate FSPermissionChecker. That instantiation can be moved out of the lock scope because it does not need the FSNamesystem lock. After the move, a hung DFS client call no longer blocks other threads by hogging the single lock. This is especially helpful when separate RPC servers are used for ClientProtocol and DatanodeProtocol, since DatanodeProtocol calls do not need to access LDAP. So even if DFS client calls hang due to LDAP issues, the NN can still process requests (including heartbeats) from DNs.
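
The lock-scope change described in the issue can be illustrated with a minimal sketch. This is not the attached patch and does not use the real HDFS classes: LockScopeSketch, PermissionChecker, Namesystem, and the groupLookup function are hypothetical stand-ins, and the single lock is assumed here to be a ReentrantReadWriteLock. The group lookup stands for the directory call that may block when LDAP has issues.

```java
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Sketch only: every name here is hypothetical, not an HDFS class. */
final class LockScopeSketch {

    /** Stand-in for a permission checker whose constructor resolves the user's groups. */
    static final class PermissionChecker {
        final String user;
        final Set<String> groups;

        PermissionChecker(String user, Function<String, Set<String>> groupLookup) {
            this.user = user;
            // Potentially slow: in the scenario from the issue this would reach LDAP.
            this.groups = groupLookup.apply(user);
        }

        boolean isMemberOf(String group) {
            return groups.contains(group);
        }
    }

    /** Stand-in for the namesystem guarded by one lock. */
    static final class Namesystem {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final Function<String, Set<String>> groupLookup;
        private long heartbeats = 0;

        Namesystem(Function<String, Set<String>> groupLookup) {
            this.groupLookup = groupLookup;
        }

        /** Problematic ordering: the lookup runs while the lock is held. */
        boolean checkAccessLookupInsideLock(String user, String requiredGroup) {
            lock.readLock().lock();
            try {
                PermissionChecker pc = new PermissionChecker(user, groupLookup);
                return pc.isMemberOf(requiredGroup);
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Proposed ordering: build the checker first, then take the lock. */
        boolean checkAccessLookupOutsideLock(String user, String requiredGroup) {
            PermissionChecker pc = new PermissionChecker(user, groupLookup);
            lock.readLock().lock();
            try {
                return pc.isMemberOf(requiredGroup);
            } finally {
                lock.readLock().unlock();
            }
        }

        /** Datanode-style call: needs the lock but never the group lookup. */
        long handleHeartbeat() {
            lock.writeLock().lock();
            try {
                return ++heartbeats;
            } finally {
                lock.writeLock().unlock();
            }
        }

        /** True if any thread currently holds the read or write lock. */
        boolean isLockHeld() {
            return lock.getReadLockCount() > 0 || lock.isWriteLocked();
        }
    }
}
```

With the second ordering, a lookup that hangs stalls only the calling thread; handleHeartbeat() can still acquire the write lock because no reader is parked inside the lock waiting on the directory service.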