Date: Mon, 15 Oct 2012 20:47:05 +0000 (UTC)
From: "Eli Collins (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1923976814.47358.1350334025129.JavaMail.jiratomcat@arcas>
In-Reply-To: <1303623794.139118.1348839667808.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-3990) NN's health report has severe performance problems

    [ https://issues.apache.org/jira/browse/HDFS-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476429#comment-13476429 ]

Eli Collins commented on HDFS-3990:
-----------------------------------

Why not use the DatanodeID hostName field instead of calling and caching InetAddress#getByName in the NN? The DN has already done the lookup (modulo the tests which use dfs.datanode.hostname), and this way we don't have to worry about inconsistency between the nodeAddr and the ipAddr/hostName fields.
For sanity, the NN could do a lookup when the DN registers and compare its value to the DN-reported one.

Comments on this patch:
- In registerDatanode, why is it OK to no longer update the registration info with the reported IP?
- The comments in DatanodeManager ("Mostly called inside an RPC.".. and "Update the IP to the address of the RPC request"..) are no longer accurate after your change.

> NN's health report has severe performance problems
> --------------------------------------------------
>
>                 Key: HDFS-3990
>                 URL: https://issues.apache.org/jira/browse/HDFS-3990
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-3990.patch, HDFS-3990.patch
>
>
> The dfshealth page will place a read lock on the namespace while it does a DNS lookup for every DN. On a multi-thousand node cluster, this often results in 10s+ load times for the health page. 10 concurrent requests were found to cause 7m+ load times, during which write operations blocked.
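
To make the suggestion above concrete, here is a minimal Java sketch of the idea: do the one lookup at registration time (outside the namespace lock), compare it with the DN-reported name, and have the health report read only the stored hostName. This is not the HDFS-3990 patch or real HDFS code. The class HealthReportSketch, its Node type, the register and healthReport methods, and the injected resolver function are all hypothetical names introduced for illustration; the resolver stands in for whatever lookup the NN would actually perform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical sketch; none of these types are the real HDFS classes.
final class HealthReportSketch {
    /** Minimal stand-in for a datanode record: IP plus the hostname the DN reported. */
    static final class Node {
        final String ipAddr;
        final String hostName;
        Node(String ipAddr, String hostName) {
            this.ipAddr = ipAddr;
            this.hostName = hostName;
        }
    }

    private final ReentrantReadWriteLock namespaceLock = new ReentrantReadWriteLock();
    private final Map<String, Node> nodesByIp = new ConcurrentHashMap<>();
    // Assumed lookup: maps an address to a hostname (e.g. backed by DNS).
    private final Function<String, String> resolver;

    HealthReportSketch(Function<String, String> resolver) {
        this.resolver = resolver;
    }

    /**
     * Registration: the only place a lookup happens, and it runs before the
     * lock is taken. Returns true when the NN's own lookup agrees with the
     * DN-reported hostname; the reported name is stored either way.
     */
    boolean register(String ipAddr, String reportedHostName) {
        String resolved = resolver.apply(ipAddr); // potentially slow, no lock held
        boolean consistent = reportedHostName.equals(resolved);
        namespaceLock.writeLock().lock();
        try {
            nodesByIp.put(ipAddr, new Node(ipAddr, reportedHostName));
        } finally {
            namespaceLock.writeLock().unlock();
        }
        return consistent;
    }

    /** Health report: reads stored hostnames only, so no lookup runs under the read lock. */
    List<String> healthReport() {
        namespaceLock.readLock().lock();
        try {
            List<String> lines = new ArrayList<>();
            for (Node n : nodesByIp.values()) {
                lines.add(n.hostName + " (" + n.ipAddr + ")");
            }
            return lines;
        } finally {
            namespaceLock.readLock().unlock();
        }
    }
}
```

With this shape, the cost of the health page no longer depends on resolver latency, and a mismatch at registration can be logged or rejected as the NN sees fit; which of those is appropriate is left open here.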