Date: Tue, 25 Nov 2014 15:21:12 +0000 (UTC)
From: "Daryn Sharp (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7433) DatanodeMap lookups & DatanodeID hashCodes are inefficient

[ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224667#comment-14224667 ]

Daryn Sharp commented on HDFS-7433:
-----------------------------------

My bad mentioning {{datanodeMap}} - juggling too many changes. {{DatanodeIDs}} are added to collections in many other places, and equality checks occur often. My more general point is that mutable hashCodes are a hidden landmine, which is why I filed another jira. Dynamic computation of the xfer addr (and by extension the hash) is inefficient and generates a lot of garbage.
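To make the landmine concrete, here is a minimal sketch. It is not the {{DatanodeID}} code or the attached patch: the class names {{MutableId}} and {{StableId}}, their fields, and the address values are all hypothetical, chosen only to show what happens when a hash depends on a mutable, dynamically built address string.

```java
import java.util.HashSet;
import java.util.Set;

public class MutableHashDemo {
    // Hypothetical ID whose equality and hash depend on mutable fields.
    static final class MutableId {
        String ipAddr;
        int xferPort;

        MutableId(String ipAddr, int xferPort) {
            this.ipAddr = ipAddr;
            this.xferPort = xferPort;
        }

        // Builds a new String on every call, so each hashCode()/equals()
        // allocates garbage.
        String xferAddr() {
            return ipAddr + ":" + xferPort;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableId
                && xferAddr().equals(((MutableId) o).xferAddr());
        }

        @Override
        public int hashCode() {
            return xferAddr().hashCode();
        }
    }

    // Hypothetical alternative: identity is an immutable key, hash computed once.
    static final class StableId {
        private final String uuid;
        private final int hash;
        int xferPort; // mutable, but not part of equals/hashCode

        StableId(String uuid, int xferPort) {
            this.uuid = uuid;
            this.hash = uuid.hashCode();
            this.xferPort = xferPort;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof StableId && uuid.equals(((StableId) o).uuid);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    // True if the set can no longer find the element after its hash changed.
    static boolean lostAfterMutation() {
        Set<MutableId> ids = new HashSet<>();
        MutableId id = new MutableId("10.0.0.1", 50010);
        ids.add(id);
        id.xferPort = 50011; // hash changes while the object sits in the set
        return !ids.contains(id);
    }

    // True if the element is still found after mutating a non-identity field.
    static boolean foundAfterMutation() {
        Set<StableId> ids = new HashSet<>();
        StableId id = new StableId("dn-uuid-1", 50010);
        ids.add(id);
        id.xferPort = 50011; // identity and cached hash are unaffected
        return ids.contains(id);
    }

    public static void main(String[] args) {
        System.out.println("mutable id lost: " + lostAfterMutation());
        System.out.println("stable id found: " + foundAfterMutation());
    }
}
```

In the first case the object is still in the set but can no longer be looked up, because it was stored under its old hash. Keying on an immutable field and caching the hash, as in the second class, is one way to avoid both the lost entry and the per-call string allocation.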
I'm checking out the odd test failures. They don't appear related, at least the xml parsing and class def not founds.

> DatanodeMap lookups & DatanodeID hashCodes are inefficient
> ----------------------------------------------------------
>
>                 Key: HDFS-7433
>                 URL: https://issues.apache.org/jira/browse/HDFS-7433
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}. For many thousands of datanodes, tree lookups are ~10X more expensive than a {{HashMap}}. Insertions and removals are up to 100X more expensive.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)