hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7433) DatanodeMap lookups & DatanodeID hashCodes are inefficient
Date Tue, 25 Nov 2014 15:21:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224667#comment-14224667 ]

Daryn Sharp commented on HDFS-7433:

My bad for mentioning {{datanodeMap}} - I'm juggling too many changes.  {{DatanodeIDs}} are added
to collections in many other places, and equality checks occur often.  My more general point
is that mutable hashCodes are a hidden landmine, which is why I filed another jira.  Dynamic computation
of the xfer addr (and by extension the hash) is inefficient and generates a lot of garbage.
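
To illustrate the landmine, here is a minimal sketch. {{NodeId}} is a hypothetical stand-in, not the real {{DatanodeID}}; it only assumes the hash is derived from a transfer address that is rebuilt on each call and can change after the object is already in a collection.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a DatanodeID-like class; not the real HDFS code.
class MutableHashDemo {

    static final class NodeId {
        private String ipAddr;
        private final int xferPort;

        NodeId(String ipAddr, int xferPort) {
            this.ipAddr = ipAddr;
            this.xferPort = xferPort;
        }

        // Builds a new String on every call, so each hashCode()/equals()
        // invocation allocates short-lived garbage.
        String getXferAddr() {
            return ipAddr + ":" + xferPort;
        }

        // Mutating the address silently changes the hash code.
        void setIpAddr(String ipAddr) {
            this.ipAddr = ipAddr;
        }

        @Override
        public int hashCode() {
            return getXferAddr().hashCode();
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof NodeId
                && getXferAddr().equals(((NodeId) o).getXferAddr());
        }
    }

    // Returns whether the set can still find the node after its address changes.
    static boolean foundAfterMutation() {
        Set<NodeId> nodes = new HashSet<>();
        NodeId node = new NodeId("10.0.0.1", 50010);
        nodes.add(node);
        node.setIpAddr("10.0.0.2"); // hash changes while the node is in the set
        return nodes.contains(node);
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
    }
}
```

The set was indexed under the old hash, so the lookup with the new hash misses even though the very same object is still stored in it.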

I'm checking out the odd test failures.  They don't appear related, at least not the xml parsing
and class-def-not-found ones.

> DatanodeMap lookups & DatanodeID hashCodes are inefficient
> ----------------------------------------------------------
>                 Key: HDFS-7433
>                 URL: https://issues.apache.org/jira/browse/HDFS-7433
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7433.patch
> The datanode map is currently a {{TreeMap}}.  For many thousands of datanodes, tree lookups
> are ~10X more expensive than a {{HashMap}}.  Insertions and removals are up to 100X more expensive.
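
As a rough sketch of the comparison in the description: the class name, the string keys, and the map size below are assumptions for illustration, not the namenode's actual code. The timing is crude (no warm-up, single run) and is not expected to reproduce the 10X/100X figures quoted above; it only shows that both maps serve the same lookups while {{TreeMap}} pays O(log n) key comparisons per operation and {{HashMap}} pays an expected O(1) hash probe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; keys imitate datanode identifiers.
class DatanodeMapSketch {

    // Fills the given map with n made-up datanode keys.
    static Map<String, Integer> fill(Map<String, Integer> map, int n) {
        for (int i = 0; i < n; i++) {
            map.put("datanode-uuid-" + i, i);
        }
        return map;
    }

    // Looks up every key once and sums the values so the work is observable.
    static long lookupAll(Map<String, Integer> map, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += map.get("datanode-uuid-" + i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 10_000;
        Map<String, Integer> tree = fill(new TreeMap<>(), n);
        Map<String, Integer> hash = fill(new HashMap<>(), n);

        long t0 = System.nanoTime();
        long treeSum = lookupAll(tree, n);
        long t1 = System.nanoTime();
        long hashSum = lookupAll(hash, n);
        long t2 = System.nanoTime();

        // Both maps must return identical results; only the cost differs.
        System.out.println("same results: " + (treeSum == hashSum));
        System.out.println("TreeMap lookups ns: " + (t1 - t0));
        System.out.println("HashMap lookups ns: " + (t2 - t1));
    }
}
```

A benchmark harness such as JMH would be needed for trustworthy numbers.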

This message was sent by Atlassian JIRA
