hadoop-hdfs-issues mailing list archives

From Íñigo Goiri (JIRA) <j...@apache.org>
Subject [jira] [Commented] (HDFS-14366) Improve HDFS append performance
Date Wed, 13 Mar 2019 06:01:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791343#comment-16791343
] 

Íñigo Goiri commented on HDFS-14366:
------------------------------------

+1 on  [^HDFS-14366.001.patch]. 

> Improve HDFS append performance
> -------------------------------
>
>                 Key: HDFS-14366
>                 URL: https://issues.apache.org/jira/browse/HDFS-14366
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.8.2
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>         Attachments: HDFS-14366.000.patch, HDFS-14366.001.patch, append-flamegraph.png
>
>
> In our HDFS cluster we observed that the {{append}} operation can hold the write lock
> as much as 10X longer than other write operations. By collecting a flamegraph on the namenode
> (see attachment: append-flamegraph.png), we found that most of the time in an append call is
> spent in {{getNumLiveDataNodes()}}:
> {code}
>   /** @return the number of live datanodes. */
>   public int getNumLiveDataNodes() {
>     int numLive = 0;
>     synchronized (this) {
>       for(DatanodeDescriptor dn : datanodeMap.values()) {
>         if (!isDatanodeDead(dn) ) {
>           numLive++;
>         }
>       }
>     }
>     return numLive;
>   }
> {code}
> This method synchronizes on the {{DatanodeManager}}, which is particularly expensive in
> large clusters since {{datanodeMap}} is modified in many places, such as when processing DN
> heartbeats.
> For the {{append}} operation, {{getNumLiveDataNodes()}} is invoked in {{isSufficientlyReplicated}}:
> {code}
>   /**
>    * Check if a block is replicated to at least the minimum replication.
>    */
>   public boolean isSufficientlyReplicated(BlockInfo b) {
>     // Compare against the lesser of the minReplication and number of live DNs.
>     final int replication =
>         Math.min(minReplication, getDatanodeManager().getNumLiveDataNodes());
>     return countNodes(b).liveReplicas() >= replication;
>   }
> {code}
> The way {{replication}} is calculated is suboptimal: it calls {{getNumLiveDataNodes()}}
> _every time_, even though {{minReplication}} is usually much smaller than the number of live
> datanodes.
>  
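
One way to avoid the expensive call in the common case is to compare against {{minReplication}} first and consult the live datanode count only when that comparison fails. The sketch below illustrates the idea and is not necessarily what the attached patches do. The class ReplicationCheckSketch and its IntSupplier field are hypothetical stand-ins for the block manager and for getDatanodeManager().getNumLiveDataNodes(), and the live replica count is passed in directly instead of being taken from countNodes(b).

```java
import java.util.function.IntSupplier;

/** Hypothetical stand-in for the replication check; not the Hadoop class. */
class ReplicationCheckSketch {
  private final int minReplication;
  // Assumed stand-in for getDatanodeManager().getNumLiveDataNodes(),
  // i.e. the call that takes the DatanodeManager lock.
  private final IntSupplier numLiveDataNodes;

  ReplicationCheckSketch(int minReplication, IntSupplier numLiveDataNodes) {
    this.minReplication = minReplication;
    this.numLiveDataNodes = numLiveDataNodes;
  }

  /**
   * Returns the same result as
   * liveReplicas >= Math.min(minReplication, numLiveDataNodes),
   * but evaluates the expensive supplier only when the cheap check fails.
   */
  boolean isSufficientlyReplicated(int liveReplicas) {
    // Cheap path: enough replicas to satisfy minReplication.
    if (liveReplicas >= minReplication) {
      return true;
    }
    // Slow path: fewer live datanodes than minReplication may still
    // make the block sufficiently replicated.
    return liveReplicas >= numLiveDataNodes.getAsInt();
  }
}
```

The two forms agree because x >= min(a, b) holds exactly when x >= a or x >= b. With this ordering, the live datanode count is computed only for blocks that have fewer live replicas than {{minReplication}}.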



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

