hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "He Xiaoqiao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14171) Performance improvement in Tailing EditLog
Date Thu, 27 Jun 2019 12:08:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874055#comment-16874055
] 

He Xiaoqiao commented on HDFS-14171:
------------------------------------

Ping [~jojochuang] [~kennethlnnn], as HDFS-14613 report this issue again, do we need backport
this patch to other branch?

> Performance improvement in Tailing EditLog
> ------------------------------------------
>
>                 Key: HDFS-14171
>                 URL: https://issues.apache.org/jira/browse/HDFS-14171
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.9.0, 3.0.0-alpha1
>            Reporter: Kenneth Yang
>            Assignee: Kenneth Yang
>            Priority: Major
>             Fix For: 2.10.0, 3.0.4, 3.1.2, 3.3.0, 3.2.1, 2.9.3
>
>         Attachments: HDFS-14171.000.patch, HDFS-14171.001.patch, HDFS-14171.002.patch,
HDFS-14171.003.patch, HDFS-14171_Call-Tree.png
>
>
> Stack:
> {code:java}
> Thread 456 (Edit log tailer):
> State: RUNNABLE
> Blocked count: 1139
> Waited count: 12
> Stack:
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getNumLiveDataNodes(DatanodeManager.java:1259)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.areThresholdsMet(BlockManagerSafeMode.java:570)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.checkSafeMode(BlockManagerSafeMode.java:213)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManagerSafeMode.adjustBlockTotals(BlockManagerSafeMode.java:265)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:1087)
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:1118)
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:1126)
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:468)
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:258)
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161)
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:892)
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321)
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410)
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414)
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
> Thread 455 (pool-16-thread-1):
> {code}
> code:
> {code:java}
> private boolean areThresholdsMet() {
>   assert namesystem.hasWriteLock();
>   int datanodeNum = blockManager.getDatanodeManager().getNumLiveDataNodes();
>   synchronized (this) {
>     return blockSafe >= blockThreshold && datanodeNum >= datanodeThreshold;
>   }
> }
> {code}
> According to the code, each time the method areThresholdsMet() is called, the value of
{color:#ff0000}datanodeNum{color} is need to be calculated.  However, in the scenario of
{color:#ff0000}datanodeThreshold{color} is equal to 0(0 is the default value of the configuration),
This expression datanodeNum >= datanodeThreshold always returns true.
> Calling the method {color:#ff0000}getNumLiveDataNodes(){color} is time consuming at a
scale of 10,000 datanode clusters. Therefore, we add the judgment condition, and only when
the datanodeThreshold is greater than 0, the datanodeNum is calculated, which improves the
perfomance greatly.
> The Call Tree graph is shown in the attached file.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org


Mime
View raw message