hadoop-hdfs-issues mailing list archives

From "Hanisha Koneru (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
Date Fri, 11 Aug 2017 18:25:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123794#comment-16123794 ]

Hanisha Koneru commented on HDFS-12288:

Thanks [~shahrs87] for the review.
[~lukmajercak], {{DataNode#threadGroup}} does not contain {{DataXceiver}} threads alone.
It also holds the following daemons:
- {{DataXceiverServer}}
- {{BlockRecoveryWorker#recoverBlocks()}}
- {{BlockReceiver#PacketResponder}}

So if we change {{DataNodeMetrics#getDataNodeActiveXceiversCount}} to reflect only the
DataXceiver thread count, we lose the count of the other active threads in the thread group.
I would say we need three metrics to capture all the thread counts:
- {{dataNodeActiveXceiversCount}} for active DataXceiver threads
- {{dataNodePacketResponderCount}} for active PacketResponder threads
- {{dataNodeActiveThreadCount}} for all the active threads in the datanode.
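
To illustrate the split proposed above, here is a minimal standalone sketch, not DataNode code. It assumes the threads in a group can be told apart by a name prefix; the class {{ThreadGroupBreakdown}}, the helpers {{countByPrefix}} and {{startParked}}, the group name and the thread names are all hypothetical. It enumerates one thread group and reports a count per kind of thread plus the overall total.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;

public class ThreadGroupBreakdown {

    /** Counts live threads in the group, keyed by the part of the thread name before the first '-'. */
    static Map<String, Integer> countByPrefix(ThreadGroup group) {
        // activeCount() is only an estimate, so over-allocate and rely on the count enumerate() returns.
        Thread[] threads = new Thread[group.activeCount() * 2 + 8];
        int n = group.enumerate(threads, true);
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            String name = threads[i].getName();
            int dash = name.indexOf('-');
            counts.merge(dash < 0 ? name : name.substring(0, dash), 1, Integer::sum);
        }
        return counts;
    }

    /** Starts daemon threads named prefix-0, prefix-1, ... that stay parked until release opens. */
    static void startParked(ThreadGroup group, String prefix, int count, CountDownLatch release) {
        CountDownLatch started = new CountDownLatch(count);
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(group, () -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, prefix + "-" + i);
            t.setDaemon(true);
            t.start();
        }
        try {
            started.await(); // return only once every thread is running
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadGroup group = new ThreadGroup("dataNode"); // hypothetical group name
        CountDownLatch release = new CountDownLatch(1);
        startParked(group, "DataXceiver", 2, release);
        startParked(group, "PacketResponder", 3, release);
        startParked(group, "DataXceiverServer", 1, release);

        Map<String, Integer> counts = countByPrefix(group);
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println("per-kind counts: " + counts);
        System.out.println("all threads:     " + total);
        release.countDown();
    }
}
```

A real implementation would more likely keep explicit counters in the metrics class than classify threads by name; the sketch only shows that one group-wide number hides the per-kind breakdown.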

 [~shahrs87], please correct me if I am wrong.

> Fix DataNode's xceiver count calculation
> ----------------------------------------
>                 Key: HDFS-12288
>                 URL: https://issues.apache.org/jira/browse/HDFS-12288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>         Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch
> The problem with the ThreadGroup.activeCount() method is that the method is only a very
> rough estimate, and in reality returns the total number of threads in the thread group as
> opposed to the threads actually running.
> In some DNs, we saw this return about 50 for a long time, even though the actual number
> of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN for choosing
> the replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, which only
> accounts for the actual number of DataXceiver threads currently running and thus represents
> the load on the DN much better.
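
The gap described in the issue can be reproduced outside Hadoop with a small sketch. It assumes an explicit counter that is raised only while a unit of work is in progress; the class {{XceiverLoadSketch}}, the field {{activeXceivers}} and the helpers {{serve}} and {{startIdle}} are hypothetical stand-ins, not the DataNodeMetrics API. Idle but live threads still show up in {{ThreadGroup.activeCount()}}, while the explicit counter stays at zero until work actually runs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class XceiverLoadSketch {

    // Hypothetical stand-in for a metric such as dataNodeActiveXceiversCount.
    static final AtomicInteger activeXceivers = new AtomicInteger();

    /** Runs one unit of work and counts it only while it is in progress. */
    static void serve(Runnable work) {
        activeXceivers.incrementAndGet();
        try {
            work.run();
        } finally {
            activeXceivers.decrementAndGet();
        }
    }

    /** Starts n daemon threads in the group that are alive but do no work until release opens. */
    static void startIdle(ThreadGroup group, int n, CountDownLatch release) {
        CountDownLatch started = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(group, () -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "idle-" + i);
            t.setDaemon(true);
            t.start();
        }
        try {
            started.await(); // return only once every thread is running
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ThreadGroup group = new ThreadGroup("dataNode"); // hypothetical group name
        CountDownLatch release = new CountDownLatch(1);
        startIdle(group, 5, release);

        // Idle threads are live, so the group-based estimate includes them.
        System.out.println("ThreadGroup.activeCount(): " + group.activeCount());
        System.out.println("explicit counter (idle):   " + activeXceivers.get());
        serve(() -> System.out.println("explicit counter (busy):   " + activeXceivers.get()));
        release.countDown();
    }
}
```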
