hadoop-hdfs-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
Date Fri, 11 Aug 2017 15:27:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16123483#comment-16123483 ]

Rushabh S Shah commented on HDFS-12288:

        // the load for writers is 2 because both the write xceiver & packet
        // responder threads are counted in the load
        expectedTotalLoad += fileRepl;
        expectedInServiceLoad += fileRepl;
This comment is there for a reason.
When we receive a block, two threads are created: a DataXceiver thread and a PacketResponder
thread.
If we use {{DataNodeMetrics#getDataNodeActiveXceiversCount}} as a replacement for {{activeThreadCount}},
then we also need to count {{PacketResponderThread}} in {{DataNodeMetrics#dataNodeActiveXceiversCount}}.
Otherwise each writer is reported as a load of 1 instead of 2, and we will end up creating twice
as many threads as we do today.
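
To illustrate the accounting concern, here is a minimal sketch, not Hadoop code. The class {{XceiverLoadSketch}} and its methods are hypothetical stand-ins; the only thing taken from the comment above is that each block write contributes two threads (the xceiver and its packet responder) to the load.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the DataNode's load counter; not the real DataNodeMetrics.
class XceiverLoadSketch {
    private final AtomicInteger activeThreads = new AtomicInteger();

    // Called when a thread of either kind starts or stops serving a block.
    void threadStarted() { activeThreads.incrementAndGet(); }
    void threadFinished() { activeThreads.decrementAndGet(); }

    // The value that would be reported as the load.
    int getActiveCount() { return activeThreads.get(); }

    // One block write in flight: count the xceiver AND its packet responder,
    // so the reported load stays at 2 per writer.
    void beginWrite() { threadStarted(); threadStarted(); }
    void endWrite() { threadFinished(); threadFinished(); }

    public static void main(String[] args) {
        XceiverLoadSketch load = new XceiverLoadSketch();
        load.beginWrite();
        load.beginWrite();
        load.beginWrite();
        System.out.println("load with 3 writers: " + load.getActiveCount());
        load.endWrite();
        System.out.println("load with 2 writers: " + load.getActiveCount());
    }
}
```

If only the xceiver were counted in {{beginWrite}}, the same three writers would report half the load.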

-    return threadGroup == null ? 0 : threadGroup.activeCount();
+    return metrics == null ? 0 : metrics.getDataNodeActiveXceiversCount();
We need to double-check whether there is any possibility that the datanode can start without
initializing the metrics.
Looking at the code, I think it's not possible, but we just need to make sure.
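
For reference, a small sketch of what the null guard in the changed line buys us. The {{Metrics}} interface and the {{NullGuardSketch}} class are hypothetical and only mimic the shape of the patch; whether the real metrics field can actually be null at this point is exactly the open question above.

```java
// Hypothetical sketch of the null guard; not the real DataNode class.
class NullGuardSketch {
    // Minimal stand-in for DataNodeMetrics (assumption for illustration only).
    interface Metrics {
        int getDataNodeActiveXceiversCount();
    }

    // Stays null until initMetrics is called.
    private volatile Metrics metrics;

    void initMetrics(Metrics m) { metrics = m; }

    int getXceiverCount() {
        // Read the field once so the null check and the call see the same reference.
        Metrics m = metrics;
        return m == null ? 0 : m.getDataNodeActiveXceiversCount();
    }

    public static void main(String[] args) {
        NullGuardSketch dn = new NullGuardSketch();
        System.out.println("before init: " + dn.getXceiverCount());
        dn.initMetrics(() -> 7);
        System.out.println("after init: " + dn.getXceiverCount());
    }
}
```

Before initialization the getter reports 0, so a caller cannot tell "no load" apart from "metrics not ready yet".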

> Fix DataNode's xceiver count calculation
> ----------------------------------------
>                 Key: HDFS-12288
>                 URL: https://issues.apache.org/jira/browse/HDFS-12288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>         Attachments: HDFS-12288.001.patch
> The problem with the ThreadGroup.activeCount() method is that it is only a very rough
> estimate, and in reality returns the total number of threads in the thread group as
> opposed to the threads actually running.
> In some DNs, we saw this return ~50 for a long time, even though the actual number
> of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN when choosing
> a replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, which only
> accounts for the actual number of DataXceiver threads currently running and thus represents
> the load on the DN much better.
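
The difference described in the issue can be reproduced with a small standalone sketch. This is not DataNode code: the class name, the thread group name and the {{busy}} counter are assumptions made for illustration. It starts a few threads in a {{ThreadGroup}} that sit idle on a latch, then samples both {{ThreadGroup.activeCount()}} and an explicit counter that is only incremented while a thread is doing work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: live-but-idle threads are counted by ThreadGroup.activeCount(),
// while an explicit counter only reflects threads that are actually working.
class ActiveCountSketch {
    // Returns { group.activeCount(), explicit busy count }, sampled while all workers are idle.
    static int[] sampleWhileIdle(int workers) {
        ThreadGroup group = new ThreadGroup("sketch-xceivers");
        AtomicInteger busy = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(group, () -> {
                try {
                    release.await();        // idle: alive, but not doing any work
                    busy.incrementAndGet(); // work begins
                    busy.decrementAndGet(); // work ends
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        // All workers are started and blocked on the latch at this point.
        int[] sample = { group.activeCount(), busy.get() };
        release.countDown();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sample;
    }

    public static void main(String[] args) {
        int[] s = sampleWhileIdle(3);
        System.out.println("ThreadGroup.activeCount(): " + s[0] + ", explicit busy count: " + s[1]);
    }
}
```

The thread group reports the idle threads as active, while the explicit counter stays at zero until work begins, which is the kind of gap the issue describes.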

This message was sent by Atlassian JIRA

