hadoop-hdfs-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12288) Fix DataNode's xceiver count calculation
Date Sat, 12 Aug 2017 02:51:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16124420#comment-16124420 ]

Rushabh S Shah commented on HDFS-12288:

bq. It does not represent number of active threads in the process either.
I agree that the calculation is broken, and I am +1 for fixing it.
My only point is that we can't change what the conf key {{DFS_DATANODE_MAX_RECEIVER_THREADS_DEFAULT}} actually signifies, because many admins have already bumped this value up or down.
So we have to count 2 threads for each block receiver to maintain backwards compatibility.
If you want to have a metric for DataXceiver threads, then I am all in for fixing that.
But then we have to rename the following method. Let's call it {{getActiveNumberOfThreads}}; it will return the sum of {{DataXceiver}}, {{PacketResponder}} and {{BlockRecoveryWorker}} threads.
  /** Number of concurrent xceivers per node. */
  @Override // DataNodeMXBean
  public int getXceiverCount() {
    return threadGroup == null ? 0 : threadGroup.activeCount();
  }
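
To make the split between the metric and the limit concrete, here is a minimal sketch of how the two values could be kept apart. It is only an illustration: the class {{ActiveThreadCounters}} and its start/finish hooks are hypothetical and do not exist in the DataNode code; only the method names {{getXceiverCount}} and {{getActiveNumberOfThreads}} come from the discussion above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch only; this class is not part of the DataNode source. */
class ActiveThreadCounters {
  private final AtomicInteger dataXceivers = new AtomicInteger();
  private final AtomicInteger packetResponders = new AtomicInteger();
  private final AtomicInteger blockRecoveryWorkers = new AtomicInteger();

  // Assumed hooks: each thread type bumps its own counter when it
  // starts working and decrements it when it finishes.
  void dataXceiverStarted() { dataXceivers.incrementAndGet(); }
  void dataXceiverFinished() { dataXceivers.decrementAndGet(); }
  void packetResponderStarted() { packetResponders.incrementAndGet(); }
  void packetResponderFinished() { packetResponders.decrementAndGet(); }
  void blockRecoveryWorkerStarted() { blockRecoveryWorkers.incrementAndGet(); }
  void blockRecoveryWorkerFinished() { blockRecoveryWorkers.decrementAndGet(); }

  /** Metric: DataXceiver threads only. */
  int getXceiverCount() {
    return dataXceivers.get();
  }

  /** Value to compare against the configured limit: all three thread types. */
  int getActiveNumberOfThreads() {
    return dataXceivers.get() + packetResponders.get() + blockRecoveryWorkers.get();
  }
}
```

With one write in flight (one DataXceiver plus one PacketResponder) and one read (one DataXceiver), this sketch reports an xceiver count of 2 and an active thread count of 3, so the write still costs 2 threads against the limit.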

The limit check would change as follows.
Old code:
    int curXceiverCount = datanode.getXceiverCount();

    if (curXceiverCount > maxXceiverCount) {
      throw new IOException("Xceiver count " + curXceiverCount
          + " exceeds the limit of concurrent xcievers: "
          + maxXceiverCount);
    }

After the change, it should look like:
    int totalNumActiveThreads = datanode.getActiveNumberOfThreads();

    if (totalNumActiveThreads > maxXceiverCount) {
      throw new IOException("Total number of active threads count " + totalNumActiveThreads
          + " exceeds the limit of configured count: "
          + maxXceiverCount);
    }
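
As a rough, self-contained illustration of that check (not the actual patch), the sketch below pulls the comparison into a hypothetical helper, {{ThreadLimitCheck.checkLimit}}. The helper name and its parameters are assumptions made for the example; the message text is the one proposed above.

```java
import java.io.IOException;

/** Hypothetical helper; the real check lives inline in the DataNode code. */
class ThreadLimitCheck {
  /**
   * Throws if the total number of active threads exceeds the configured limit.
   * Both arguments are plain ints here; in the DataNode they would come from
   * the thread counters and from the configuration respectively.
   */
  static void checkLimit(int totalNumActiveThreads, int maxConfigured)
      throws IOException {
    if (totalNumActiveThreads > maxConfigured) {
      throw new IOException("Total number of active threads count "
          + totalNumActiveThreads
          + " exceeds the limit of configured count: "
          + maxConfigured);
    }
  }
}
```

For example, 3 writes (each a DataXceiver plus a PacketResponder, so 6 threads) and 2 reads give a total of 8. {{checkLimit(8, 8)}} passes and {{checkLimit(8, 7)}} throws, which matches how admins have sized the limit so far.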

Hope it makes sense.

> Fix DataNode's xceiver count calculation
> ----------------------------------------
>                 Key: HDFS-12288
>                 URL: https://issues.apache.org/jira/browse/HDFS-12288
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>            Reporter: Lukas Majercak
>            Assignee: Lukas Majercak
>         Attachments: HDFS-12288.001.patch, HDFS-12288.002.patch
> The problem with the ThreadGroup.activeCount() method is that it is only a very rough estimate, and in reality returns the total number of threads in the thread group as opposed to the threads actually running.
> In some DNs, we saw it return ~50 for a long time, even though the actual number of DataXceiver threads was next to none.
> This is a big issue as we use the xceiverCount to make decisions on the NN for choosing the replication source DN or returning DNs to clients for R/W.
> The plan is to reuse the DataNodeMetrics.dataNodeActiveXceiversCount value, which only accounts for the actual number of DataXceiver threads currently running and thus represents the load on the DN much better.
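
The gap described in the issue can be reproduced outside Hadoop with a small sketch. Everything in it is made up for the demonstration: the class {{ActiveCountDemo}}, the group name and the {{busy}} counter are assumptions, and {{busy}} merely stands in for an explicit counter such as dataNodeActiveXceiversCount. The threads park on a latch, so they are alive but do no work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo; not taken from the DataNode source. */
class ActiveCountDemo {
  /**
   * Starts idle threads in a fresh ThreadGroup and returns
   * { group.activeCount(), busy counter } sampled while they are all parked.
   */
  static int[] sample(int idleThreads) throws InterruptedException {
    ThreadGroup group = new ThreadGroup("demoXceiverGroup");
    // Would be incremented only around real work; no thread in this demo
    // ever does any work, so it stays at 0.
    AtomicInteger busy = new AtomicInteger();
    CountDownLatch release = new CountDownLatch(1);

    Thread[] threads = new Thread[idleThreads];
    for (int i = 0; i < idleThreads; i++) {
      threads[i] = new Thread(group, () -> {
        try {
          release.await(); // alive, but idle
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }, "idle-" + i);
      threads[i].setDaemon(true);
      threads[i].start();
    }

    int[] result = { group.activeCount(), busy.get() };

    // Let the idle threads exit and wait for them before returning.
    release.countDown();
    for (Thread t : threads) {
      t.join();
    }
    return result;
  }
}
```

Calling {{ActiveCountDemo.sample(5)}} should show the thread-group estimate well above the explicit counter, which is the same shape as the "~50 versus next to none" observation above.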

