hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8411) Add bytes count metrics to datanode for ECWorker
Date Fri, 09 Dec 2016 06:43:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734508#comment-15734508
] 

Kai Zheng commented on HDFS-8411:
---------------------------------

Thanks [~andrew.wang] for the ping! It's good for me to have a close look at the code and
see the very good work here.

1. In {{StripedReader}} the new variable {{bytesRead}} needs to be cleared during the loop.
We don't run into this only because our tests use file lengths smaller than a stripe.
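
Reduced to a standalone sketch (simplified names, outside {{StripedReader}} itself), the issue is the classic per-iteration counter that must be cleared on every pass:

{code}
public class CounterResetSketch {
  public static void main(String[] args) {
    long totalBytesRead = 0;
    int[] chunkSizes = {10, 20, 30};  // stand-ins for per-iteration reads

    for (int size : chunkSizes) {
      // Cleared on every pass. If this counter lived outside the loop
      // without a reset, each iteration would re-report the bytes of all
      // earlier iterations and inflate the metric.
      long bytesRead = 0;
      bytesRead += size;
      totalBytesRead += bytesRead;
    }
    System.out.println(totalBytesRead); // 60
  }
}
{code}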

2. The following tests would be good to merge to save some test time:
{noformat}
testEcTasks
testEcCodingTime
testEcBytesFullBlock
{noformat}
By the way, though inherited from the existing code, the {{Ec}} prefix isn't necessary since
the tests are already in the context of {{TestDataNodeErasureCodingMetrics}}.

3. In the following code, {{blockSize}} could be {{cellSize}} instead; otherwise the logic
(the asserts) is confusing and hard to follow.
{code}
  public void testEcBytesPartialGroup2() throws Exception {
    final int fileLen = blockSize + blockSize / 10;
    doTestForPartialGroup("/testEcBytes", fileLen, 0);
    // Add all reconstruction bytes read/write from all data nodes
    long bytesRead = 0;
    long bytesWrite = 0;
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }

    Assert.assertEquals("ecReconstructionBytesRead should be ",
        blockSize + blockSize / 10, bytesRead);
    Assert.assertEquals("ecReconstructionBytesWritten should be ",
        blockSize, bytesWrite);
  }
{code}
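
For concreteness, here is the arithmetic the asserts encode once the constant is named {{cellSize}}, as a standalone sketch (the 64 KB cell size and the read/write semantics here are assumptions for illustration, not taken from the patch):

{code}
public class CellSizeSketch {
  public static void main(String[] args) {
    final long cellSize = 64 * 1024;             // assumed cell size
    final long fileLen = cellSize + cellSize / 10;

    // Expected totals across all datanodes, per the suggested asserts:
    // reconstruction reads back the fileLen bytes of the partial group
    // and writes one full cell for the recovered block.
    long expectedBytesRead = fileLen;            // cellSize + cellSize / 10
    long expectedBytesWritten = cellSize;

    System.out.println(expectedBytesRead);       // 72089
    System.out.println(expectedBytesWritten);    // 65536
  }
}
{code}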

4. I'm happy to see this new trick to count the metrics more reliably by looking at all datanodes:
{code}
    for (DataNode dn : cluster.getDataNodes()) {
      MetricsRecordBuilder rb = getMetrics(dn.getMetrics().name());
      bytesRead += getLongCounter("EcReconstructionBytesRead", rb);
      bytesWrite += getLongCounter("EcReconstructionBytesWritten", rb);
    }
{code}
So we could get rid of the trick played in {{doTest}} of using an extra datanode. In fact
we could get rid of {{doTest}} entirely by reusing the code of {{doTestForPartialGroup}}.
We could refactor {{doTestForPartialGroup}} to cover all the file-length cases, considering
the boundaries of a cell, a block, a group, and more than one group.
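
A rough shape for the unified helper's inputs (the names and values are hypothetical, a sketch of the boundary cases rather than the actual refactoring):

{code}
public class BoundarySketch {
  /** Hypothetical file-length boundaries a unified helper could iterate over. */
  static long[] boundaryFileLengths(long cellSize, long blockSize,
                                    int dataBlocks) {
    return new long[] {
        cellSize / 2,                  // within one cell
        cellSize,                      // exactly one cell
        blockSize,                     // exactly one block
        blockSize * dataBlocks,        // a full block group
        blockSize * dataBlocks + 123   // spills into a second group
    };
  }

  public static void main(String[] args) {
    // e.g. 6 data blocks, 64 KB cells, a small 2-cell block, for illustration
    for (long len : boundaryFileLengths(64 * 1024, 2 * 64 * 1024, 6)) {
      System.out.println(len);
    }
  }
}
{code}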

I think 1, 2 and 3 are good to fix here, and 4 could be done in a follow-on issue as
it's not trivial. [~Sammi], could you proceed to help with these? Thank you!

> Add bytes count metrics to datanode for ECWorker
> ------------------------------------------------
>
>                 Key: HDFS-8411
>                 URL: https://issues.apache.org/jira/browse/HDFS-8411
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: SammiChen
>         Attachments: HDFS-8411-001.patch, HDFS-8411-002.patch, HDFS-8411-003.patch, HDFS-8411-004.patch,
HDFS-8411-005.patch, HDFS-8411-006.patch, HDFS-8411-007.patch, HDFS-8411-008.patch, HDFS-8411-009.patch,
HDFS-8411.010.patch
>
>
> This is a sub-task of HDFS-7674. It counts the amount of data read from local or remote
datanodes for decoding work, and also the amount of data written to local or remote
datanodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
