hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5873) Measure bw of a single copy call and display the correct aggregated bw
Date Fri, 03 Oct 2014 21:22:34 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14158540#comment-14158540

Jason Lowe commented on MAPREDUCE-5873:

[~jrottinghuis], agree the current method of computing bandwidth is broken.

Thanks for the patch, [~l201514].  The patch has gone stale, can you please rebase on trunk?
 Other comments:

Why is the following change necessary?  I don't see how mapOutput can be null, but maybe I'm
missing something:
     if (!finishedMaps[mapIndex]) {
-      output.commit();
+      if (output != null) {
+        output.commit();
+      }

We should provide a sizing hint to the ArrayList constructor during computeTotalCopyMills
since we know worst-case how big it should be (intervals.size()+1) and we can avoid reallocation/copy
during add.

In the test, the comment doesn't match the code in a few places:
      //100MB from 50s to 100s
      bytes = (long)100 * 1024 * 1024;
      scheduler.copySucceeded(attemptID3, new MapHost(null, null), bytes, 200000, 300000,
      //100MB from 50s to 100s
      bytes = (long)100 * 1024 * 1024;
      scheduler.copySucceeded(attemptID4, new MapHost(null, null), bytes, 100000, 150000,

Would like to see the unit test exercise all conditions of the interval adding code, such
as adding a large overlapping interval (e.g.: 0-500s) late.

Nit: Not sure the intervals.size() == 0 check is worth the extra code, especially if we just
initialize intervals to Collections.emptyList().

Nit: it would be nice to be consistent about Mills vs Millis.  I prefer Millis myself, as
it's used elsewhere in the code, or maybe CopyMillsTracker should just be CopyTimeTracker
with a getCopyMillis method.

Nit: The name TestMultipleTestSucceeded didn't convey to me what it's actually testing.  Maybe
TestAggregatedTransferRate would be more descriptive?

> Measure bw of a single copy call and display the correct aggregated bw
> ----------------------------------------------------------------------
>                 Key: MAPREDUCE-5873
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5873
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.3.0
>            Reporter: Siqi Li
>            Assignee: Siqi Li
>         Attachments: MAPREDUCE-5873.v1.patch, MAPREDUCE-5873.v2.patch, MAPREDUCE-5873.v3.patch
> Currently ShuffleScheduler in ReduceTask JVM status displays bandwidth. Its definition
however is confusing because it captures the time where there is no copying because there
is a pause between when new wave of map outputs is available.
> current bw is definded as (bytes copied so far) / (total time in the copy phase so far)
> It would be more useful 
> 1) to measure bandwidth of a single copy call.
> 2) display aggregated bw as long as there is at least one fetcher is in the copy call.

This message was sent by Atlassian JIRA

View raw message