hadoop-mapreduce-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5288) ResourceEstimator#getEstimatedTotalMapOutputSize suffers from divide by zero issues
Date Wed, 05 Jun 2013 03:17:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675556#comment-13675556 ]

Harsh J commented on MAPREDUCE-5288:
------------------------------------

Hi Karthik,

The estimate gates scheduling, so it's kind of critical. We hold back on scheduling if the
estimate (which is incorrect here) is higher than the node's reported free disk space. This
ends up blocking all map tasks from getting scheduled.
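
To illustrate why a bogus estimate stalls everything, here is a minimal sketch of that kind of
hold-back check. The class and method names are hypothetical and this is not the actual MRv1
scheduler code; it only assumes the comparison described above (estimate versus free disk space).

```java
// Hypothetical sketch; names are illustrative, not the real JobTracker/TaskTracker code.
class SchedulingCheckSketch {
    /** Returns true if a map task may be scheduled on a node with the given free space. */
    static boolean canSchedule(long estimatedOutputBytes, long nodeFreeBytes) {
        return estimatedOutputBytes <= nodeFreeBytes;
    }

    public static void main(String[] args) {
        long tenTiB = 10L * 1024 * 1024 * 1024 * 1024; // a node with 10 TiB free
        System.out.println(canSchedule(1024L, tenTiB));          // sane estimate: true
        System.out.println(canSchedule(Long.MAX_VALUE, tenTiB)); // bogus estimate: false
    }
}
```

With an estimate of Long.MAX_VALUE, no node can ever report enough free space, so under this
assumption every map task is held back.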

I think what [~azuryy] is pointing out is that we should not try to estimate unless the number
of completed maps is greater than 0. The code he points to says this should never happen, but
we did see it happen in a report on the mailing list.
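
A guard along those lines could look like the sketch below. This is not the committed fix: the
class name, the extra completedMaps parameter, and the choice of 0 as the "no estimate yet"
fallback are all assumptions made for illustration. Only the formula itself is taken from the
issue description.

```java
// Hypothetical sketch of a guarded estimate; not the actual ResourceEstimator code.
class GuardedEstimateSketch {
    // Assumed fallback meaning "no estimate available yet".
    static final long NO_ESTIMATE = 0L;

    static long estimateTotalMapOutputSize(long inputSize,
                                           long completedMapsInputSize,
                                           long completedMapsOutputSize,
                                           int completedMaps) {
        // Skip the computation when there is nothing to extrapolate from,
        // which also avoids dividing by zero.
        if (completedMaps <= 0 || completedMapsInputSize <= 0) {
            return NO_ESTIMATE;
        }
        return Math.round(((double) inputSize
            * completedMapsOutputSize * 2.0) / completedMapsInputSize);
    }

    public static void main(String[] args) {
        // No usable completed-map input size: falls back instead of Long.MAX_VALUE.
        System.out.println(estimateTotalMapOutputSize(602L, 0L, 1L, 1));
        // Normal case: 602 * 50 * 2 / 100
        System.out.println(estimateTotalMapOutputSize(602L, 100L, 50L, 1));
    }
}
```

Whether the fallback should be 0 or some other sentinel depends on how callers treat the
estimate, so take that part as a placeholder.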
                
> ResourceEstimator#getEstimatedTotalMapOutputSize suffers from divide by zero issues
> -----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5288
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5288
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.2.0
>            Reporter: Harsh J
>            Assignee: Karthik Kambatla
>
> The computation in the above mentioned class-method is below:
> {code}
>       long estimate = Math.round(((double)inputSize * 
>           completedMapsOutputSize * 2.0)/completedMapsInputSize);
> {code}
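> Given http://docs.oracle.com/javase/6/docs/api/java/lang/Math.html#round(double), it's
> possible that the returned estimate could be Long.MAX_VALUE if completedMapsInputSize is
> determined to be zero. The division is done in double arithmetic, so a positive numerator
> divided by zero yields positive infinity instead of throwing, and Math.round maps positive
> infinity to Long.MAX_VALUE.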
> This can be proven with a simple code snippet:
> {code}
> class Foo {
>     public static void main(String... args) {
>         long inputSize = 600L + 2;
>         long estimate = Math.round(((double)inputSize *
>                               1L * 2.0)/0L);
>         System.out.println(estimate);
>     }
> }
> {code}
> The above conveniently prints out: {{9223372036854775807}}, which is Long.MAX_VALUE (or
> 8 Exbibytes per MapReduce).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
