hadoop-common-dev mailing list archives

From "Christian Kunz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2771) changing the number of reduces dramatically changes the time of the map time
Date Sun, 23 Nov 2008 04:03:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649992#action_12649992 ]

Christian Kunz commented on HADOOP-2771:
----------------------------------------

I checked a similar job on hadoop-0.18.1 with block compression of transient data turned on.

The merge-sort of the map spills still depends strongly on the number of reduces, but less
than in earlier releases.

On average, a map task took:
1hr 13min with 9000 reduces
1hr 19min with 18000 reduces
On average, 51 minutes of this time were taken up by the application, i.e. the merge-sort
of the map spills increased from 22min to 28min when the number of reduces was doubled.
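
For reference, the arithmetic behind these figures can be sketched as below. This is only an illustration: the variable names are made up here, and it assumes that all map-task time not spent in the application is merge-sort of the map spills.

```python
# Figures quoted in the comment, converted to minutes.
APPLICATION_MIN = 51  # average time spent in application code per map task

map_task_min = {
    9000: 1 * 60 + 13,   # 1hr 13min with 9000 reduces
    18000: 1 * 60 + 19,  # 1hr 19min with 18000 reduces
}

# Assumption: the remainder of each map task is merge-sort of the map spills.
merge_sort_min = {r: t - APPLICATION_MIN for r, t in map_task_min.items()}

increase_min = merge_sort_min[18000] - merge_sort_min[9000]
relative_increase = increase_min / merge_sort_min[9000]

for reduces, minutes in merge_sort_min.items():
    print(f"{reduces} reduces: {minutes} min merge-sort")
print(f"increase: {increase_min} min ({relative_increase:.0%})")
```

Under that assumption, doubling the number of reduces adds 6 minutes of merge-sort per map task, roughly a quarter more than at 9000 reduces.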

Overall, the time spent in the merge-sort of the map spills increased because of compression
(before 0.18, block compression of transient data could not be used at that scale), but the
dependence on the number of reduces decreased.

> changing the number of reduces dramatically changes the time of the map time
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2771
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2771
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.1
>            Reporter: Owen O'Malley
>             Fix For: 0.20.0
>
>
> By changing the number of reduces, the time for an individual map changes radically.
> By running the same program and data with different numbers of reduces (2500, 7500, 25000)
> the times for each map changed radically (0:50, 1:20, 5h).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

