hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-989) Allow segregation of DistributedCache for maps and reduces
Date Thu, 17 Sep 2009 06:23:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756380#action_12756380 ]

Vinod K V commented on MAPREDUCE-989:

Talked with Milind offline, who explained the use case to me. So the segregation is actually
intended to optimize the run times of tasks by not downloading cache files that they do not
need.

For example, setup tasks don't need dist-cache files at all, so they will run faster if they
don't download files intended for maps/reduces. Also, for jobs that need dist-cache files only
for reduces, the maps, which may far outnumber the reduces, will run faster and the overall
job-execution time will improve.
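
To make the idea concrete, here is a rough sketch of per-task-type selection. It is not Hadoop code and not a proposed patch: the class `CacheSegregationSketch`, the `TaskType` enum, the methods `addCacheFile` and `filesToLocalize`, and the example URIs are all hypothetical names invented for illustration. It only shows the bookkeeping: each cache file is tagged with the task types that need it, and a task localizes only the files tagged for its own type.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these types are part of the Hadoop API.
class CacheSegregationSketch {

    // The kinds of task discussed in this comment.
    enum TaskType { SETUP, MAP, REDUCE }

    // Cache file URI -> task types that need the file (insertion order kept).
    private final Map<String, Set<TaskType>> audience = new LinkedHashMap<>();

    // Register a cache file for one or more task types.
    void addCacheFile(String uri, TaskType first, TaskType... rest) {
        audience.put(uri, EnumSet.of(first, rest));
    }

    // Return only the files a task of the given type would have to download.
    List<String> filesToLocalize(TaskType type) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<TaskType>> entry : audience.entrySet()) {
            if (entry.getValue().contains(type)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        CacheSegregationSketch job = new CacheSegregationSketch();
        // Example URIs are made up for this sketch.
        job.addCacheFile("hdfs:///cache/reduce-side-lookup.dat", TaskType.REDUCE);
        job.addCacheFile("hdfs:///cache/shared-dictionary.txt", TaskType.MAP, TaskType.REDUCE);

        // Setup tasks end up with an empty list, maps skip the reduce-only file.
        for (TaskType type : TaskType.values()) {
            System.out.println(type + " -> " + job.filesToLocalize(type));
        }
    }
}
```

In this sketch a setup task has nothing to download, and a map task skips the reduce-only file, which is where the run-time saving described above would come from.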

> Allow segregation of DistributedCache for maps and reduces
> ----------------------------------------------------------
>                 Key: MAPREDUCE-989
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-989
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: client
>            Reporter: Arun C Murthy
> Applications might have differing needs for files in the DistributedCache wrt maps and
> reduces. We should allow them to specify them separately.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
