hadoop-mapreduce-issues mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-967) TaskTracker does not need to fully unjar job jars
Date Fri, 11 Sep 2009 05:00:57 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754003#action_12754003 ]

Vinod K V commented on MAPREDUCE-967:

bq. Now that there are changes to RunJar, for trunk, can you move RunJar to mapreduce from
bq. Sure thing. I'll take care of that when I post the patch for trunk.
Thanks! Please also make sure the new class in mapreduce is in org.apache.hadoop.mapreduce.util
and create a deprecated class in org.apache.hadoop.util that uses the functionality of org.apache.hadoop.mapreduce.util.RunJar.

bq. Do you see any use for filters here beyond a straight regex? We can express the old behaviour
as /./ and the new behavior as /^(lib|classes)\//.
Yes, that should do, I think.
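
A minimal sketch of how such a regex filter could select jar entries, assuming the pattern is applied to each entry name with "find" semantics. The class and method names here are hypothetical and are not taken from the patch; a real implementation would walk the entries of the job jar rather than a list of names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the RunJar code from the patch.
final class UnjarFilterSketch {
    // Old behaviour from the discussion: /./ matches every non-empty entry name.
    static final Pattern UNPACK_ALL = Pattern.compile(".");
    // New behaviour from the discussion: only entries under lib/ or classes/.
    static final Pattern UNPACK_LIB_AND_CLASSES = Pattern.compile("^(lib|classes)/");

    // True if the entry name matches the filter anywhere (Matcher.find semantics).
    static boolean shouldUnpack(Pattern filter, String entryName) {
        return filter.matcher(entryName).find();
    }

    // Returns the entry names that would be unpacked, in their original order.
    static List<String> select(Pattern filter, List<String> entryNames) {
        List<String> selected = new ArrayList<>();
        for (String name : entryNames) {
            if (shouldUnpack(filter, name)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "lib/dep.jar", "classes/Foo.class",
            "META-INF/MANIFEST.MF", "com/example/Bar.class");
        System.out.println(select(UNPACK_LIB_AND_CLASSES, names));
        System.out.println(select(UNPACK_ALL, names));
    }
}
```

With the second pattern, top-level class files stay inside the jar and are not written to local disk; with the first, everything is unpacked as before.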

bq. Also, I'd prefer to make this an *undocumented* configuration parameter, since I think
there is very little use for the old version and we don't want to encourage people to abuse
Agreed. Even now, it is undocumented, AFAIK. A better reason for making it configurable is
that some users may want directories other than lib or classes to be unjarred.

bq. Would you see this being used as a per-job option or a TaskTracker-scoped option?
Per-job. By the time we un-jar the job jar on the TT, the job configuration is already localized, so
it's easy to read this option just before un-jarring.
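
A rough sketch of the per-job lookup, using java.util.Properties as a stand-in for the localized job configuration so the example runs without Hadoop on the classpath. The key name and the class are hypothetical, and the default is assumed to be the lib/classes pattern discussed above.

```java
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical illustration only; Properties stands in for the job configuration.
final class PerJobUnpackPattern {
    // Hypothetical key name, chosen for this example.
    static final String UNPACK_PATTERN_KEY = "example.job.jar.unpack.pattern";
    // Assumed default: only lib/ and classes/ are unpacked.
    static final String DEFAULT_UNPACK_PATTERN = "^(lib|classes)/";

    // Reads the per-job pattern, falling back to the default when the job does not set it.
    static Pattern unpackPatternFor(Properties jobConf) {
        return Pattern.compile(
            jobConf.getProperty(UNPACK_PATTERN_KEY, DEFAULT_UNPACK_PATTERN));
    }

    public static void main(String[] args) {
        Properties defaultJob = new Properties();
        Properties legacyJob = new Properties();
        legacyJob.setProperty(UNPACK_PATTERN_KEY, "."); // restore the old unpack-everything behaviour

        String entry = "com/example/Bar.class";
        System.out.println(unpackPatternFor(defaultJob).matcher(entry).find());
        System.out.println(unpackPatternFor(legacyJob).matcher(entry).find());
    }
}
```

Because the value is read from the job's own configuration, one job can opt back into full unpacking without changing the behaviour for other jobs on the same TaskTracker.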

> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>                 Key: MAPREDUCE-967
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: mapreduce-967-branch-0.20.txt
> In practice we have seen some users submitting job jars that consist of 10,000+ classes.
Unpacking these jars into mapred.local.dir and then cleaning up after them has a significant
cost (both in wall clock and in unnecessary heavy disk utilization). This cost can be easily

