hadoop-mapreduce-issues mailing list archives

From "Greg Roelofs (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1220) Implement an in-cluster LocalJobRunner
Date Tue, 08 Mar 2011 16:49:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004052#comment-13004052 ]

Greg Roelofs commented on MAPREDUCE-1220:
-----------------------------------------

Status: basically functional; I believe all otherwise-passing unit tests still pass. Unfortunately,
because of the duration over which patches were committed (and intervening commits), there's
no easy way (that I'm aware of) to merge everything back into one patch. I'm currently working
on the "MR v2" version (see MAPREDUCE-279), which is much less hackish and shares very little
with the version above. I'm not sure this version has a future, but the patches are here if
anyone is interested.

Known bugs:

 - "Re-localization" is missing. Specifically, because all subtasks run in the same JVM, and
Java doesn't have chdir(), there's no clean way to isolate them from each other. If any but
the last sub-MapTask does something obnoxious (e.g., delete a distcache symlink or create
a file that any other subtask wants to create), things will break.  Obviously this is a problem
for an optimization that's supposed to be (mostly) transparent to users.
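
One hedged way to picture the missing "re-localization" (a hypothetical workaround, not what the patch does): since Java has no chdir(), every sub-task in the shared JVM resolves relative paths against the same process working directory, so file access would have to be keyed off a per-attempt directory instead. The `resolve` helper and the `work/` root below are assumptions for illustration only.

```java
import java.io.File;

// Sketch of per-subtask isolation in a single shared JVM: because the
// process working directory cannot change per sub-task (no chdir() in
// Java), map each task-relative filename under a directory named for the
// task attempt, so serially-run sub-tasks cannot clobber each other's
// files (e.g., a distcache symlink or an output file with a shared name).
public class SubtaskWorkDir {
    // Hypothetical helper: "work/" root and attemptId naming are
    // illustrative assumptions, not the patch's actual layout.
    public static File resolve(String attemptId, String relativeName) {
        return new File(new File("work", attemptId), relativeName);
    }
}
```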

 - Progress is still broken, apparently. Everything seemed to check out when I had gobs of
debugging in there, but it doesn't make it to the UI (including the client) as frequently
as it should. No clue what broke.

 - The max-input-size decision criterion (in JobInProgress) should check the default block
size (if appropriate) for the actual input filesystem, not use a hardcoded HDFS config that's
not necessarily available to tasktracker nodes anyway.
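
The fix suggested above amounts to asking the job's actual input filesystem for its default block size (in Hadoop, FileSystem#getDefaultBlockSize) rather than reading a hardcoded HDFS config key. A minimal sketch of the predicate, with the filesystem lookup modeled as a plain parameter so it stays self-contained:

```java
// Hedged sketch of the size criterion: the input-filesystem block size is
// passed in by the caller (in real code it would come from the input
// FileSystem, not from an HDFS config that tasktracker nodes may lack).
public class UberSizeCheck {
    // True when the job's total input is small enough to run as a
    // single in-cluster "uber" task; the <= comparison is illustrative.
    public static boolean smallEnough(long totalInputBytes,
                                      long inputFsDefaultBlockSize) {
        return totalInputBytes <= inputFsDefaultBlockSize;
    }
}
```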

 - The UI changes are incomplete, and some links 404 or error out in some cases. Basically,
the whole idea of masquerading an UberTask as a ReduceTask, yet exposing it to the user in
some cases, is awkward, and there are a _lot_ of JSP pages to handle.

There are also some cleanup items (test and potentially enable reduce-only case; fix memory
criterion in uber-decision for map-only [and reduce-only] cases; clean up TaskStatus mess;
instead of renaming file.out to map_#.out, always use attemptID.out; etc.).  However, those
kind of pale in comparison to the overall intrusive grubbiness of the patch. :-/
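
For readers unfamiliar with the "uber-decision" mentioned above: it is the predicate deciding whether a job is small enough on every axis to run as one in-cluster task. A hedged sketch follows; the thresholds and the memory check are illustrative assumptions, not the patch's exact criteria (the comment notes the memory criterion is still wrong for map-only and reduce-only jobs).

```java
// Illustrative uber-decision: uberize only jobs that are small in map
// count, reduce count, input size, and memory demand. All constants and
// parameter names here are assumptions for the sketch.
public class UberDecision {
    public static boolean shouldUberize(int numMaps, int numReduces,
                                        long inputBytes, long blockSize,
                                        long taskMemMb, long slotMemMb) {
        boolean smallMaps    = numMaps <= 9;           // illustrative cap
        boolean smallReduces = numReduces <= 1;        // at most one reduce
        boolean smallInput   = inputBytes <= blockSize; // vs. input FS block size
        boolean fitsMemory   = taskMemMb <= slotMemMb; // must hold for map-only jobs too
        return smallMaps && smallReduces && smallInput && fitsMemory;
    }
}
```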

> Implement an in-cluster LocalJobRunner
> --------------------------------------
>
>                 Key: MAPREDUCE-1220
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1220
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: client, jobtracker
>            Reporter: Arun C Murthy
>            Assignee: Greg Roelofs
>         Attachments: MAPREDUCE-1220_yhadoop20.patch, MR-1220.v1.trunk-hadoop-common.Progress-dumper.patch.txt,
MR-1220.v10e-v11c-v12b.ytrunk-hadoop-mapreduce.delta.patch.txt, MR-1220.v13.ytrunk-hadoop-mapreduce.delta.patch.txt,
MR-1220.v14b.ytrunk-hadoop-mapreduce.delta.patch.txt, MR-1220.v15.ytrunk-hadoop-mapreduce.delta.patch.txt,
MR-1220.v2.trunk-hadoop-mapreduce.patch.txt,
MR-1220.v6.ytrunk-hadoop-mapreduce.patch.txt, MR-1220.v7.ytrunk-hadoop-mapreduce.delta.patch.txt,
MR-1220.v8b.ytrunk-hadoop-mapreduce.delta.patch.txt, MR-1220.v9c.ytrunk-hadoop-mapreduce.delta.patch.txt
>
>
> Currently, very small map-reduce jobs suffer from latency issues due to overheads in Hadoop
Map-Reduce such as scheduling, JVM startup, etc. We've periodically tried to optimize all parts
of the framework to achieve lower latencies.
> I'd like to turn the problem around a little bit. I propose we allow very small jobs
to run as a single-task job with multiple maps and reduces, i.e. similar to our current implementation
of the LocalJobRunner. Thus, under certain conditions (maybe user-set configuration, or if
the input data is small, i.e. less than a DFS blocksize) we could launch a special task which would run
all maps serially, followed by the reduces. This would really help small jobs achieve
significantly lower latencies, thanks to less scheduling overhead, fewer JVM startups, lack of
shuffle over the network, etc.
> This would be a huge benefit to small Hive/Pig queries, especially on large clusters.
> Thoughts?
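
The execution model proposed above (all maps serially, then the reduces, in one task) can be sketched minimally as follows. The Runnable lists stand in for real MapTask/ReduceTask objects, which are assumptions of this sketch, not the actual Hadoop classes:

```java
import java.util.List;

// Minimal sketch of an in-cluster "uber" task: instead of scheduling each
// map and reduce as its own task (own JVM launch, own scheduling
// round-trip), one task runs every map serially and then every reduce,
// with no network shuffle in between.
public class UberRunner {
    public static void runSerially(List<Runnable> maps, List<Runnable> reduces) {
        for (Runnable m : maps)   m.run(); // all maps first, in order
        for (Runnable r : reduces) r.run(); // then reduces, local output only
    }
}
```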

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
