hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3632) Fix speculative execution or allow premature stop
Date Tue, 24 Jun 2008 23:05:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607798#action_12607798 ]

Doug Cutting commented on HADOOP-3632:

You could implement this yourself by having your map tasks periodically query the JobTracker
for percent completion and exit once it passes some threshold.  You could, e.g., define
and use a subclass of MapRunner that starts a thread that polls the JobTracker every
second or so.  To check map progress:

float mapProgress = new JobClient(job).getJob(job.get("mapred.job.id")).mapProgress();
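
The polling thread described above could look roughly like the sketch below. It is not taken from Hadoop: ProgressWatcher and its methods are hypothetical names, and the progress source is abstracted as a Supplier<Float> so the example runs without a cluster. In a real map task the supplier would presumably wrap the JobClient call shown above, and a MapRunner subclass would check thresholdReached() between records and stop reading input once it returns true. That wiring is an assumption and is not shown here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Polls a progress source on a daemon thread and signals once a threshold is passed. */
class ProgressWatcher {
    // Hypothetical stand-in for the JobTracker query; returns progress in [0, 1].
    private final Supplier<Float> progressSource;
    private final float threshold;
    private final long pollMillis;
    private final CountDownLatch reached = new CountDownLatch(1);

    ProgressWatcher(Supplier<Float> progressSource, float threshold, long pollMillis) {
        this.progressSource = progressSource;
        this.threshold = threshold;
        this.pollMillis = pollMillis;
    }

    /** Starts the background poller; it stops by itself once the threshold is reached. */
    void start() {
        Thread poller = new Thread(() -> {
            try {
                while (progressSource.get() < threshold) {
                    Thread.sleep(pollMillis);
                }
                reached.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // give up polling if interrupted
            }
        }, "progress-watcher");
        poller.setDaemon(true); // never keep the JVM alive just for polling
        poller.start();
    }

    /** Non-blocking check, cheap enough to call between records. */
    boolean thresholdReached() {
        return reached.getCount() == 0;
    }

    /** Blocks up to timeoutMillis; returns true if the threshold was reached in time. */
    boolean awaitThreshold(long timeoutMillis) {
        try {
            return reached.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulated progress source: each poll advances by 25%.
        float[] progress = {0f};
        ProgressWatcher watcher =
            new ProgressWatcher(() -> progress[0] += 0.25f, 0.99f, 10L);
        watcher.start();
        System.out.println("threshold reached: " + watcher.awaitThreshold(2000L));
    }
}
```

Using a daemon thread means a task that finishes normally is never held up by the poller, and the one-second interval suggested above keeps the extra load on the JobTracker small even with hundreds of maps.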

> Fix speculative execution or allow premature stop
> -------------------------------------------------
>                 Key: HADOOP-3632
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3632
>             Project: Hadoop Core
>          Issue Type: New Feature
>    Affects Versions: 0.16.3
>            Reporter: Severin Hacker
>   Original Estimate: 72h
>  Remaining Estimate: 72h
> I run 50 iterations of a program with 500 maps and no reduces. I have noticed the following:
> In 50% of the iterations:
> 499 maps finish in 50 seconds
> 1 map finishes after 4 minutes
> Total time is 4 minutes.
> In 50% of the iterations:
> 500 maps finish in 50 seconds
> Total time is 50 seconds.
> It would be nice if I could tell hadoop to stop after 99% of the maps have finished (and
> not wait for that last straggler). In my application it's perfectly fine if I only get 99%
> of the results, as long as the straggler is not always using the same data.
> Please fix!

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
