hadoop-mapreduce-issues mailing list archives

From "Arpit Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4749) Killing multiple attempts of a task takes longer as more attempts are killed
Date Wed, 31 Oct 2012 18:16:13 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488042#comment-13488042 ]

Arpit Gupta commented on MAPREDUCE-4749:
----------------------------------------

With this patch applied, killing tasks works as expected: the time to kill the attempts
no longer increases exponentially. All attempts are killed as soon as the request is
accepted.

I ran the previously failing system tests a few times, and they passed.

I also ran the full unit tests, and there were no failures in the mapred tests.
                
> Killing multiple attempts of a task takes longer as more attempts are killed
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4749
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4749
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Arpit Gupta
>            Assignee: Arpit Gupta
>         Attachments: MAPREDUCE-4749.branch-1.patch
>
>
> The following was noticed on an MR job running on Hadoop 1.1.0:
> 1. Start an MR job with 1 mapper.
> 2. Wait for a minute.
> 3. Kill the first attempt of the mapper, and then kill the other 3 attempts
> in order to fail the job.
> The time taken to kill the task grew exponentially.
> 1st attempt was killed immediately.
> 2nd attempt took a little over a min
> 3rd attempt took approx. 20 mins
> 4th attempt took around 3 hrs.
> The command used to kill the attempt was "hadoop job -fail-task"
> Note that the command returned as soon as the fail request was accepted, but
> the attempt was not actually killed until the times stated above.
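
For anyone retrying the steps above, here is a rough sketch of how the four "hadoop job -fail-task" invocations could be assembled. It only builds the command lines and does not run them. The JobTracker identifier, job number and task number are made-up placeholders, and the helper names attempt_ids and fail_task_commands are hypothetical, not part of Hadoop. It assumes the usual Hadoop 1.x attempt ID layout (attempt_<jobtracker id>_<job>_m_<task>_<attempt>) and a limit of 4 map attempts, matching the 1 + 3 kills in the report. On a real cluster each later attempt ID only exists once the previous attempt has failed and the task has been rescheduled.

```python
def attempt_ids(jt_identifier, job_number, task_number, attempts=4):
    """Build map-attempt IDs following the assumed Hadoop 1.x naming layout.

    All argument values are placeholders supplied by the caller.
    """
    return [
        f"attempt_{jt_identifier}_{job_number:04d}_m_{task_number:06d}_{n}"
        for n in range(attempts)
    ]


def fail_task_commands(ids):
    """Return one 'hadoop job -fail-task <attempt>' argument list per attempt ID."""
    return [["hadoop", "job", "-fail-task", attempt_id] for attempt_id in ids]


if __name__ == "__main__":
    # Hypothetical identifiers; substitute the values of the job under test.
    ids = attempt_ids("201210311200", 1, 0)
    for cmd in fail_task_commands(ids):
        print(" ".join(cmd))
```

Recording a timestamp when each command returns and again when the attempt is reported as failed would give the per-attempt delay described in the report.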

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
