hadoop-mapreduce-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6263) Large jobs can lose history when killed due to brief client timeout
Date Tue, 10 Mar 2015 00:32:38 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353969#comment-14353969 ]

Hadoop QA commented on MAPREDUCE-6263:
--------------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12703506/MAPREDUCE-6263.v2.txt
  against trunk revision d6e05c5.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5247//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5247//console

This message is automatically generated.

> Large jobs can lose history when killed due to brief client timeout
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6263
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6263
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Eric Payne
>         Attachments: MAPREDUCE-6263.v1.txt, MAPREDUCE-6263.v2.txt
>
>
> YARNRunner connects to the AM to send the kill job command then waits a hardcoded 10
> seconds for the job to enter a terminal state.  If the job fails to enter a terminal state
> in that time then YARNRunner will tell YARN to kill the application forcefully.  The latter
> type of kill usually results in no job history, since the AM process is killed forcefully.
>
> Ten seconds can be too short for large jobs in a large cluster, as it takes time to connect
> to all the nodemanagers, process the state machine events, and copy a large jhist file.  The
> timeout should be more lenient or configurable.
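
The kill sequence described in the issue can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not YARNRunner's code: the class `KillWithTimeoutSketch`, the method `killJob`, and all of its parameters are hypothetical, and the AM and YARN calls are replaced by plain callbacks. The timeout is passed in as an argument rather than hardcoded, which is the direction the issue description suggests.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names come from Hadoop.
final class KillWithTimeoutSketch {

    /** How the job ended up being stopped. */
    enum KillResult { GRACEFUL, FORCED }

    /**
     * Ask the AM to kill the job, poll until the job reaches a terminal
     * state, and fall back to a forceful kill once timeoutMs has elapsed.
     */
    static KillResult killJob(Runnable sendKillToAm,
                              BooleanSupplier isTerminal,
                              Runnable forceKill,
                              long timeoutMs,
                              long pollIntervalMs) {
        sendKillToAm.run();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!isTerminal.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                // Timed out: this is the path that tends to lose job history.
                forceKill.run();
                return KillResult.FORCED;
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                // Assumption for this sketch: an interrupt escalates to a forced kill.
                Thread.currentThread().interrupt();
                forceKill.run();
                return KillResult.FORCED;
            }
        }
        return KillResult.GRACEFUL;
    }

    public static void main(String[] args) {
        // A job that reaches a terminal state on the third poll.
        AtomicInteger polls = new AtomicInteger();
        KillResult quick = killJob(() -> {}, () -> polls.incrementAndGet() >= 3,
                () -> {}, 10_000, 1);
        System.out.println("quick job: " + quick);

        // A job that never finishes within a short 20 ms timeout.
        KillResult slow = killJob(() -> {}, () -> false, () -> {}, 20, 5);
        System.out.println("slow job: " + slow);
    }
}
```

With a generous timeout the first job ends gracefully, while the second, which never reaches a terminal state, is force-killed when the deadline passes. A large job on a large cluster behaves like the second case whenever the timeout is shorter than the time the AM needs to shut down cleanly.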



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
