hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3575) Job using 2.5 jars fails on a 2.6 cluster whose RM has been restarted
Date Mon, 18 May 2015 14:57:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-3575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548118#comment-14548118 ]

Jason Lowe commented on YARN-3575:
----------------------------------

The only way to support compatibility would be to remove the epoch number field from the container
ID, but I doubt that's going to happen at this point.  I filed this mostly to document the
fact that an incompatibility exists.  Most likely we'll have to recommend that users do _not_
perform a restart of the RM where it tries to recover (and therefore starts using an epoch
number in container IDs) as long as applications are running on the grid using YARN client
jars version 2.5 or earlier.  RM restart with recovery would only be supported as long as
all applications are using YARN jars >= 2.6.
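
To illustrate the parsing problem, here is a minimal sketch in Python. It is not the actual
Hadoop code: the function names are hypothetical, and the two container ID layouts in the
comments are assumed from the 2.6 naming scheme, where an "e<epoch>" token is inserted after
the "container" prefix once the epoch is non-zero.

```python
import re

# Assumed textual layouts (illustrative, not copied from the Hadoop source):
#   pre-2.6: container_<clusterTimestamp>_<appId>_<attemptId>_<containerId>
#   2.6+:    container_e<epoch>_<clusterTimestamp>_<appId>_<attemptId>_<containerId>
# The epoch token is assumed to appear only when the epoch is non-zero.

_EPOCH_AWARE = re.compile(r"container_(?:e(\d+)_)?(\d+)_(\d+)_(\d+)_(\d+)")


def parse_legacy(container_id: str) -> dict:
    """Hypothetical epoch-unaware parser: expects exactly four numeric fields."""
    parts = container_id.split("_")
    if len(parts) != 5 or parts[0] != "container":
        raise ValueError(f"Invalid ContainerId: {container_id}")
    try:
        timestamp, app, attempt, container = (int(p) for p in parts[1:])
    except ValueError as exc:
        raise ValueError(f"Invalid ContainerId: {container_id}") from exc
    return {"timestamp": timestamp, "app": app,
            "attempt": attempt, "container": container}


def parse_epoch_aware(container_id: str) -> dict:
    """Hypothetical epoch-aware parser: accepts both layouts."""
    match = _EPOCH_AWARE.fullmatch(container_id)
    if match is None:
        raise ValueError(f"Invalid ContainerId: {container_id}")
    epoch, timestamp, app, attempt, container = match.groups()
    return {"epoch": int(epoch) if epoch is not None else 0,
            "timestamp": int(timestamp), "app": int(app),
            "attempt": int(attempt), "container": int(container)}
```

With these assumptions, an ID such as container_1410901177871_0001_01_000005 is accepted by
both parsers, while container_e17_1410901177871_0001_01_000005 is accepted only by the
epoch-aware one. The legacy parser rejects it because of the extra field, which mirrors the
failure a 2.5 client hits after an RM restart with recovery.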

> Job using 2.5 jars fails on a 2.6 cluster whose RM has been restarted
> ---------------------------------------------------------------------
>
>                 Key: YARN-3575
>                 URL: https://issues.apache.org/jira/browse/YARN-3575
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>
> Trying to launch a job that uses the 2.5 jars fails on a 2.6 cluster whose RM has been
> restarted (i.e., epoch != 0) because the epoch number starts appearing in the container IDs
> and the 2.5 jars no longer know how to parse the container IDs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
