ambari-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-20447) YARN service check failed during HDP 2.4-2.6 rolling upgrade with YARN HA enabled
Date Tue, 14 Mar 2017 18:03:41 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924700#comment-15924700 ]

Hadoop QA commented on AMBARI-20447:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12858720/AMBARI-20447.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-server.

Test results: https://builds.apache.org/job/Ambari-trunk-test-patch/11026//testReport/
Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/11026//console

This message is automatically generated.

> YARN service check failed during HDP 2.4-2.6 rolling upgrade with YARN HA enabled
> ---------------------------------------------------------------------------------
>
>                 Key: AMBARI-20447
>                 URL: https://issues.apache.org/jira/browse/AMBARI-20447
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-20447.patch
>
>
> The YARN service check fails during a rolling upgrade from HDP-2.4 to HDP-2.6 (with YARN HA turned on) for the following reasons:
> # After the "core master restart" step, the YARN client uses the new (HDP-2.6) config and fails with "Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found". Forcing the YARN client to use the old (HDP-2.4) config until the client binary is updated helps here.
> # After the "core slave restart" step, using the old YARN client config with the old YARN client binary does not help: the NM/RM classpath points to HDP-2.6. The application gets scheduled, but then fails with this log:
> {code}17/03/06 16:39:27 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl
failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException:
Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException:
Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2240)
> at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:160)
> at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
> at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
> at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.serviceStart(AMRMClientImpl.java:186)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.serviceStart(AMRMClientAsyncImpl.java:96)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
> at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.run(ApplicationMaster.java:559)
> at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster.main(ApplicationMaster.java:299)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider
not found
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2208)
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2232)
> ... 9 more
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider
not found
> at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114)
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2206)
> ... 10 more
> 17/03/06 16:39:27 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.async.AMRMClientAsync
failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException:
Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException:
Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
> at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2240)
> at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:160)
> at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
> at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
> at
> {code}
> # After the YARN client is updated to the new binary, the service check works fine.
> ----
> Bottom line: this is a known problem with DistributedShell, which was never fixed to stop relying on the cluster's configuration. As a result, client configuration changes like this one can break DistributedShell apps during upgrades.
> Unfortunately, nothing we do now can fix this broken upgrade for DistributedShell; an ideal fix would require going back in time to make the changes.
> We have to do two things:
> # Disable the DistributedShell-based service check when upgrading from 2.4 to 2.6. RequestHedgingRMFailoverProxyProvider was added in 2.5, so upgrading from 2.5 to 2.6 is fine.
> # Fix yarn-site.xml starting with 2.6 by making the following change, to avoid this in the future. The change replaces $HADOOP_CONF_DIR, which is inherited from the NodeManager, with /etc/hadoop/conf/, which is always tied to the client version.
> {code}
> <property>
> <name>yarn.application.classpath</name>
> <value>/etc/hadoop/conf/,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*</value>
> </property>
> {code}
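
The two remedies described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code in AMBARI-20447.patch: the function names, the version-string format, and the constant are hypothetical, and extending the 2.4-to-2.6 rule to any upgrade that crosses HDP 2.5 is an assumption that the description does not state.

```python
# Hypothetical helpers; none of these names come from Ambari or the patch.

# Per the description, RequestHedgingRMFailoverProxyProvider first ships in HDP 2.5.
HEDGING_PROVIDER_SINCE = (2, 5)


def parse_stack_version(version):
    """Turn a string such as 'HDP-2.4.3.0' or '2.6' into a (major, minor) tuple."""
    numeric = version.split("-", 1)[-1]  # drop an optional 'HDP-' prefix
    parts = numeric.split(".")
    return int(parts[0]), int(parts[1])


def should_skip_dshell_check(source_version, target_version):
    """Return True when the upgrade starts below 2.5 and ends at 2.5 or later.

    Assumption: any upgrade crossing 2.5 is treated like the 2.4 to 2.6 case.
    """
    source = parse_stack_version(source_version)
    target = parse_stack_version(target_version)
    return source < HEDGING_PROVIDER_SINCE <= target


def pin_conf_dir(classpath, conf_dir="/etc/hadoop/conf/"):
    """Replace a $HADOOP_CONF_DIR entry in a comma-separated classpath value."""
    entries = [entry.strip() for entry in classpath.split(",")]
    return ",".join(conf_dir if entry == "$HADOOP_CONF_DIR" else entry
                    for entry in entries)
```

With these definitions, `should_skip_dshell_check("HDP-2.4", "HDP-2.6")` is true and `should_skip_dshell_check("HDP-2.5", "HDP-2.6")` is false, which matches the two cases named in the description.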



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
