ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-13861) hdfs balancer via ambari fails to run after HDP upgrade with NN HA enabled
Date Fri, 13 Nov 2015 16:51:11 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15004277#comment-15004277 ]

Hudson commented on AMBARI-13861:
---------------------------------

ABORTED: Integrated in Ambari-branch-2.1 #855 (See [https://builds.apache.org/job/Ambari-branch-2.1/855/])
AMBARI-13861. hdfs balancer via ambari fails to run after HDP upgrade (dlysnichenko: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=55808ea1111f896f6df8ea99fbe72c58ca184bc8])
* ambari-server/src/test/resources/stacks/HDP/2.1.1/upgrades/config-upgrade.xml
* ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/upgrade-2.3.xml
* ambari-server/src/main/java/org/apache/ambari/server/state/stack/upgrade/ConfigUpgradeChangeDefinition.java
* ambari-server/src/main/resources/stacks/HDP/2.2/upgrades/nonrolling-upgrade-2.3.xml
* ambari-server/src/main/java/org/apache/ambari/server/state/stack/upgrade/ConfigureTask.java
* ambari-server/src/test/java/org/apache/ambari/server/state/UpgradeHelperTest.java
* ambari-server/src/main/java/org/apache/ambari/server/state/stack/upgrade/PropertyKeyState.java
* ambari-server/src/main/resources/stacks/HDP/2.3/upgrades/config-upgrade.xml


> hdfs balancer via ambari fails to run after HDP upgrade with NN HA enabled
> --------------------------------------------------------------------------
>
>                 Key: AMBARI-13861
>                 URL: https://issues.apache.org/jira/browse/AMBARI-13861
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.3
>            Reporter: Dmytro Grinenko
>            Assignee: Dmitry Lysnichenko
>            Priority: Critical
>             Fix For: 2.1.3
>
>
> Ran hdfs balancer via ambari on a cluster that had HA enabled and it failed.
> {code}
> Starting balancer with threshold = 10
> Executing command ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'
> 2015-10-06 23:33:27,059 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''] {'logoutput': False, 'on_new_line': handle_new_line}
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: Using a threshold of 10.0
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: namenodes  = [hdfs://pre-prod-poc-1.novalocal:8020, hdfs://pre-prod-hdp-2-3]
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: parameters = Balancer.Parameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, run during upgrade = false]
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: included nodes = []
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: excluded nodes = []
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: source nodes = []
> [balancer] Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved[balancer] 
> [balancer] org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
> 	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1872)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1306)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1618)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:595)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(Proto[balancer]
bufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
> .  Exiting ...[balancer] 
> [balancer] Oct 6, 2015 11:33:31 PM [balancer]  [balancer] Balancing took 2.281 seconds[balancer]
> {code}
> In the log, it looks like we are adding a NameNode that is in standby to the namenodes list. Should we not be using just the name service?
> {code}
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: namenodes  = [hdfs://pre-prod-poc-1.novalocal:8020, hdfs://pre-prod-hdp-2-3]
> [balancer] 15/10/06 23:33:29 INFO balancer.Balancer: parameters = Balancer.Parameters

> {code}
> {code}
> [root@pre-prod-poc-1 hive-testbench]# ambari-server --hash
> 226dfd1c6136f859fc42dd18e7090a9346f0f745
> [root@pre-prod-poc-1 hive-testbench]# rpm -qa | grep ambari
> ambari-metrics-hadoop-sink-2.1.2-370.x86_64
> ambari-server-2.1.2-370.x86_64
> ambari-metrics-monitor-2.1.2-370.x86_64
> ambari-agent-2.1.2-370.x86_64
> [root@pre-prod-poc-1 hive-testbench]#
> {code}
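
The description above asks whether the balancer should be given only the name service. As a rough illustration, the Python sketch below checks an hdfs-site property map for unsuffixed (non-HA) NameNode address keys that are still set while `dfs.nameservices` is defined. It rests on an assumption not confirmed by this message: that the extra `hdfs://pre-prod-poc-1.novalocal:8020` entry comes from such a leftover key. The function name `stale_namenode_keys` and the sample property map are hypothetical and are not part of Ambari or Hadoop; only the property names are real HDFS settings.

```python
# Hypothetical helper for illustration; not part of Ambari or Hadoop.
def stale_namenode_keys(hdfs_site):
    """Return unsuffixed NameNode address keys still set while a nameservice is defined."""
    if not hdfs_site.get("dfs.nameservices", "").strip():
        return []  # no nameservice configured, so nothing to flag
    # Unsuffixed keys point at one NameNode directly. With HA, the
    # per-NameNode keys carry a ".<nameservice>.<namenode-id>" suffix.
    candidates = (
        "dfs.namenode.rpc-address",
        "dfs.namenode.http-address",
        "dfs.namenode.https-address",
    )
    return [key for key in candidates if hdfs_site.get(key)]


# Assumed sample values, modelled on the host and name service seen in the log.
hdfs_site = {
    "dfs.nameservices": "pre-prod-hdp-2-3",
    "dfs.namenode.rpc-address": "pre-prod-poc-1.novalocal:8020",
}
print(stale_namenode_keys(hdfs_site))
```

If the assumption holds, any key this check reports would be a candidate for removal once NameNode HA is enabled, so that clients and the balancer resolve NameNodes only through the name service.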



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
