ambari-dev mailing list archives

From "Dmitry Lysnichenko (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-14137) Rebalance HDFS after enabling NN HA failed
Date Tue, 01 Dec 2015 17:00:12 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-14137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Lysnichenko updated AMBARI-14137:
----------------------------------------
    Attachment: AMBARI-14137.patch

> Rebalance HDFS after enabling NN HA failed
> ------------------------------------------
>
>                 Key: AMBARI-14137
>                 URL: https://issues.apache.org/jira/browse/AMBARI-14137
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>         Attachments: AMBARI-14137.patch
>
>
> STR:
> 1) Install and deploy cluster
> 2) Enable NameNode HA
> 3) Enable security
> 4) Start rebalance HDFS
> Actual result:
> {code}
> "stderr" : "Traceback (most recent call last):\n  File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\",
line 425, in <module>\n    NameNode().execute()\n  File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\",
line 218, in execute\n    method(env)\n  File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\",
line 363, in rebalancehdfs\n    logoutput = False,\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\",
line 154, in __init__\n    self.env.run()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\",
line 156, in run\n    self.run_action(resource, action)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\",
line 119, in run_action\n    provider_action()\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\",
line 238, in action_run\n    tries=self.resource.tries, try_sleep=self.resource.try_sleep)\n
 File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 70, in
inner\n    result = function(command, **kwargs)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\",
line 92, in checked_call\n    tries=tries, try_sleep=try_sleep)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\",
line 140, in _call_wrapper\n    result = _call(command, **kwargs_copy)\n  File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\",
line 291, in _call\n    raise Fail(err_msg)\nresource_management.core.exceptions.Fail: Execution
of 'ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export  PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"'
KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf
balancer -threshold 10'' returned 252. ######## Hortonworks #############\nThis is MOTD message,
added for testing in qe infra\n15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold
of 10.0\n15/11/17 07:29:08 INFO balancer.Balancer: namenodes  = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020,
hdfs://nameservice]\n15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters
[BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included
nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n15/11/17 07:29:08
INFO balancer.Balancer: included nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: excluded
nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\nTime Stamp      
        Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved\norg.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)\n\tat
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native
Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n.  Exiting ...\nNov 17, 2015 7:29:09
AM  Balancing took 1.932 seconds",
> {code}
> {code}
> "stdout" : "Starting balancer with threshold = 10\n2015-11-17 07:29:06,492 - call['/usr/bin/klist
-s /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8'] {'user': 'cstm-hdfs'}\n2015-11-17
07:29:06,514 - call returned (1, '######## Hortonworks #############\\nThis is MOTD message,
added for testing in qe infra')\n2015-11-17 07:29:06,515 - Execute['/usr/bin/kinit -c /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8
-kt /etc/security/keytabs/hdfs.headless.keytab cstm-hdfs@EXAMPLE.COM'] {'user': 'cstm-hdfs'}\nExecuting
command ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export  PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"'
KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf
balancer -threshold 10'\n2015-11-17 07:29:06,550 - Execute['ambari-sudo.sh su cstm-hdfs -l
-s /bin/bash -c 'export  PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"'
KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf
balancer -threshold 10''] {'logoutput': False, 'on_new_line': handle_new_line}\n[balancer]
######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra\n[balancer]
15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold of 10.0\n[balancer] 15/11/17 07:29:08
INFO balancer.Balancer: namenodes  = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020,
hdfs://nameservice]\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters
[BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included
nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n[balancer] 15/11/17
07:29:08 INFO balancer.Balancer: included nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer:
excluded nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\n[balancer]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being
Moved[balancer] \n[balancer] org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(Proto[balancer] bufRpcEngine.java:616)\n\tat
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native
Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n.  Exiting ...[balancer] \n[balancer]
Nov 17, 2015 7:29:09 AM [balancer]  [balancer] Balancing took 1.932 seconds[balancer]",
> {code}
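
For triage, the two facts that matter in the output above are the "namenodes  = [...]" line, which lists both a host-specific URI and the logical hdfs://nameservice URI, and the StandbyException that follows. The snippet below is only a minimal sketch for pulling those two facts out of captured balancer output. The function name diagnose_balancer_output is hypothetical, it is not part of Ambari or of the attached patch, and it assumes the log text has the same shape as the one quoted in this issue.

```python
import re


def diagnose_balancer_output(output):
    """Return (namenode_uris, hit_standby) parsed from balancer log text.

    Hypothetical helper for log triage; not part of Ambari.
    """
    # The balancer logs the targets it will contact as 'namenodes  = [a, b]'.
    match = re.search(r"namenodes\s*=\s*\[([^\]]*)\]", output)
    uris = []
    if match:
        uris = [u.strip() for u in match.group(1).split(",") if u.strip()]
    # A standby NameNode rejects READ operations with this exception class.
    hit_standby = "org.apache.hadoop.ipc.StandbyException" in output
    return uris, hit_standby


# Sample lines copied from the stderr quoted in this issue.
sample = (
    "15/11/17 07:29:08 INFO balancer.Balancer: namenodes  = "
    "[hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020, "
    "hdfs://nameservice]\n"
    "org.apache.hadoop.ipc.RemoteException("
    "org.apache.hadoop.ipc.StandbyException): "
    "Operation category READ is not supported in state standby\n"
)

uris, hit_standby = diagnose_balancer_output(sample)
print(uris, hit_standby)
```

If the list contains more than one URI and hit_standby is true, one of the listed targets presumably resolved to the standby NameNode, which matches the failure reported here.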



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
