X-Original-To: apmail-ambari-dev-archive@www.apache.org
Delivered-To: apmail-ambari-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 2297B18973;
	Tue, 1 Dec 2015 17:01:37 +0000 (UTC)
Received: (qmail 54824 invoked by uid 500); 1 Dec 2015 17:01:37 -0000
Delivered-To: apmail-ambari-dev-archive@ambari.apache.org
Received: (qmail 54807 invoked by uid 500); 1 Dec 2015 17:01:37 -0000
Mailing-List: contact dev-help@ambari.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@ambari.apache.org
Delivered-To: mailing list dev@ambari.apache.org
Received: (qmail 54794 invoked by uid 99); 1 Dec 2015 17:01:36 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Dec 2015 17:01:36 +0000
Received: from reviews.apache.org (localhost [127.0.0.1])
	by reviews.apache.org (Postfix) with ESMTP id 5DE582916D7;
	Tue, 1 Dec 2015 17:01:36 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============1609364892331658436=="
MIME-Version: 1.0
Subject: Review Request 40826: Rebalance HDFS after enabling NN HA failed
From: "Dmitro Lisnichenko"
To: "Andrew Onischuk", "Vitalyi Brodetskyi"
Cc: "Ambari"
Date: Tue, 01 Dec 2015 17:01:36 -0000
Message-ID: <20151201170136.1719.44858@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
Auto-Submitted: auto-generated
Sender: "Dmitro Lisnichenko"
X-ReviewGroup: Ambari
X-Auto-Response-Suppress: DR, RN, OOF, AutoReply
X-ReviewRequest-URL: https://reviews.apache.org/r/40826/
X-Sender: "Dmitro Lisnichenko"
Reply-To: "Dmitro Lisnichenko"
X-ReviewRequest-Repository: ambari

--===============1609364892331658436==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/40826/
-----------------------------------------------------------

Review request for Ambari, Andrew Onischuk and Vitalyi Brodetskyi.


Bugs: AMBARI-14137
    https://issues.apache.org/jira/browse/AMBARI-14137


Repository: ambari


Description
-------

STR:
1) Install and deploy cluster
2) Enable NameNode HA
3) Enable security
4) Start rebalance HDFS

Actual result:
{code}
"stderr" : "Traceback (most recent call last):\n File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\", line 425, in <module>\n NameNode().execute()\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 218, in execute\n method(env)\n File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\", line 363, in rebalancehdfs\n logoutput = False,\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 154, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 156, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 119, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 238, in action_run\n tries =self.resource.tries, try_sleep=self.resource.try_sleep)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 70, in inner\n result = function(command, **kwargs)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 92, in checked_call\n tries=tries, try_sleep=try_sleep)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 140, in _call_wrapper\n result = _call(command, **kwargs_copy)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 291, in _call\n raise 
Fail(err_msg)\nresource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 252. ######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra\n15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold of 10.0\n15/11/17 07:29:08 INFO balancer.Balancer: namenodes = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020, hdfs://nameservice]\n15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n15/11/17 07:29:08 INFO balancer.Balancer: included nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: excluded nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\nTime Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved\norg.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n. Exiting ...\nNov 17, 2015 7:29:09 AM Balancing took 1.932 seconds",
{code}

{code}
"stdout" : "Starting balancer with threshold = 10\n2015-11-17 07:29:06,492 - call['/usr/bin/klist -s /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8'] {'user': 'cstm-hdfs'}\n2015-11-17 07:29:06,514 - call returned (1, '######## Hortonworks #############\\nThis is MOTD message, added for testing in qe infra')\n2015-11-17 07:29:06,515 - Execute['/usr/bin/kinit -c /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 -kt /etc/security/keytabs/hdfs.headless.keytab cstm-hdfs@EXAMPLE.COM'] {'user': 'cstm-hdfs'}\nExecuting command ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'\n2015-11-17 07:29:06,550 - Execute['ambari-sudo.sh su 
cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''] {'logoutput': False, 'on_new_line': handle_new_line}\n[balancer] ######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold of 10.0\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: namenodes = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020, hdfs://nameservice]\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: included nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: excluded nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\n[balancer] Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved[balancer] \n[balancer] org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(Proto[balancer] bufRpcEngine.java:616)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n. Exiting ...[balancer] \n[balancer] Nov 17, 2015 7:29:09 AM [balancer] [balancer] Balancing took 1.932 seconds[balancer]",
{code}


Diffs
-----

  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py 9800ff1

Diff: https://reviews.apache.org/r/40826/diff/


Testing
-------

mvn clean test


Thanks,

Dmitro Lisnichenko


--===============1609364892331658436==--