Return-Path:
X-Original-To: apmail-ambari-dev-archive@www.apache.org
Delivered-To: apmail-ambari-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 422B118146
    for ; Wed, 2 Dec 2015 10:47:16 +0000 (UTC)
Received: (qmail 20189 invoked by uid 500); 2 Dec 2015 10:47:11 -0000
Delivered-To: apmail-ambari-dev-archive@ambari.apache.org
Received: (qmail 20170 invoked by uid 500); 2 Dec 2015 10:47:11 -0000
Mailing-List: contact dev-help@ambari.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@ambari.apache.org
Delivered-To: mailing list dev@ambari.apache.org
Received: (qmail 20157 invoked by uid 99); 2 Dec 2015 10:47:10 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 02 Dec 2015 10:47:10 +0000
Received: from reviews.apache.org (localhost [127.0.0.1])
    by reviews.apache.org (Postfix) with ESMTP id 90D321CF8A5;
    Wed, 2 Dec 2015 10:47:10 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============4211729090905315744=="
MIME-Version: 1.0
Subject: Re: Review Request 40826: Rebalance HDFS after enabling NN HA failed
From: "Dmitro Lisnichenko"
To: "Andrew Onischuk", "Vitalyi Brodetskyi"
Cc: "Ambari"
Date: Wed, 02 Dec 2015 10:47:10 -0000
Message-ID: <20151202104710.1718.73947@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
Auto-Submitted: auto-generated
Sender: "Dmitro Lisnichenko"
X-ReviewGroup: Ambari
X-Auto-Response-Suppress: DR, RN, OOF, AutoReply
X-ReviewRequest-URL: https://reviews.apache.org/r/40826/
X-Sender: "Dmitro Lisnichenko"
References: <20151201170136.1719.44858@reviews.apache.org>
In-Reply-To: <20151201170136.1719.44858@reviews.apache.org>
Reply-To: "Dmitro Lisnichenko"
X-ReviewRequest-Repository: ambari

--===============4211729090905315744==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40826/#review108623
-----------------------------------------------------------


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py (line 317)

    I agree that it's hacky


- Dmitro Lisnichenko


On Dec. 1, 2015, 7:01 p.m., Dmitro Lisnichenko wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40826/
> -----------------------------------------------------------
>
> (Updated Dec. 1, 2015, 7:01 p.m.)
>
>
> Review request for Ambari, Andrew Onischuk and Vitalyi Brodetskyi.
>
>
> Bugs: AMBARI-14137
>     https://issues.apache.org/jira/browse/AMBARI-14137
>
>
> Repository: ambari
>
>
> Description
> -------
>
> STR:
> 1) Install and deploy cluster
> 2) Enable NameNode HA
> 3) Enable security
> 4) Start Rebalance HDFS
>
> Actual result:
>
> {code}
> "stderr" : "Traceback (most recent call last):\n File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\", line 425, in \n NameNode().execute()\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 218, in execute\n method(env)\n File \"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py\", line 363, in rebalancehdfs\n logoutput = False,\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 154, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 156, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 119, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 238, in action_run\n tries=self.resource.tries, try_sleep=self.resource.try_sleep)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 70, in inner\n result = function(command, **kwargs)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 92, in checked_call\n tries=tries, try_sleep=try_sleep)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 140, in _call_wrapper\n result = _call(command, **kwargs_copy)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/shell.py\", line 291, in _call\n raise Fail(err_msg)\nresource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 252.
######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra\n15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold of 10.0\n15/11/17 07:29:08 INFO balancer.Balancer: namenodes = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020, hdfs://nameservice]\n15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n15/11/17 07:29:08 INFO balancer.Balancer: included nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: excluded nodes = []\n15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\nTime Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved\norg.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n. Exiting ...\nNov 17, 2015 7:29:09 AM Balancing took 1.932 seconds",
> {code}
>
> {code}
> "stdout" : "Starting balancer with threshold = 10\n2015-11-17 07:29:06,492 - call['/usr/bin/klist -s /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8'] {'user': 'cstm-hdfs'}\n2015-11-17 07:29:06,514 - call returned (1, '######## Hortonworks #############\\nThis is MOTD message, added for testing in qe infra')\n2015-11-17 07:29:06,515 - Execute['/usr/bin/kinit -c /tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 -kt /etc/security/keytabs/hdfs.headless.keytab cstm-hdfs@EXAMPLE.COM'] {'user': 'cstm-hdfs'}\nExecuting command ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'\n2015-11-17 07:29:06,550 - Execute['ambari-sudo.sh su cstm-hdfs -l -s /bin/bash -c 'export PATH='\"'\"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'\"'\"' KRB5CCNAME=/tmp/hdfs_rebalance_cc_6ec913166750834c9d9302d65b9c6cb8 ; hdfs --config
/usr/hdp/current/hadoop-client/conf balancer -threshold 10''] {'logoutput': False, 'on_new_line': handle_new_line}\n[balancer] ######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: Using a threshold of 10.0\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: namenodes = [hdfs://os-d7-mwznvu-ambari-hv-ser-ha-5-2.novalocal:8020, hdfs://nameservice]\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: parameters = Balancer.BalancerParameters [BalancingPolicy.Node, threshold = 10.0, max idle iteration = 5, #excluded nodes = 0, #included nodes = 0, #source nodes = 0, #blockpools = 0, run during upgrade = false]\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: included nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: excluded nodes = []\n[balancer] 15/11/17 07:29:08 INFO balancer.Balancer: source nodes = []\n[balancer] Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved[balancer] \n[balancer] org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby\n\tat org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1927)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)\n\tat org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1625)\n\tat org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:659)\n\tat org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)\n\tat
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(Proto[balancer] bufRpcEngine.java:616)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:422)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)\n. Exiting ...[balancer] \n[balancer] Nov 17, 2015 7:29:09 AM [balancer] [balancer] Balancing took 1.932 seconds[balancer]",
> {code}
>
>
> Diffs
> -----
>
> ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py 9800ff1
>
> Diff: https://reviews.apache.org/r/40826/diff/
>
>
> Testing
> -------
>
> mvn clean test
>
>
> Thanks,
>
> Dmitro Lisnichenko
>
>

--===============4211729090905315744==--