Return-Path:
X-Original-To: apmail-ambari-dev-archive@www.apache.org
Delivered-To: apmail-ambari-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A1A1517453 for ; Thu, 12 Feb 2015 01:52:39 +0000 (UTC)
Received: (qmail 56513 invoked by uid 500); 12 Feb 2015 01:52:39 -0000
Delivered-To: apmail-ambari-dev-archive@ambari.apache.org
Received: (qmail 56480 invoked by uid 500); 12 Feb 2015 01:52:39 -0000
Mailing-List: contact dev-help@ambari.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@ambari.apache.org
Delivered-To: mailing list dev@ambari.apache.org
Received: (qmail 56460 invoked by uid 99); 12 Feb 2015 01:52:39 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 12 Feb 2015 01:52:39 +0000
Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id 8B20A1CF8FF; Thu, 12 Feb 2015 01:52:38 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============8978820632403129825=="
MIME-Version: 1.0
Subject: Re: Review Request 30854: RU: Finalize fails because Secondary Namenode was not deleted during HA Wizard
From: "Alejandro Fernandez"
To: "Dmitro Lisnichenko", "Yurii Shylov", "Robert Levas", "Jonathan Hurley", "Nate Cole", "Tom Beerbower"
Cc: "Alejandro Fernandez", "Ambari"
Date: Thu, 12 Feb 2015 01:52:38 -0000
Message-ID: <20150212015238.29075.66656@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
Auto-Submitted: auto-generated
Sender: "Alejandro Fernandez"
X-ReviewGroup: Ambari
X-ReviewRequest-URL: https://reviews.apache.org/r/30854/
X-Sender: "Alejandro Fernandez"
References: <20150211192053.29075.52032@reviews.apache.org>
In-Reply-To: <20150211192053.29075.52032@reviews.apache.org>
Reply-To: "Alejandro Fernandez"
X-ReviewRequest-Repository: ambari

--===============8978820632403129825==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30854/
-----------------------------------------------------------

(Updated Feb. 12, 2015, 1:52 a.m.)


Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, Robert Levas, Tom Beerbower, and Yurii Shylov.


Changes
-------

I was able to reproduce this and realized that we need another method of checking if the SECONDARY_NAMENODE component is still on the cluster.


Bugs: AMBARI-9559
    https://issues.apache.org/jira/browse/AMBARI-9559


Repository: ambari


Description
-------

While performing a RU, it failed in Finalize because the SECONDARY_NAMENODE component was still on an old version, since it was not deleted by the Namenode HA wizard.
This was a 3-node cluster with HDFS, MR, YARN, TEZ, Hive, ZK, Pig, AMS using HDP 2.2.0.0 GA.

I was not able to reproduce this, even after installing the exact same services on the same hosts, and using the same ambari-server version.
It could be the case that some thread is holding a reference to the object that is being deleted, so it persists.

So I added another check to PreUpgradeCheckResourceProvider that ensures that SECONDARY_NAMENODE is not present on any hosts.


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/checks/SecondaryNamenodeDeletedCheck.java ad967f8

Diff: https://reviews.apache.org/r/30854/diff/


Testing
-------

Deployed a 3-node cluster with HDFS, ZK, and added Namenode HA.
Then registered a repo and distributed it to the hosts.
Next, I stopped ambari-server and manually edited the database to simulate SECONDARY_NAMENODE still being present.

INSERT INTO hostcomponentdesiredstate VALUES (2, 'SECONDARY_NAMENODE', E'{\"stackName\":\"HDP\",\"stackVersion\":\"2.2\"}', 'INSTALLED', 'c6402.ambari.apache.org', 'HDFS', '', 'OFF', 'UNSECURED', 0);
INSERT INTO hostcomponentstate VALUES (2, 'SECONDARY_NAMENODE', '2.2.0.0-2041', E'{\"stackName\":\"HDP\",\"stackVersion\":\"2.2\"}', 'INSTALLED', 'c6402.ambari.apache.org', 'HDFS', 'NONE', 'UNSECURED');

I then queried,
http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/rolling_upgrades_check?fields=*&UpgradeChecks/repository_version=2.2.1.0-2260&_=1423612069190

and got the following, as expected,
```
{
  href: "http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/rolling_upgrades_check/SECONDARY_NAMENODE_MUST_BE_DELETED",
  UpgradeChecks: {
    check: "The SECONDARY_NAMENODE component must be deleted from all hosts",
    check_type: "SERVICE",
    cluster_name: "c1",
    failed_on: [
      "SECONDARY_NAMENODE"
    ],
    id: "SECONDARY_NAMENODE_MUST_BE_DELETED",
    reason: "The SECONDARY_NAMENODE component must be deleted from host(s): c6402.ambari.apache.org. Please use the REST API to delete it.",
    repository_version: "2.2.1.0-2260",
    status: "FAIL"
  }
}
```

Unit tests passed except on 3 test cases, due to an unrelated error. When I ran them through IntelliJ, they passed.

Tests in error:
  testUpgrade[5](org.apache.ambari.server.upgrade.UpgradeTest): (..)
  testUpgrade[6](org.apache.ambari.server.upgrade.UpgradeTest): (..)
  testUpgrade[7](org.apache.ambari.server.upgrade.UpgradeTest): (..)

Tests run: 2656, Failures: 0, Errors: 3, Skipped: 15

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [8.100s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.285s]
[INFO] Ambari Web ........................................ SUCCESS [43.619s]
[INFO] Ambari Views ...................................... SUCCESS [3.549s]
[INFO] Ambari Admin View ................................. SUCCESS [11.335s]
[INFO] Ambari Server ..................................... FAILURE [34:43.346s]
[INFO] Ambari Agent ...................................... SKIPPED
[INFO] Ambari Client ..................................... SKIPPED
[INFO] Ambari Python Client .............................. SKIPPED
[INFO] Ambari Groovy Client .............................. SKIPPED
[INFO] Ambari Shell ...................................... SKIPPED
[INFO] Ambari Python Shell ............................... SKIPPED
[INFO] Ambari Groovy Shell ............................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 35:52.927s
[INFO] Finished at: Wed Feb 11 11:12:53 PST 2015
[INFO] Final Memory: 44M/515M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12:test (default-test) on project ambari-server: There are test failures.


Thanks,

Alejandro Fernandez

--===============8978820632403129825==--