ambari-dev mailing list archives

From "Alejandro Fernandez" <afernan...@hortonworks.com>
Subject Re: Review Request 30854: RU: Finalize fails because Secondary Namenode was not deleted during HA Wizard
Date Thu, 12 Feb 2015 01:52:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30854/
-----------------------------------------------------------

(Updated Feb. 12, 2015, 1:52 a.m.)


Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, Robert Levas, Tom
Beerbower, and Yurii Shylov.


Changes
-------

I was able to reproduce this and realized that we need another method of checking if the SECONDARY_NAMENODE
component is still on the cluster.


Bugs: AMBARI-9559
    https://issues.apache.org/jira/browse/AMBARI-9559


Repository: ambari


Description
-------

While performing a Rolling Upgrade (RU), Finalize failed because the SECONDARY_NAMENODE component
was still on an old version; the Namenode HA wizard had not deleted it.
This was a 3-node cluster with HDFS, MR, YARN, TEZ, Hive, ZK, Pig, AMS using HDP 2.2.0.0 GA.

I was not able to reproduce this, even after installing the exact same services on the same
hosts and using the same ambari-server version. It could be that some thread is holding a
reference to the object being deleted, so it persists. I therefore added another check to
PreUpgradeCheckResourceProvider that ensures that SECONDARY_NAMENODE is not present on any
hosts.
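
To illustrate the idea behind the added check, here is a minimal, self-contained sketch. It is not the
code in SecondaryNamenodeDeletedCheck.java and does not use Ambari's classes: the class name, the
Result holder, the check method, and the host-to-components map are all hypothetical, and the "PASS"
status is an assumption (the output in this message only shows "FAIL").

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class SecondaryNamenodeCheckSketch {

    /** Hypothetical result holder; not Ambari's actual check result type. */
    static final class Result {
        final String status; // "FAIL" as in the message; "PASS" is an assumed counterpart
        final String reason;

        Result(String status, String reason) {
            this.status = status;
            this.reason = reason;
        }
    }

    /**
     * Fails if any host still lists SECONDARY_NAMENODE among its components.
     * The map (host name -> component names) stands in for whatever cluster
     * model the real check reads from.
     */
    static Result check(Map<String, Set<String>> componentsByHost) {
        // TreeSet keeps the reported host names in a stable, sorted order.
        Set<String> offendingHosts = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : componentsByHost.entrySet()) {
            if (entry.getValue().contains("SECONDARY_NAMENODE")) {
                offendingHosts.add(entry.getKey());
            }
        }
        if (offendingHosts.isEmpty()) {
            return new Result("PASS", "");
        }
        return new Result("FAIL",
            "The SECONDARY_NAMENODE component must be deleted from host(s): "
                + String.join(", ", offendingHosts)
                + ". Please use the REST API to delete it.");
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cluster = new LinkedHashMap<>();
        cluster.put("c6401.ambari.apache.org",
            new HashSet<>(Arrays.asList("NAMENODE", "ZOOKEEPER_SERVER")));
        cluster.put("c6402.ambari.apache.org",
            new HashSet<>(Arrays.asList("NAMENODE", "SECONDARY_NAMENODE")));

        Result result = check(cluster);
        System.out.println(result.status + ": " + result.reason);
    }
}
```

Checking every host, rather than only the service-level component definition, is the point of the
sketch: a stale host-level record is enough to make the check fail.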


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/checks/SecondaryNamenodeDeletedCheck.java
ad967f8 

Diff: https://reviews.apache.org/r/30854/diff/


Testing
-------

Deployed a 3-node cluster with HDFS, ZK, and added Namenode HA. Then registered a repo and
distributed it to the hosts. Next, I stopped ambari-server and manually edited the database
to simulate SECONDARY_NAMENODE still being present.

INSERT INTO hostcomponentdesiredstate VALUES (2, 'SECONDARY_NAMENODE', E'{\"stackName\":\"HDP\",\"stackVersion\":\"2.2\"}',
'INSTALLED', 'c6402.ambari.apache.org', 'HDFS', '', 'OFF', 'UNSECURED', 0);
INSERT INTO hostcomponentstate VALUES (2, 'SECONDARY_NAMENODE', '2.2.0.0-2041', E'{\"stackName\":\"HDP\",\"stackVersion\":\"2.2\"}',
'INSTALLED', 'c6402.ambari.apache.org', 'HDFS', 'NONE', 'UNSECURED');

I then queried http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/rolling_upgrades_check?fields=*&UpgradeChecks/repository_version=2.2.1.0-2260&_=1423612069190
and got the following, as expected:

```
{
  href: "http://c6401.ambari.apache.org:8080/api/v1/clusters/c1/rolling_upgrades_check/SECONDARY_NAMENODE_MUST_BE_DELETED",
  UpgradeChecks: {
    check: "The SECONDARY_NAMENODE component must be deleted from all hosts",
    check_type: "SERVICE",
    cluster_name: "c1",
    failed_on: [
      "SECONDARY_NAMENODE"
    ],
    id: "SECONDARY_NAMENODE_MUST_BE_DELETED",
    reason: "The SECONDARY_NAMENODE component must be deleted from host(s): c6402.ambari.apache.org. Please use the REST API to delete it.",
    repository_version: "2.2.1.0-2260",
    status: "FAIL"
  }
}
```
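
For anyone repeating this test, the request URL used above can be assembled from its parts. The
sketch below is only an illustration: the class and method names are hypothetical, the trailing
`_=...` parameter from the original URL is omitted on the assumption that it is a browser
cache-buster, and no request is actually sent.

```java
import java.net.URI;

public class UpgradeCheckUrlSketch {

    /**
     * Builds the rolling_upgrades_check URL for a cluster and target repository version.
     * The path and query parameter names are copied from the URL quoted in this message.
     */
    static URI buildCheckUri(String server, String clusterName, String repositoryVersion) {
        String query = "fields=*&UpgradeChecks/repository_version=" + repositoryVersion;
        return URI.create(server + "/api/v1/clusters/" + clusterName
            + "/rolling_upgrades_check?" + query);
    }

    public static void main(String[] args) {
        URI uri = buildCheckUri("http://c6401.ambari.apache.org:8080", "c1", "2.2.1.0-2260");
        System.out.println(uri);
    }
}
```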

Unit tests passed except for 3 test cases, which failed due to an unrelated error. When I ran
them through IntelliJ, they passed.


Tests in error:
  testUpgrade[5](org.apache.ambari.server.upgrade.UpgradeTest): (..)
  testUpgrade[6](org.apache.ambari.server.upgrade.UpgradeTest): (..)
  testUpgrade[7](org.apache.ambari.server.upgrade.UpgradeTest): (..)

Tests run: 2656, Failures: 0, Errors: 3, Skipped: 15

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Ambari Main ....................................... SUCCESS [8.100s]
[INFO] Apache Ambari Project POM ......................... SUCCESS [0.285s]
[INFO] Ambari Web ........................................ SUCCESS [43.619s]
[INFO] Ambari Views ...................................... SUCCESS [3.549s]
[INFO] Ambari Admin View ................................. SUCCESS [11.335s]
[INFO] Ambari Server ..................................... FAILURE [34:43.346s]
[INFO] Ambari Agent ...................................... SKIPPED
[INFO] Ambari Client ..................................... SKIPPED
[INFO] Ambari Python Client .............................. SKIPPED
[INFO] Ambari Groovy Client .............................. SKIPPED
[INFO] Ambari Shell ...................................... SKIPPED
[INFO] Ambari Python Shell ............................... SKIPPED
[INFO] Ambari Groovy Shell ............................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 35:52.927s
[INFO] Finished at: Wed Feb 11 11:12:53 PST 2015
[INFO] Final Memory: 44M/515M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12:test (default-test)
on project ambari-server: There are test failures.


Thanks,

Alejandro Fernandez

