ambari-dev mailing list archives

From "Alejandro Fernandez" <afernan...@hortonworks.com>
Subject Review Request 35640: Install packages doesn't update actual version with build number if installation timesout on all hosts
Date Fri, 19 Jun 2015 01:37:24 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35640/
-----------------------------------------------------------

Review request for Ambari, Dmytro Sen and Nate Cole.


Bugs: AMBARI-12012
    https://issues.apache.org/jira/browse/AMBARI-12012


Repository: ambari


Description
-------

STR:
1. User registers repo version 2.3.0.0 (note that no build number is provided) and
clicks the Install button.
2. On all of the hosts, the yum commands time out (or do a partial install), so "hdp-select
versions" reports that 2 versions exist (2.2.0.0-2041 and 2.3.0.0-2800). Because the install
did not succeed, the command does not return the actual_version installed (which was 2.3.0.0-2800).
Note: I reproduced this by decreasing the timeouts in the ambari.properties file to 5 mins and
adding a sleep in install_packages.py after the first package was installed.
3. The ambari server code then changes the state of the 2.3.0.0 version it knows about to
INSTALL_FAILED so that the user can retry, but does not update the repo version with the actual
version that includes the build number.
4. User retries and this time it succeeds. However, the delta of the "hdp-select versions" outputs
is empty, so no "actual_version" is returned! This is really bad because ambari needs the build
number whenever it calls "hdp-select set <comp> <version>".
5. The ambari server code will change the state to INSTALLED.
The fix is for install_packages.py to always return the actual_version (even in the case of
a failure) so that Ambari server can correct the database entry (even if the command fails or
times out). The correction only happens on the first attempt; subsequent attempts to retry
installation use the right value, so an exact match is found in the database.
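
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in the patch: list_versions and install_packages are hypothetical callables standing in for parsing "hdp-select versions" and for the yum install step, and the result keys are assumed names for the structured output.

```python
def new_versions(before, after):
    """Return versions present in `after` but not in `before`, sorted."""
    return sorted(set(after) - set(before))


def install_and_report(list_versions, install_packages):
    """Run the install and always report the version delta, even on failure.

    Both arguments are hypothetical callables: `list_versions` stands in for
    parsing the output of "hdp-select versions", and `install_packages` for
    the yum install step, which may raise on a timeout.
    """
    before = list_versions()
    # Assumed result keys; the real structured output may differ.
    result = {"package_installation_result": "FAIL", "actual_version": None}
    try:
        install_packages()
        result["package_installation_result"] = "SUCCESS"
    except Exception as err:
        result["error"] = str(err)
    finally:
        # Computed on success and on failure alike, so the server can
        # record the build number after a timed-out first attempt.
        delta = new_versions(before, list_versions())
        if len(delta) == 1:
            result["actual_version"] = delta[0]
    return result


# Simulated host state: one version installed before the first attempt.
installed = ["2.2.0.0-2041"]


def list_versions():
    return list(installed)


def timed_out_install():
    installed.append("2.3.0.0-2800")  # first package landed on disk
    raise RuntimeError("yum timed out")


first = install_and_report(list_versions, timed_out_install)
retry = install_and_report(list_versions, lambda: None)
print(first)
print(retry)
```

In this sketch the first (failed) attempt still carries 2.3.0.0-2800 as the actual_version, while the retry sees an empty delta and reports none, which is harmless once the server has already stored the full version from the first attempt.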


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/bootstrap/DistributeRepositoriesStructuredOutput.java f1d6aad 
  ambari-server/src/main/java/org/apache/ambari/server/events/listeners/upgrade/DistributeRepositoriesActionListener.java 5600ef1 
  ambari-server/src/main/resources/custom_actions/scripts/install_packages.py f8b2308 

Diff: https://reviews.apache.org/r/35640/diff/


Testing
-------

Reproduced the issue on a live cluster and verified that the patch worked even when the agents
reported that the packages failed to be installed.

Unit tests in progress


Thanks,

Alejandro Fernandez

