ambari-dev mailing list archives

From "Alejandro Fernandez" <>
Subject Review Request 35764: Installing Repo Packages needs to be more robust to handle the actual_version installed when the script is killed because of a timeout
Date Tue, 23 Jun 2015 02:50:51 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, and Nate Cole.

Repository: ambari


When installing the bits of a new repo to perform an RU (rolling upgrade), if the repo version
does not contain a build number, then the actual_version has to be calculated.
The problem is that this information can only be retrieved by calculating the delta of the
versions reported by "hdp-select versions" before and after the install, so only the first
attempt will return a value.
If the script is killed, then the actual_version will never be calculated and returned to
the server.
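
As a rough illustration of the delta calculation described above, here is a minimal Python
sketch. It is not the code under review: the function names and the sample version strings
are hypothetical, and it assumes that "hdp-select versions" prints one version per line.

```python
def parse_versions(output):
    """Turn the text printed by "hdp-select versions" into a set of versions.

    Assumption: one version string per line; blank lines are ignored.
    """
    return {line.strip() for line in output.splitlines() if line.strip()}


def compute_actual_version(before_output, after_output):
    """Return the one version that appeared between two snapshots, else None.

    An empty delta (nothing new installed yet, or the new version was
    already present in the "before" snapshot) yields None, as does an
    ambiguous delta with more than one new version.
    """
    delta = parse_versions(after_output) - parse_versions(before_output)
    return delta.pop() if len(delta) == 1 else None


# Hypothetical sample outputs, for illustration only.
before = "2.2.0.0-2041\n"
after = "2.2.0.0-2041\n2.3.0.0-1234\n"

first_attempt = compute_actual_version(before, after)
# On a retry the "before" snapshot already contains the new version,
# so the delta is empty and nothing can be derived.
retry_attempt = compute_actual_version(after, after)
```

This is why a killed first attempt is costly: once the new version shows up in the "before"
snapshot, a later run has no delta to work with.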

For this reason, the package installer must tolerate receiving a SIGINT or SIGTERM and still
calculate the actual_version. Further, it helps to store it in a file, in case the response or
something upstream results in another failure.
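
One way to get this behavior, sketched in Python 3 under stated assumptions: convert the
signals into an exception so that the version lookup and the file write still run. This is
not the patch itself. All names here (InstallInterrupted, install_with_version_capture,
the state file path, and the two callbacks) are hypothetical.

```python
import os
import signal
import tempfile


class InstallInterrupted(Exception):
    """Raised from the signal handler so that cleanup code still runs."""


def raise_interrupted(signum, frame):
    # Turn SIGINT/SIGTERM into an exception instead of dying immediately.
    raise InstallInterrupted("received signal %d" % signum)


def save_actual_version(path, version):
    # Write to a temp file and rename it, so a kill mid-write leaves no
    # partially written state file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(version)
    os.replace(tmp_path, path)


def install_with_version_capture(install_packages, detect_version, state_file):
    """Run install_packages() and always try to record the version afterwards.

    Returns (finished, actual_version). Must be called from the main thread,
    because Python only allows signal handlers to be set there.
    """
    handled = (signal.SIGINT, signal.SIGTERM)
    previous = {sig: signal.signal(sig, raise_interrupted) for sig in handled}
    finished = False
    actual_version = None
    try:
        try:
            install_packages()
            finished = True
        except InstallInterrupted:
            pass  # fall through: the version may still be detectable
        actual_version = detect_version()
        if actual_version:
            save_actual_version(state_file, actual_version)
    finally:
        # Restore whatever handlers were active before the call.
        for sig, handler in previous.items():
            signal.signal(sig, signal.SIG_DFL if handler is None else handler)
    return finished, actual_version
```

A caller would pass the real install routine and a version lookup (for example, one based
on the "hdp-select versions" delta) as the two callbacks. A second signal arriving during
the lookup is not handled in this sketch.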


  ambari-server/src/main/resources/custom_actions/scripts/ ffe9815 



Verified that the following works by decreasing the install package timeout to 150 seconds and
trying the following combinations:

Repo version including build number:
* Failure during installing the first package. Because actual_version was already set, did
not need to call "hdp-select versions".
* Failure after installing the first package, retries eventually passed.

Repo version without build number:
* Failure during installing the first package. Because actual_version was none, it tried calling
"hdp-select versions" but the newer version was not yet installed, so was allowed to retry.
* Failure after installing the first package. Because "hdp-select versions" reported a value
in the delta, it was saved to the text file. Since it timed out, it was allowed to retry.
* Eventually, it passed.

Still need to fix unit tests.


Alejandro Fernandez
