ambari-dev mailing list archives

From Jonathan Hurley <>
Subject Re: Review Request 43967: Express Upgrade Stuck At Manual Prompt Due To HRC Status Calculation Cache Problem
Date Fri, 26 Feb 2016 17:22:30 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated Feb. 26, 2016, 12:22 p.m.)

Review request for Ambari, Alejandro Fernandez, Nate Cole, Sumit Mohanty, Sebastian Toader,
and Sid Wagle.


Stupid git got all messed up on my box and was creating patches from some weird cache. Cleaned
that up and got a patch which has the actual changes.

Bugs: AMBARI-15173

Repository: ambari


Seen while performing an upgrade: it's possible that the status of a request/stage does not
match that of its tasks. Essentially, the task could be {{HOLDING}} while the request is still
{{IN_PROGRESS}}.

I believe that AMBARI-15011 is responsible for this issue. AMBARI-15011 introduced, among
other things, a cache for the {{HostRoleCommandStatusSummaryDTO}}, which is an aggregation of
the number of tasks a stage has in each state (PENDING, HOLDING, etc.).

This {{HostRoleCommandStatusSummaryDTO}} is used by {{CalculatedState}} to calculate a stage's
and request's status based on the tasks. 

The problem is that {{ServerActionExecutor}} moves a task's state to {{HOLDING}} (reflected
correctly in the database), but the cache invalidation happens inside the uncommitted transaction.
This causes stale data to be re-cached. So, when we go to calculate the request and stage
status, we get {{IN_PROGRESS}} instead of {{HOLDING}}.
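
To make the ordering problem concrete, here is a minimal, self-contained Java sketch of the
race. It is not Ambari code: the class name, the two maps, and the run() helper are all
hypothetical stand-ins, and the "transaction" is simulated by simply delaying the write to the
committed map. It only illustrates why invalidating a read-through cache before the commit lets
a reader re-cache the old value, and how invalidating after the commit avoids that. It makes no
claim about how the actual patch addresses the issue.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleCacheSketch {
    /** Committed rows visible to every reader (hypothetical stand-in for the database). */
    static final Map<Long, String> committed = new HashMap<>();

    /** Read-through cache of the calculated status, keyed by stage id (hypothetical). */
    static final Map<Long, String> cache = new HashMap<>();

    /** On a cache miss, load whatever is currently committed and remember it. */
    static String cachedStatus(long stageId) {
        return cache.computeIfAbsent(stageId, committed::get);
    }

    /**
     * Simulates one status change from IN_PROGRESS to HOLDING and returns the status
     * a reader sees after the commit.
     */
    static String run(boolean invalidateInsideTransaction) {
        committed.clear();
        cache.clear();
        long stageId = 1L;
        committed.put(stageId, "IN_PROGRESS");
        cachedStatus(stageId); // warm the cache with the old status

        // "Transaction" begins: the new status is pending and not yet visible to readers.
        String pending = "HOLDING";
        if (invalidateInsideTransaction) {
            cache.remove(stageId); // invalidation happens before the commit
        }

        // Another reader asks for the status while the transaction is still open.
        // If the entry was already invalidated, this reloads the OLD committed value.
        cachedStatus(stageId);

        committed.put(stageId, pending); // commit

        if (!invalidateInsideTransaction) {
            cache.remove(stageId); // invalidation happens after the commit
        }
        return cachedStatus(stageId);
    }

    public static void main(String[] args) {
        System.out.println("invalidate inside transaction: " + run(true));
        System.out.println("invalidate after commit:       " + run(false));
    }
}
```

In this sketch, run(true) ends with the cache still holding IN_PROGRESS even though the
committed value is HOLDING, which mirrors the mismatch in the API output below; run(false)
returns HOLDING.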

{
  "href": "*,tasks/*",
  "Stage": {
    "cluster_name": "cl1",
    "context": "Stop YARN Queues",
    "display_status": "IN_PROGRESS",
    "end_time": -1,
    "progress_percent": 35,
    "request_id": 61,
    "skippable": true,
    "stage_id": 1,
    "start_time": 1456227329191,
    "status": "IN_PROGRESS"
  },
  "tasks": [
    {
      "href": "",
      "Tasks": {
        "attempt_cnt": 1,
        "cluster_name": "cl1",
        "command": "EXECUTE",
        "command_detail": "Before continuing, please stop all YARN queues. If yarn-site's is set to true, then you can skip this step since the clients will retry on their own.",
        "custom_command_name": "org.apache.ambari.server.serveraction.upgrades.ManualStageAction",
        "end_time": -1,
        "error_log": "errors-754.txt",
        "exit_code": 0,
        "host_name": "os-r6-mkqzcs-c10tom21unsecha-6.novalocal",
        "id": 754,
        "output_log": "output-754.txt",
        "request_id": 61,
        "role": "AMBARI_SERVER_ACTION",
        "stage_id": 1,
        "start_time": 1456227329191,
        "status": "HOLDING",
        "stderr": "",
        "stdout": "",
        "structured_out": {}
      }
    }
  ]
}

Diffs (updated)

  ambari-server/src/main/java/com/google/inject/persist/jpa/ 604546c




Pending unit tests...


Jonathan Hurley
