incubator-ambari-dev mailing list archives

From "Siddharth Wagle (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMBARI-2041) If a host that has a service client installed and the host is down, service start will fail
Date Sat, 27 Apr 2013 00:26:18 GMT
Siddharth Wagle created AMBARI-2041:
---------------------------------------

             Summary: If a host that has a service client installed and the host is down, service start will fail
                 Key: AMBARI-2041
                 URL: https://issues.apache.org/jira/browse/AMBARI-2041
             Project: Ambari
          Issue Type: Bug
          Components: controller
    Affects Versions: 1.3.0
            Reporter: Siddharth Wagle
            Assignee: Siddharth Wagle
             Fix For: 1.3.0


In condor, a service start may include client installs on some hosts. If a host where a client
is being installed is down (heartbeat lost), the service start fails. This is because the
success factor for clients (tested with MAPREDUCE_CLIENT) is 1, so a single failure fails the
stage. A service start has three stages, one each for installs, starts, and the service check.
When the install stage fails, the later stages are aborted.
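
The failure mode described above can be sketched roughly as follows. This is only an
illustration, not Ambari's controller code: the Stage class, run_stages function, host names,
and status strings are all hypothetical, and the success factor is assumed to be the fraction
of tasks in a stage that must succeed.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Hypothetical model of one stage of a service start."""
    name: str
    results: dict                # host -> True if the task on that host succeeded
    success_factor: float = 1.0  # assumed: fraction of tasks that must succeed

    def succeeded(self) -> bool:
        if not self.results:
            return True
        ok = sum(1 for passed in self.results.values() if passed)
        return ok / len(self.results) >= self.success_factor


def run_stages(stages):
    """Return a status per stage; every stage after the first failure is aborted."""
    statuses = {}
    failed = False
    for stage in stages:
        if failed:
            statuses[stage.name] = "ABORTED"
        elif stage.succeeded():
            statuses[stage.name] = "COMPLETED"
        else:
            statuses[stage.name] = "FAILED"
            failed = True
    return statuses


# host2 has lost its heartbeat, so the client install task on it fails.
stages = [
    Stage("install", {"host1": True, "host2": False}),
    Stage("start", {"host1": True}),
    Stage("check", {"host1": True}),
]
statuses = run_stages(stages)
```

With a success factor of 1, the one failed client install is enough to fail the install stage,
and the start and check stages never run. Under the same assumptions, a lower success factor
would let the install stage pass despite the down host.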

A few observations:

    The client goes to the INSTALL_FAILED state, so a second attempt skips installing the client and thereby succeeds in starting the service. (This is a bug, as we should retry installing a component that is in the INSTALL_FAILED state. However, at this point we are saved by this bug.)
    A service check can be scheduled on a host that is in the UNHEALTHY/UNKNOWN state, and can therefore fail.
    The service now cannot be stopped, because:
        The stop command sees the INSTALL_FAILED state and schedules an INSTALL task for the client, which fails.
        The STOP commands for the other components are in a later stage and are aborted.
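
The blocked stop described in the last observation can be sketched in the same spirit. Again
this is an assumption-laden illustration rather than Ambari's scheduling logic: plan_stop,
execute, the host names, and the JOBTRACKER component are made up for the example; only the
INSTALL_FAILED state and MAPREDUCE_CLIENT come from the report above.

```python
def plan_stop(components):
    """Build the stages for a stop request as observed in this report (hypothetical).

    components maps a component name to its current state. Components in
    INSTALL_FAILED get an INSTALL task in an earlier stage than the STOP tasks.
    """
    install_stage = [("INSTALL", name) for name, state in components.items()
                     if state == "INSTALL_FAILED"]
    stop_stage = [("STOP", name) for name, state in components.items()
                  if state == "STARTED"]
    return [stage for stage in (install_stage, stop_stage) if stage]


def execute(stages, hosts_down, host_of):
    """Run stages in order; a task fails if its host is down, and any
    stage after a failed stage is aborted."""
    outcome = []
    aborted = False
    for stage in stages:
        if aborted:
            outcome.append("ABORTED")
            continue
        ok = all(host_of[name] not in hosts_down for _, name in stage)
        outcome.append("COMPLETED" if ok else "FAILED")
        aborted = not ok
    return outcome


components = {"MAPREDUCE_CLIENT": "INSTALL_FAILED", "JOBTRACKER": "STARTED"}
host_of = {"MAPREDUCE_CLIENT": "host2", "JOBTRACKER": "host1"}

plan = plan_stop(components)
outcome = execute(plan, hosts_down={"host2"}, host_of=host_of)
```

In this sketch the INSTALL task for the client fails because host2 is still down, so the stage
holding the STOP task is aborted and the service stays up.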


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
