ambari-dev mailing list archives

From "Dmitry Lysnichenko (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (AMBARI-9197) Ambari gets stuck / not able to cancel timed out operation
Date Tue, 21 Apr 2015 11:20:58 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-9197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Lysnichenko resolved AMBARI-9197.
----------------------------------------
    Resolution: Not A Problem

I've checked the described scenario:
- Deploy multinode cluster
- Start long-running process
- Stop one agent
- Wait until the task on the affected host is automatically aborted (5-10 minutes)
- Start the agent
- Check component and request states
- Try to issue a new request

As expected, tasks on the host with the stopped agent timed out and were aborted
automatically within 5-10 minutes. If the agent on a host is not running, CANCEL commands
cannot be sent to that host.
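
The behavior described above can be illustrated with a small toy model. This is only a
sketch, not Ambari's actual code: the names Host, Task, cancel and expire are hypothetical,
and the 600-second timeout is an assumed value matching the upper end of the 5-10 minute
window mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class TaskState(Enum):
    IN_PROGRESS = "IN_PROGRESS"
    ABORTED = "ABORTED"


@dataclass
class Host:
    name: str
    agent_running: bool = True


@dataclass
class Task:
    host: Host
    started_at: float          # seconds, hypothetical clock
    timeout: float = 600.0     # assumed value, not taken from Ambari
    state: TaskState = TaskState.IN_PROGRESS


def cancel(task: Task) -> bool:
    """Try to deliver a CANCEL command; this needs a running agent on the host."""
    if task.state is not TaskState.IN_PROGRESS:
        return False
    if not task.host.agent_running:
        return False  # no agent to receive the command, task stays as it is
    task.state = TaskState.ABORTED
    return True


def expire(task: Task, now: float) -> bool:
    """Server-side timeout: abort the task without any agent involvement."""
    if task.state is TaskState.IN_PROGRESS and now - task.started_at >= task.timeout:
        task.state = TaskState.ABORTED
        return True
    return False


if __name__ == "__main__":
    stopped = Host("host1", agent_running=False)
    task = Task(stopped, started_at=0.0)
    print(cancel(task))              # CANCEL cannot reach the stopped agent
    print(expire(task, now=300.0))   # still inside the timeout window
    print(expire(task, now=600.0))   # timeout reached, task is aborted
    print(task.state)
```

In this model the only way out for a task on a host with a stopped agent is the timeout
path, which is the outcome observed in the scenario above.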

Closing as working as designed.

> Ambari gets stuck / not able to cancel timed out operation
> ----------------------------------------------------------
>
>                 Key: AMBARI-9197
>                 URL: https://issues.apache.org/jira/browse/AMBARI-9197
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server, ambari-web
>    Affects Versions: 1.7.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>            Assignee: Dmitry Lysnichenko
>             Fix For: 2.1.0
>
>         Attachments: screenshot-1.png
>
>
> Ambari server recently gained the ability to cancel operations (AMBARI-1897), but it
> cannot cancel operations that are timing out (shown in yellow) and stays stuck in this
> state for several minutes, blocking restarts of other components.
> I've attached a screenshot which shows that there is no X next to the stalled operations
> in yellow.
> This is the result of a hang on an Ambari client (scenario documented in AMBARI-8768),
> but it highlights that Ambari server's ability to cancel operations needs hardening: it
> should be possible to cancel any operation in any state to recover the operations queue.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
