Mailing-List: contact issues-help@cloudstack.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cloudstack.apache.org
Date: Thu, 24 Mar 2016 11:07:25 +0000 (UTC)
From: "Rohit Yadav (JIRA)" <jira@apache.org>
To: cloudstack-issues@incubator.apache.org
Message-ID: <JIRA.12903920.1444432458000.40095.1458817645485@Atlassian.JIRA>
In-Reply-To: <JIRA.12903920.1444432458000@Atlassian.JIRA>
References: <JIRA.12903920.1444432458000@Atlassian.JIRA>
 <JIRA.12903920.1444432458674@arcas>
Subject: [jira] [Assigned] (CLOUDSTACK-8943) KVM HA is broken, let's fix it
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/CLOUDSTACK-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohit Yadav reassigned CLOUDSTACK-8943:
---------------------------------------

    Assignee: Rohit Yadav

> KVM HA is broken, let's fix it
> ------------------------------
>
>                 Key: CLOUDSTACK-8943
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8943
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>         Environment: Linux distros with KVM/libvirt
>            Reporter: Nux
>            Assignee: Rohit Yadav
>
> Currently, KVM HA works by monitoring an NFS-based heartbeat file. This check often fails when the network share becomes slow, causing the hypervisors to reboot.
> This is particularly annoying when other kinds of primary storage are in place and working fine (for example, people running Ceph).
> Having to wait for the affected HV that triggered this to come back and declare that it is not running VMs is a bad idea; this HV could require hours or days of maintenance!
> This is embarrassing. How can we fix it? Ideas, suggestions? How are other hypervisors doing it?
> Let's discuss, test, implement. :)
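
For discussion, the heartbeat-file idea described in the issue can be sketched roughly as below. This is only an illustration under stated assumptions, not CloudStack's actual implementation: the function names, the file layout (one timestamp per file), and the timeout value are all hypothetical.

```python
def write_heartbeat(path: str, now: float) -> None:
    """Record the host's latest heartbeat timestamp in a file on the share.

    Hypothetical helper: 'path' stands in for a per-host file on the NFS share.
    """
    with open(path, "w") as f:
        f.write(f"{now:.3f}\n")


def read_heartbeat(path: str) -> float:
    """Return the timestamp stored by write_heartbeat()."""
    with open(path) as f:
        return float(f.read().strip())


def heartbeat_is_stale(path: str, now: float, timeout_s: float) -> bool:
    """Return True if the heartbeat is missing, unreadable, or too old.

    A slow or unreachable share produces the same result as a dead host here,
    which is the weakness the issue describes.
    """
    try:
        last = read_heartbeat(path)
    except (OSError, ValueError):
        return True
    return (now - last) > timeout_s
```

In this sketch a monitor would call heartbeat_is_stale() periodically and treat a stale result as a host failure. Because the check cannot tell a slow share from a failed hypervisor, any fix would presumably need a second signal beyond the heartbeat file alone.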

