cloudstack-dev mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: Automatic KVM host reboot on Primary Storage failure
Date Fri, 14 Nov 2014 17:32:19 GMT
Hi Marcus, thanks for explaining.

maybe a side question: "like storage/host tags to guarantee each host only
uses one NFS" - what do you mean by this? That is, how would you implement
it? I know of tags, but I only know how to make sure certain Compute/Disk
offerings use certain Compute/Storage hosts.

Not sure how to make certain hosts use certain NFS servers...?
Thanks anyway,
Andrija

On 14 November 2014 18:18, Marcus <shadowsor@gmail.com> wrote:

> It is there (I believe) because CloudStack is acting as a cluster manager
> for KVM. It is using NFS to determine if it is 'alive' on the network, and
> if it is not, it reboots itself to avoid a split-brain scenario where VMs
> start coming up on other hosts while they are still running on this host.
> It generally works if the problem is the host, but as you point out,
> there's a situation where the problem can be the NFS server. This is
> fairly rare for enterprise NFS with high availability, but a fair number
> of people run NFS on servers that are relatively low availability
> (non-clustered, or prone to getting overloaded and unresponsive).
>
> There's plenty of room for improvement in that script. I agree the
> original implementation seems fairly rudimentary, but we have to be
> careful to think through all scenarios and make sure there's no chance of
> split brain. In the meantime, one could also partition the resources so
> that you have more clusters and only one primary storage per cluster (or
> something else, like storage/host tags to guarantee each host only uses
> one NFS).
>
> On Fri, Nov 14, 2014 at 8:07 AM, Andrija Panic <andrija.panic@gmail.com>
> wrote:
>
> > Hi guys,
> >
> > I'm wondering why there is a check inside
> > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
> > ?
> >
> > I understand that the KVM host checks availability of Primary Storage,
> > and reboots itself if it can't write to storage.
> >
> > But if we have, say, 3 NFS servers in a cluster and a lot of KVM hosts,
> > then 1 primary storage going down (server crashing or whatever) will
> > probably bring 99% of the KVM hosts down for a reboot?
> > So instead of losing uptime for 1/3 of my VMs (1 storage out of 3), I
> > lose uptime for 99%-100% of my VMs?
> >
> > I manually edited this script to disable reboots - but why is it there
> > at all?
> > It doesn't make sense to me - unless I'm missing a point (probably)...
> >
> > Thanks,
> > --
> >
> > Andrija Panić
> >
>
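[Editor's note] The check being discussed boils down to a periodic write to a heartbeat file on the NFS-mounted primary storage, with a self-reboot as the fencing action when the write fails. A minimal sketch of that logic follows - this is not the actual kvmheartbeat.sh; the mount path, retry count, and timeout are placeholders, and the default path falls back to a temp directory only so the sketch is runnable:

```shell
#!/bin/sh
# Simplified sketch of NFS heartbeat fencing. In real use HB_DIR would be
# the NFS-mounted primary storage (e.g. a /mnt/<pool-uuid> mount); here it
# defaults to a local temp dir purely so the sketch can run standalone.
HB_DIR="${HB_DIR:-$(mktemp -d)}"
HB_FILE="$HB_DIR/hb-$(hostname)"
RETRIES=5          # placeholder values, not those of the real script
INTERVAL=1         # seconds between attempts

check_heartbeat() {
    i=0
    while [ "$i" -lt "$RETRIES" ]; do
        # timeout guards against a write that hangs on a dead NFS mount
        if timeout 30 sh -c "date +%s > '$HB_FILE'"; then
            return 0                  # storage writable: host is "alive"
        fi
        i=$((i + 1))
        sleep "$INTERVAL"
    done
    return 1                          # every attempt failed
}

if check_heartbeat; then
    echo "heartbeat OK"
else
    # Fencing action: reboot to avoid split brain (the step Andrija
    # disabled). Left commented out in this sketch.
    echo "heartbeat FAILED - would reboot"
    # reboot -f
fi
```

The trade-off the thread describes is visible here: the reboot fences a genuinely dead host, but a dead NFS server makes every host that heartbeats against it fail the check simultaneously.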


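[Editor's note] Marcus's partitioning suggestion - more clusters, one primary storage per cluster - can be sketched with the CloudStack API via cloudmonkey. The IDs, names, and NFS URLs below are placeholders, and the exact parameter names should be verified against your CloudStack version's API reference:

```shell
# Hypothetical sketch: one KVM cluster per NFS server, so that an outage
# of one primary storage only fences the hosts in its own cluster.

# Cluster backed only by nfs1
cloudmonkey add cluster zoneid=<zone-id> podid=<pod-id> \
    clustername=kvm-cluster1 clustertype=CloudManaged hypervisor=KVM

cloudmonkey create storagepool zoneid=<zone-id> clusterid=<cluster1-id> \
    name=primary-nfs1 url=nfs://nfs1.example.com/export/primary scope=cluster

# Repeat per NFS server: kvm-cluster2 + nfs2, kvm-cluster3 + nfs3, ...
```

With this layout, losing one NFS server reboots at most the hosts of one cluster (roughly 1/3 of capacity in the three-server example), rather than nearly all hosts.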
