cloudstack-dev mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: Re: RBD primary storage VM encounters Exclusive Lock after triggering HA
Date Tue, 28 May 2019 13:12:07 GMT
Thx Wido!

On Tue, 28 May 2019 at 13:51, Wido den Hollander <wido@widodh.nl> wrote:

>
>
> On 5/28/19 1:48 PM, li jerry wrote:
> > Hi Wido
> >
> >
> >
> > The key I filled in for CloudStack is the following:
> >
> >
> >
> > [root@cn01-nodeb ~]# ceph auth get client.cloudstack
> >
> > exported keyring for client.cloudstack
> >
> > [client.cloudstack]
> >
> >       key = AQDTh7pcIJjNIhAAwk8jtxilJWXQR7osJRFMLw==
> >
> >       caps mon = "allow r"
> >
> >       caps osd = "allow rwx pool=rbd"
> >
> >
>
> That's the problem :-) Your user needs to be updated.
>
> The caps should be:
>
> [client.cloudstack]
>      key = AQDTh7pcIJjNIhAAwk8jtxilJWXQR7osJRFMLw==
>      caps mon = "profile rbd"
>      caps osd = "profile rbd pool=rbd"
>
> See: http://docs.ceph.com/docs/master/rbd/rbd-cloudstack/
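>
> For reference, a minimal sketch of applying the updated caps to the
> existing user in place (using the same client.cloudstack user and rbd
> pool shown above):
>
> $ ceph auth caps client.cloudstack mon 'profile rbd' osd 'profile rbd pool=rbd'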
>
> This will allow the client to blacklist the other client and take over
> the exclusive-lock.
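>
> If you want to verify that such a takeover happened, a quick sketch:
> the current blacklist entries can be listed with:
>
> $ ceph osd blacklist ls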
>
> Wido
>
> >
> > *From: *Wido den Hollander <mailto:wido@widodh.nl>
> > *Sent: *May 28, 2019 19:42
> > *To: *dev@cloudstack.apache.org <mailto:dev@cloudstack.apache.org>;
> > li jerry <mailto:div8cn@hotmail.com>; users@cloudstack.apache.org
> > <mailto:users@cloudstack.apache.org>
> > *Subject: *Re: RBD primary storage VM encounters Exclusive Lock after
> > triggering HA
> >
> >
> >
> >
> >
> > On 5/28/19 6:16 AM, li jerry wrote:
> >> Hello guys
> >>
> >> We’ve deployed an environment with CloudStack 4.11.2 and KVM
> >> (CentOS 7.6), and Ceph 13.2.5 is deployed as the primary storage.
> >> We found some issues with the HA solution, and we are here to ask for
> >> your suggestions.
> >>
> >> We’ve enabled both the VM HA and Host HA features in CloudStack, and the
> >> compute offering is tagged as ha.
> >> When we perform a power-failure test (unplugging 1 node of 4), the
> >> running VMs on the removed node are automatically rescheduled to the other
> >> living nodes after 5 minutes, but none of them can boot into the OS. We
> >> found the boot procedure is stuck on I/O read/write failures.
> >>
> >>
> >>
> >> The following output appears after the VM starts:
> >>
> >> Generating "/run/initramfs/rdsosreport.txt"
> >>
> >> Entering emergency mode. Exit the shell to continue.
> >> Type "journalctl" to view system logs.
> >> You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick
> >> or /boot
> >> after mounting them and attach it to a bug report
> >>
> >> :/#
> >>
> >>
> >>
> >> We found this is caused by the lock on the image:
> >> [root@cn01-nodea ~]# rbd lock list a93010b0-2be2-49bd-b25e-ec89b3a98b4b
> >> There is 1 exclusive lock on this image.
> >> Locker         ID                  Address
> >> client.1164351 auto 94464726847232 10.226.16.128:0/3002249644
> >>
> >> If we remove the lock from the image and restart the VM under
> >> CloudStack, the VM boots successfully.
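> >>
> >> For reference, a sketch of that removal step, taking the image, lock
> >> ID, and locker from the listing above:
> >>
> >> [root@cn01-nodea ~]# rbd lock remove a93010b0-2be2-49bd-b25e-ec89b3a98b4b "auto 94464726847232" client.1164351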
> >>
> >> We know that disabling the Exclusive Lock feature in Ceph (by setting
> >> rbd_default_features = 3) would solve this problem, but we don’t
> >> think it’s the best solution for HA. Could you please give us some
> >> ideas about how you handle this and what the best practice is for this
> >> feature?
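> >>
> >> For concreteness, a sketch of that workaround in ceph.conf (3 is the
> >> sum of the layering (1) and striping (2) feature bits, so
> >> exclusive-lock is dropped; note this only affects newly created
> >> images):
> >>
> >> [client]
> >> rbd default features = 3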
> >>
> >
> > exclusive-lock is there to prevent a split-brain, where two clients
> > write to the same image at the same time.
> >
> > The lock should be released to the other client if this is requested,
> > but I have the feeling that you might have a cephx problem there.
> >
> > Can you post the output of:
> >
> > $ ceph auth get client.X
> >
> > Where you replace X with the user you are using for CloudStack. Also
> > remove the 'key'; I don't need that.
> >
> > I want to look at the caps of the user.
> >
> > Wido
> >
> >> Thanks.
> >>
> >>
> >
> >
> >
>


-- 

Andrija Panić
