cloudstack-dev mailing list archives

From Wei ZHOU <ustcweiz...@gmail.com>
Subject Re: Orphaned libvirt storage pools
Date Wed, 12 Jun 2013 16:26:59 GMT
Wido,

Could you tell me which libvirt version you are running?
On our platform, which shows this issue, the libvirt version is 0.9.13.

-Wei


2013/6/7 Marcus Sorensen <shadowsor@gmail.com>

> There is already quite a bit of logging around this stuff, for example:
>
>                 s_logger.error("deleteStoragePool removed pool from libvirt, but libvirt had trouble"
>                                + "unmounting the pool. Trying umount location " + targetPath
>                                + "again in a few seconds");
>
> And if it gets an error from libvirt during create stating that the
> mountpoint is in use, agent attempts to unmount before remounting. Of
> course this would fail if it is in use.
>
>             // if error is that pool is mounted, try to handle it
>             if (e.toString().contains("already mounted")) {
>                 s_logger.error("Attempting to unmount old mount libvirt is unaware of at "+targetPath);
>                 String result = Script.runSimpleBashScript("umount " + targetPath );
>                 if (result == null) {
>                     s_logger.error("Succeeded in unmounting " + targetPath);
>                     try {
>                         sp = conn.storagePoolCreateXML(spd.toString(), 0);
>                         s_logger.error("Succeeded in redefining storage");
>                         return sp;
>                     } catch (LibvirtException l) {
>                         s_logger.error("Target was already mounted, unmounted it but failed to redefine storage:" + l);
>                     }
>                 } else {
>                     s_logger.error("Failed in unmounting and redefining storage");
>                 }
>             }
>
>
> Do you think it was related to the upgrade process itself (e.g. maybe
> the storage pools didn't carry across the libvirt upgrade)? Can you
> reproduce it outside of the upgrade?
>
> On Fri, Jun 7, 2013 at 8:43 AM, Wido den Hollander <wido@widodh.nl> wrote:
> > Hi,
> >
> >
> > On 06/07/2013 04:30 PM, Marcus Sorensen wrote:
> >>
> >> Does this only happen with isos?
> >
> >
> > Yes, it does.
> >
> > My work-around for now was to locate all the Instances that had these ISOs
> > attached and detach the ISOs from all of them (~100 instances).
> >
> > Then I manually unmounted all the mountpoints under /mnt so that they can be
> > re-used.
> >
> > This cluster was upgraded to 4.1 from 4.0 with libvirt 1.0.2 (coming from
> > 0.9.8).
> >
> > Somehow libvirt forgot about these storage pools.
> >
> > Wido
> >
> >> On Jun 7, 2013 8:15 AM, "Wido den Hollander" <wido@widodh.nl> wrote:
> >>
> >>> Hi,
> >>>
> >>> So, I just created CLOUDSTACK-2893, but Wei Zhou mentioned that there are
> >>> some related issues:
> >>> * CLOUDSTACK-2729
> >>> * CLOUDSTACK-2780
> >>>
> >>> I restarted my Agent and the issue described in 2893 went away, but I'm
> >>> wondering how that happened.
> >>>
> >>> Anyway, after going further I found that I have some "orphaned" storage
> >>> pools. By that I mean they are mounted and in use, but neither defined nor
> >>> active in libvirt:
> >>>
> >>> root@n02:~# lsof |grep "\.iso"|awk '{print $9}'|cut -d '/' -f 3|sort -n|uniq
> >>> eb3cd8fd-a462-35b9-882a-f4b9f2f4a84c
> >>> f84e51ab-d203-3114-b581-247b81b7d2c1
> >>> fd968b03-bd11-3179-a2b3-73def7c66c68
> >>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d
> >>> 7dc0149e-0281-3353-91eb-4589ef2b1ec1
> >>> 8e005344-6a65-3802-ab36-31befc95abf3
> >>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593
> >>> 765e63d7-e9f9-3203-bf4f-e55f83fe9177
> >>> 1287a27d-0383-3f5a-84aa-61211621d451
> >>> 98622150-41b2-3ba3-9c9c-09e3b6a2da03
> >>>
> >>> root@n02:~#
> >>>
> >>> Looking at libvirt:
> >>> root@n02:~# virsh pool-list
> >>> Name                 State      Autostart
> >>> -----------------------------------------
> >>> 52801816-fe44-3a2b-a147-bb768eeea295 active     no
> >>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d active     no
> >>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593 active     no
> >>> a83d1100-4ffa-432a-8467-4dc266c4b0c8 active     no
> >>> fd968b03-bd11-3179-a2b3-73def7c66c68 active     no
> >>>
> >>>
> >>> root@n02:~#
> >>>
> >>> What happens here is that the mountpoints are in use (ISO attached to
> >>> Instance) but there is no storage pool in libvirt.
> >>>
> >>> This means that when you try to deploy a second VM with the same ISO,
> >>> libvirt will error out: the Agent will try to create and start a new
> >>> storage pool, which fails because the mountpoint is already in use.
> >>>
> >>> The remedy would be to take the hypervisor into maintenance, reboot it
> >>> completely and migrate Instances to it again.
> >>>
> >>> In libvirt there is no way to start an NFS storage pool without libvirt
> >>> mounting it.
> >>>
> >>> Any suggestions on how we can work around this code-wise?
> >>>
> >>> For my issue I'm writing a patch which adds some more debug lines to show
> >>> what the Agent is doing, but it's kind of weird that we got into this
> >>> "disconnected" state.
> >>>
> >>> Wido
> >>>
> >>
> >
>
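
For reference, the orphan check Wido did by hand above can be sketched in a
few lines of Java: take the pool UUIDs that show up as mounted (the lsof
listing), take the pool names libvirt reports (virsh pool-list), and flag
whatever is mounted but unknown to libvirt. This is only a sketch under
stated assumptions, not code from the CloudStack agent: the class and method
names (OrphanedPools, parsePoolList, findOrphans) are hypothetical, the
parser assumes the plain-text table layout quoted in this thread, and the
data in main is copied from the two listings above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the CloudStack agent.
final class OrphanedPools {

    // Extracts pool names from the text output of `virsh pool-list`.
    // Assumes the layout quoted in this thread: a "Name State Autostart"
    // header, a dashed separator, then one pool per line.
    public static Set<String> parsePoolList(String virshOutput) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : virshOutput.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("---")) {
                continue; // blank line or separator row
            }
            String firstColumn = trimmed.split("\\s+")[0];
            if (firstColumn.equals("Name")) {
                continue; // header row (assumes no pool is literally named "Name")
            }
            names.add(firstColumn);
        }
        return names;
    }

    // Returns the mounted pool UUIDs that libvirt does not list.
    public static Set<String> findOrphans(Set<String> mounted, Set<String> libvirtPools) {
        Set<String> orphans = new LinkedHashSet<>(mounted);
        orphans.removeAll(libvirtPools);
        return orphans;
    }

    public static void main(String[] args) {
        // Mounted pool UUIDs, copied from the lsof listing in this thread.
        Set<String> mounted = new LinkedHashSet<>(Arrays.asList(
            "eb3cd8fd-a462-35b9-882a-f4b9f2f4a84c",
            "f84e51ab-d203-3114-b581-247b81b7d2c1",
            "fd968b03-bd11-3179-a2b3-73def7c66c68",
            "7ceb73e5-5ab1-3862-ad6e-52cb986aff0d",
            "7dc0149e-0281-3353-91eb-4589ef2b1ec1",
            "8e005344-6a65-3802-ab36-31befc95abf3",
            "88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593",
            "765e63d7-e9f9-3203-bf4f-e55f83fe9177",
            "1287a27d-0383-3f5a-84aa-61211621d451",
            "98622150-41b2-3ba3-9c9c-09e3b6a2da03"));

        // `virsh pool-list` output, copied from this thread.
        String virshOutput = String.join("\n",
            "Name                 State      Autostart",
            "-----------------------------------------",
            "52801816-fe44-3a2b-a147-bb768eeea295 active     no",
            "7ceb73e5-5ab1-3862-ad6e-52cb986aff0d active     no",
            "88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593 active     no",
            "a83d1100-4ffa-432a-8467-4dc266c4b0c8 active     no",
            "fd968b03-bd11-3179-a2b3-73def7c66c68 active     no");

        for (String uuid : findOrphans(mounted, parsePoolList(virshOutput))) {
            System.out.println("mounted but unknown to libvirt: " + uuid);
        }
    }
}
```

Run against the two listings above, this reports seven of the ten mounted
pools as unknown to libvirt; only fd968b03, 7ceb73e5 and 88ddd8f5 appear in
both. Whether a check like this belongs in the agent, and where, is an open
question here and not something the thread settles.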
