cloudstack-dev mailing list archives

From Ivan Kudryavtsev <kudryavtsev...@bw-sw.com>
Subject Re: Need to ask for help again (Migration in cloudstack)
Date Mon, 02 Oct 2017 14:30:22 GMT
Hi. Just don't compare 1G with 10G or even 40G InfiniBand networks. It might
look like linear bandwidth growth should lead to a proportional decrease in
migration time, but a migration can get stuck forever on 1G and finish in
seconds on 10G or 40G.
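
To illustrate why the gain is not linear, here is a rough Python sketch of
pre-copy migration. It is only a toy model, not how QEMU actually accounts for
memory: it assumes every round resends whatever was dirtied during the
previous round, and that the VM is paused once the remainder fits into a
downtime budget. The function name, the 0.3 s downtime budget and the
2 Gbit/s dirty rate are made-up illustration values.

```python
def precopy_seconds(ram_gb, dirty_gbps, link_gbps,
                    max_downtime_s=0.3, max_rounds=100):
    """Estimate pre-copy migration time in seconds, or None if the model
    never converges within max_rounds.

    ram_gb      VM memory in gigabytes
    dirty_gbps  rate at which the guest dirties memory, in Gbit/s (assumed)
    link_gbps   usable migration bandwidth, in Gbit/s
    """
    total_gbit = ram_gb * 8.0
    remaining_gbit = total_gbit          # first round copies all RAM
    elapsed = 0.0
    for _ in range(max_rounds):
        round_s = remaining_gbit / link_gbps
        elapsed += round_s
        if round_s <= max_downtime_s:
            # Small enough to send while the VM is paused: migration done.
            return elapsed
        # Memory dirtied while this round was on the wire must be resent.
        remaining_gbit = min(total_gbit, dirty_gbps * round_s)
    return None


if __name__ == "__main__":
    for link in (1.0, 10.0, 40.0):
        print(link, "Gbit/s ->", precopy_seconds(64, 2.0, link))
```

In this model a 64GB VM dirtying 2 Gbit/s never finishes on a 1G link, because
it dirties memory faster than the link can copy it, while on 10G each round
shrinks by a factor of five and the whole thing completes in about a minute.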

But indeed, auto-convergence is a great feature.
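
A rough Python sketch of the idea behind auto-convergence, in the spirit of
the behaviour described in the quoted message below: when a round does not
shrink the remaining dirty memory enough, the guest CPU is throttled a bit
more, which lowers the dirty rate until the copy can finish. This is a toy
model under my own assumptions, not QEMU's real algorithm: the function name,
the "dirtied more than half of what was sent" trigger, the assumption that
the dirty rate scales linearly with the unthrottled CPU share, and the
20/10/99 percent steps are all illustrative.

```python
def autoconverge_rounds(ram_gb, dirty_gbps, link_gbps, max_downtime_s=0.3,
                        initial_pct=20, increment_pct=10, max_pct=99,
                        max_rounds=200):
    """Return (rounds, final_throttle_pct), or None if max_rounds is hit.

    Assumption: the guest's dirty rate scales linearly with the share of
    CPU time it is still allowed to use.
    """
    total_gbit = ram_gb * 8.0
    remaining = total_gbit
    throttle = 0                         # percent of guest CPU time removed
    for rounds in range(1, max_rounds + 1):
        round_s = remaining / link_gbps
        if round_s <= max_downtime_s:
            return rounds, throttle      # final stop-and-copy fits the budget
        effective_dirty = dirty_gbps * (1 - throttle / 100.0)
        dirtied = min(total_gbit, effective_dirty * round_s)
        if dirtied > remaining / 2:
            # Not converging fast enough: throttle the guest a bit harder.
            if throttle == 0:
                throttle = initial_pct
            else:
                throttle = min(max_pct, throttle + increment_pct)
        remaining = dirtied
    return None


if __name__ == "__main__":
    # Same hypothetical 64GB VM dirtying 2 Gbit/s over a 1 Gbit/s link.
    print(autoconverge_rounds(64, 2.0, 1.0))
```

With throttling disabled (initial_pct=0, increment_pct=0) the same VM never
converges in this model, which is the "stuck forever" case.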

On 2 Oct 2017 at 20:32, "Andrija Panic" <andrija.panic@gmail.com> wrote:

> BTW, I went extreme and tested migrating a busy 24-CPU/60GB VM with dynamic
> auto-convergence (qemu 2.5/libvirt 1.3.1 and a nice patch to activate the
> autoconverge flag inside ACS, thanks to Mike Tutkowski!), where right after
> the first migration cycle of 58GB of RAM finishes (58GB RAM = Prime95
> workload on all 24 CPUs), yet another 58GB of modified RAM needs to be
> migrated :D
>
> So it really works like a charm :)
>
> On 2 October 2017 at 15:29, Andrija Panic <andrija.panic@gmail.com> wrote:
>
> > Hi Ivan,
> >
> > yes, you are right, but it works like crap (from a downtime perspective),
> > because when we could not live migrate one 64GB client VM "normally", we
> > manually (instead of ACS doing it...) paused the VM via virsh, and then the
> > VM was in the paused state for 15 min (yes, it was only a 1Gbps management
> > network at that time), so the VM was down for 15 min... and that is
> > unacceptable for the client.
> >
> > So dynamic auto-convergence works in the following way (based on my
> > experience monitoring migration cycles and CPU cycles with my 4 eyes :) ):
> > it slowly throttles the CPU, more and more, but very gently... until it
> > decides enough is enough, and then after e.g. 16-30 migration iterations
> > (with almost the full RAM being migrated each iteration), it throttles the
> > CPU aggressively and lets the VM migration finish (without downtime, except
> > for a final pause of a few tens of milliseconds or less).
> >
> > Again, just my experience, because we do have many "enterprise workload"
> > customers, and it was a pain until we got this working well (imagine host
> > maintenance mode also not working fully for those VMs...)
> >
> > Cheers
> >
> > On 2 October 2017 at 14:55, Dmitriy Kaluzhniy <dmitriy.kaluzhniy@gmail.com> wrote:
> >
> >> Hello!
> >> I want to say thanks to all!
> >> Lately I have had no time to work on this, but I hope to set up a test
> >> environment to try live migration + migration on non-shared storage.
> >>
> >> 2017-10-02 13:50 GMT+03:00 Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>:
> >>
> >> > AFAIK ACS has a VM suspend parameter in the KVM agent which kicks in when
> >> > ACS is unable to migrate successfully. Also, I have almost no problems
> >> > with 8-core/16GB migration over 10G, but you are right. Sometimes it
> >> > doesn't work as expected without auto-convergence, and a newer Qemu/KVM
> >> > does the job.
> >> >
> >> > 2017-10-02 17:44 GMT+07:00 Andrija Panic <andrija.panic@gmail.com>:
> >> >
> >> > > A bit late, and not directly related to the original question: if you
> >> > > are doing any kind of KVM live migration (ACS or not), make sure you are
> >> > > using qemu 2.5+ and libvirt 1.3+ to support dynamic auto-convergence
> >> > > (regular auto-convergence, almost useless, is available from qemu 1.6+),
> >> > > because live migration works well until you hit a busy production VM
> >> > > with a high RAM change rate; then nothing helps except the mentioned
> >> > > qemu 2.5+ dynamic auto-convergence (and even this takes ages to let
> >> > > some very busy VMs finish migration...).
> >> > >
> >> > > On 5 September 2017 at 22:52, ilya <ilya.mailing.lists@gmail.com> wrote:
> >> > >
> >> > > > Personal experience with KVM (not cloudstack related) and non-shared
> >> > > > storage migration: it works most of the time, but can be very slow,
> >> > > > even with a 10G backplane.
> >> > > >
> >> > > > On 9/5/17 6:27 AM, Marc-Aurèle Brothier wrote:
> >> > > > > Hi Dimitriy,
> >> > > > >
> >> > > > > I wrote the PR for live migration in cloudstack (PR 1709). We're
> >> > > > > using an older version than upstream, so it's hard for me to fix the
> >> > > > > integration test errors. All I can tell you is that you should first
> >> > > > > configure libvirt correctly for migration. You can play with it by
> >> > > > > manually running virsh commands to initiate the migration. The
> >> > > > > networking part will not work once the VM is on the other machine if
> >> > > > > this is done manually.
> >> > > > >
> >> > > > > Marc-Aurèle
> >> > > > >
> >> > > > > On Tue, Sep 5, 2017 at 2:07 PM, Dmitriy Kaluzhniy <
> >> > > > > dmitriy.kaluzhniy@gmail.com> wrote:
> >> > > > >
> >> > > > >> Hello,
> >> > > > >> That's what I want, thank you!
> >> > > > >> I want to have live migration on KVM with non-shared storage.
> >> > > > >> As I understand it, migration is performed by libvirt.
> >> > > > >>
> >> > > > >> 2017-09-01 17:04 GMT+03:00 Simon Weller <sweller@ena.com.invalid>:
> >> > > > >>
> >> > > > >>> Dmitriy,
> >> > > > >>>
> >> > > > >>> Can you give us a bit more information about what you're trying to
> >> > > > >>> do? If you're looking for live migration on non-shared storage with
> >> > > > >>> KVM, there is an outstanding PR in the works to support that:
> >> > > > >>>
> >> > > > >>> https://github.com/apache/cloudstack/pull/1709
> >> > > > >>>
> >> > > > >>> - Si
> >> > > > >>>
> >> > > > >>>
> >> > > > >>> ________________________________
> >> > > > >>> From: Rajani Karuturi <rajani@apache.org>
> >> > > > >>> Sent: Friday, September 1, 2017 4:07 AM
> >> > > > >>> To: dev@cloudstack.apache.org
> >> > > > >>> Subject: Re: Need to ask for help again (Migration in cloudstack)
> >> > > > >>>
> >> > > > >>> You might start with this commit
> >> > > > >>> https://github.com/apache/cloudstack/commit/21ce3befc8ea9e1a6de449a21499a50ff141a183
> >> > > > >>>
> >> > > > >>> and the storage_motion_supported column in the hypervisor_capabilities
> >> > > > >>> table.
> >> > > > >>>
> >> > > > >>> Thanks,
> >> > > > >>>
> >> > > > >>> ~ Rajani
> >> > > > >>>
> >> > > > >>> http://cloudplatform.accelerite.com/
> >> > > > >>>
> >> > > > >>> On August 31, 2017 at 6:29 PM, Dmitriy Kaluzhniy
> >> > > > >>> (dmitriy.kaluzhniy@gmail.com) wrote:
> >> > > > >>>
> >> > > > >>> Hello!
> >> > > > >>> I wrote to this list before, but I wasn't subscribed to the mailing
> >> > > > >>> list.
> >> > > > >>> The reason I'm contacting you: I need advice.
> >> > > > >>> Over the last week I have been studying the cloudstack code to find
> >> > > > >>> where the logic behind these statements from the cloudstack
> >> > > > >>> documentation is implemented:
> >> > > > >>> "(KVM) The VM must not be using local disk storage. (On XenServer and
> >> > > > >>> VMware, VM live migration with local disk is enabled by CloudStack
> >> > > > >>> support for XenMotion and vMotion.)
> >> > > > >>>
> >> > > > >>> (KVM) The destination host must be in the same cluster as the
> >> > > > >>> original host. (On XenServer and VMware, VM live migration from one
> >> > > > >>> cluster to another is enabled by CloudStack support for XenMotion
> >> > > > >>> and vMotion.)"
> >> > > > >>>
> >> > > > >>> I have gone a long way through the source code but still can't find
> >> > > > >>> it. If you can give me any advice, that would be amazing.
> >> > > > >>> Anyway, thank you.
> >> > > > >>>
> >> > > > >>> --
> >> > > > >>>
> >> > > > >>> *Best regards, Dmitriy Kaluzhniy, +38 (073) 101 14 73*
> >> > > > >>>
> >> > > > >>
> >> > > > >>
> >> > > > >>
> >> > > > >> --
> >> > > > >>
> >> > > > >>
> >> > > > >>
> >> > > > >> *Best regards, Dmitriy Kaluzhniy, +38 (073) 101 14 73*
> >> > > > >>
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > >
> >> > > Andrija Panić
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > With best regards, Ivan Kudryavtsev
> >> > Bitworks Software, Ltd.
> >> > Cell: +7-923-414-1515
> >> > WWW: http://bitworks.software/ <http://bw-sw.com/>
> >> >
> >>
> >>
> >>
> >> --
> >>
> >>
> >>
> >> *Best regards, Dmitriy Kaluzhniy, +38 (073) 101 14 73*
> >>
> >
> >
> >
> > --
> >
> > Andrija Panić
> >
>
>
>
> --
>
> Andrija Panić
>
