cloudstack-dev mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: Need to ask for help again (Migration in cloudstack)
Date Mon, 02 Oct 2017 13:32:49 GMT
BTW, I went to the extreme and tested migrating a busy 24 CPU / 60 GB VM with
dynamic auto-convergence (qemu 2.5 / libvirt 1.3.1 and a nice patch to activate
the autoconverge flag inside ACS - thanks to Mike Tutkowski!). Right after the
first migration cycle of 58 GB of RAM finishes (58 GB RAM = Prime95 workload
on all 24 CPUs), another 58 GB of modified RAM needs to be migrated :D

So it really works like a charm :)

On 2 October 2017 at 15:29, Andrija Panic <andrija.panic@gmail.com> wrote:

> Hi Ivan,
>
> yes you are right, but it works like crap (from a downtime perspective),
> because when we could not live migrate one 64GB client VM "normally", we
> manually (instead of ACS doing it...) paused the VM via virsh, and then the VM
> was in a paused state for 15min (yes, it was only a 1Gbps management network at
> that time), so the VM was down for 15min... and that is unacceptable for the client.
>
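
A rough back-of-the-envelope check of the 15 minutes above, as a minimal sketch. It assumes the management network mentioned above was a 1 Gbit/s link and that a paused VM only needs its RAM copied once; the `efficiency` parameter is a hypothetical knob for protocol and disk overhead, not a measured value.

```python
def transfer_seconds(ram_gib, link_gbit_s, efficiency=1.0):
    """Seconds needed to copy ram_gib GiB once over the link.

    Assumes the VM is paused, so no pages are dirtied during the copy.
    efficiency is an assumed fraction of line rate actually achieved.
    """
    bits = ram_gib * 1024**3 * 8
    return bits / (link_gbit_s * 1e9 * efficiency)


if __name__ == "__main__":
    for label, gbit, eff in [("1G, line rate", 1, 1.0),
                             ("1G, 60% of line rate (assumed)", 1, 0.6),
                             ("10G, line rate", 10, 1.0)]:
        secs = transfer_seconds(64, gbit, eff)
        print(f"{label}: {secs / 60:.1f} min")
```

Even at full line rate a 64 GiB copy over 1 Gbit/s takes about 9 minutes, so a quarter of an hour of pause is plausible once real-world overhead is added.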
> So dynamic auto-convergence works in the following way (based on my
> experience monitoring migration cycles and CPU cycles with my 4 eyes :)  )
> - it slowly throttles the CPU, more and more, but very gently... until it
> decides enough is enough, and then after e.g. 16-30 migration iterations
> (with almost the full RAM being migrated each iteration), it throttles the CPU
> aggressively and lets the VM migration finish (without downtime except during
> the final pause of a few tens of milliseconds or less).
>
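
The behaviour described above can be illustrated with a toy model of pre-copy migration. This is only a sketch under stated assumptions, not QEMU's actual algorithm: each pass copies whatever is still dirty, the guest re-dirties memory at a constant rate scaled down by the current CPU throttle, and the throttle is raised whenever a pass dirties more than half of what it copied. The function name and all parameters (including the 20% initial throttle and 10% step) are assumptions of the sketch.

```python
def simulate_precopy(ram_gib, dirty_gib_s, link_gib_s, max_downtime_s=0.3,
                     auto_converge=True, throttle_initial_pct=20,
                     throttle_step_pct=10, throttle_max_pct=99,
                     max_iterations=100):
    """Toy pre-copy loop.

    Returns (passes, final_throttle_pct); passes is None if the
    migration does not converge within max_iterations passes.
    """
    remaining = float(ram_gib)      # the first pass copies all RAM
    throttle_pct = 0                # integer percent avoids float drift
    # Stop once the leftover fits into the allowed final pause.
    stop_threshold = link_gib_s * max_downtime_s
    for done in range(max_iterations):
        if remaining <= stop_threshold:
            return done, throttle_pct
        copy_time = remaining / link_gib_s
        guest_speed = (100 - throttle_pct) / 100
        # Memory dirtied while this pass was being copied (capped at RAM size).
        dirtied = min(float(ram_gib), dirty_gib_s * guest_speed * copy_time)
        # Assumed rule: throttle harder if the pass made too little progress.
        if auto_converge and 2 * dirtied > remaining:
            if throttle_pct == 0:
                throttle_pct = throttle_initial_pct
            else:
                throttle_pct = min(throttle_max_pct,
                                   throttle_pct + throttle_step_pct)
        remaining = dirtied
    return None, throttle_pct


if __name__ == "__main__":
    # Hypothetical busy VM: 58 GiB RAM, dirtying 2 GiB/s, 1 GiB/s link.
    print("without auto-converge:", simulate_precopy(58, 2.0, 1.0, auto_converge=False))
    print("with auto-converge:   ", simulate_precopy(58, 2.0, 1.0))
```

With a dirty rate higher than the link speed, the unthrottled run re-copies the full RAM on every pass and never finishes, while the throttled run spends several passes ramping up before the remaining memory starts to shrink.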
> Again, just my experience, because we do have many "enterprise workload"
> customers, and it was a pain until we got this working properly (imagine host
> maintenance mode also not working fully for those VMs...)
>
> Cheers
>
> On 2 October 2017 at 14:55, Dmitriy Kaluzhniy <dmitriy.kaluzhniy@gmail.com> wrote:
>
>> Hello!
>> I want to say thanks to all!
>> Lately I have had no time to work on this, but I hope to set up a test
>> environment to try live migration + migration on non-shared storage.
>>
>> 2017-10-02 13:50 GMT+03:00 Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com>:
>>
>> > AFAIK ACS has a VM suspend parameter in the KVM agent which acts when ACS is
>> > unable to migrate successfully. Also, I have almost no problems with
>> > 8core/16GB migration over 10G, but you are right. Sometimes it doesn't work
>> > as expected without autoconvergence, and new Qemu/KVM does the work.
>> >
>> > 2017-10-02 17:44 GMT+07:00 Andrija Panic <andrija.panic@gmail.com>:
>> >
>> > > A bit late, and not directly related to the original question - if you are
>> > > doing any kind of KVM live migration (ACS or not), make sure you are using
>> > > qemu 2.5+ and libvirt 1.3+, to support dynamic auto-convergence (regular
>> > > auto-convergence, almost useless, is available from qemu 1.6+) - because
>> > > live migration works well until you hit a busy production VM with a high
>> > > RAM change rate; then nothing helps except the mentioned qemu 2.5+ dynamic
>> > > auto-convergence (and even this takes ages to let some very busy VMs
>> > > finish migration...).
>> > >
>> > > On 5 September 2017 at 22:52, ilya <ilya.mailing.lists@gmail.com> wrote:
>> > >
>> > > > Personal experience with KVM (not cloudstack related) and non-shared
>> > > > storage migration - works most of the time - but can be very slow -
>> > > > even with 10G backplane.
>> > > >
>> > > > On 9/5/17 6:27 AM, Marc-Aurèle Brothier wrote:
>> > > > > Hi Dimitriy,
>> > > > >
>> > > > > I wrote the PR for the live migration in cloudstack (PR 1709). We're using
>> > > > > an older version than upstream so it's hard for me to fix the integration
>> > > > > test errors. All I can tell you is that you should first configure
>> > > > > libvirt correctly for migration. You can play with it by manually running
>> > > > > virsh commands to initiate the migration. If done manually, the networking
>> > > > > part will not work once the VM is on the other machine.
>> > > > >
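
For experimenting with manual migration as suggested above, here is a minimal sketch that only assembles a `virsh migrate` command line and does not run it. The helper name, the domain name and the host name are hypothetical; the flags (`--live`, `--verbose`, `--copy-storage-all`, `--auto-converge`) are standard virsh options, but check `virsh help migrate` on your libvirt version before relying on them.

```python
import shlex


def build_virsh_migrate(domain, dest_host, non_shared_storage=False,
                        auto_converge=False):
    """Return the argv list for a live migration over qemu+ssh.

    Hypothetical helper for illustration; it does not execute anything.
    """
    argv = ["virsh", "migrate", "--live", "--verbose"]
    if non_shared_storage:
        # Copy the disk images too, for hosts without shared storage.
        argv.append("--copy-storage-all")
    if auto_converge:
        # Ask QEMU to throttle the guest if it dirties RAM too fast.
        argv.append("--auto-converge")
    argv += [domain, f"qemu+ssh://{dest_host}/system"]
    return argv


if __name__ == "__main__":
    # Example names only; substitute your own domain and destination host.
    cmd = build_virsh_migrate("test-vm", "kvm-host-02",
                              non_shared_storage=True, auto_converge=True)
    print(shlex.join(cmd))
```

Printing the command first, rather than executing it, makes it easy to review the flags before trying them on a test VM.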
>> > > > > Marc-Aurèle
>> > > > >
>> > > > > On Tue, Sep 5, 2017 at 2:07 PM, Dmitriy Kaluzhniy <
>> > > > > dmitriy.kaluzhniy@gmail.com> wrote:
>> > > > >
>> > > > >> Hello,
>> > > > >> That's what I want, thank you!
>> > > > >> I want to have live migration on KVM with non-shared storage.
>> > > > >> As I understand it, migration is performed by libvirt.
>> > > > >>
>> > > > >> 2017-09-01 17:04 GMT+03:00 Simon Weller <sweller@ena.com.invalid>:
>> > > > >>
>> > > > >>> Dmitriy,
>> > > > >>>
>> > > > >>> Can you give us a bit more information about what you're trying to do?
>> > > > >>> If you're looking for live migration on non-shared storage with KVM, there
>> > > > >>> is an outstanding PR in the works to support that:
>> > > > >>>
>> > > > >>> https://github.com/apache/cloudstack/pull/1709
>> > > > >>>
>> > > > >>> - Si
>> > > > >>>
>> > > > >>>
>> > > > >>> ________________________________
>> > > > >>> From: Rajani Karuturi <rajani@apache.org>
>> > > > >>> Sent: Friday, September 1, 2017 4:07 AM
>> > > > >>> To: dev@cloudstack.apache.org
>> > > > >>> Subject: Re: Need to ask for help again (Migration in cloudstack)
>> > > > >>>
>> > > > >>> You might start with this commit
>> > > > >>> https://github.com/apache/cloudstack/commit/21ce3befc8ea9e1a6de449a21499a50ff141a183
>> > > > >>>
>> > > > >>>
>> > > > >>> and storage_motion_supported column in hypervisor_capabilities
>> > > > >>> table.
>> > > > >>>
>> > > > >>> Thanks,
>> > > > >>>
>> > > > >>> ~ Rajani
>> > > > >>>
>> > > > >>> http://cloudplatform.accelerite.com/
>> > > > >>>
>> > > > >>> On August 31, 2017 at 6:29 PM, Dmitriy Kaluzhniy
>> > > > >>> (dmitriy.kaluzhniy@gmail.com) wrote:
>> > > > >>>
>> > > > >>> Hello!
>> > > > >>> I wrote to this address before, but I wasn't subscribed to the mailing
>> > > > >>> list.
>> > > > >>> The reason I'm contacting you - I need advice.
>> > > > >>> During the last week I was studying the cloudstack code to find where the
>> > > > >>> logic behind these statements from the cloudstack documentation is
>> > > > >>> implemented:
>> > > > >>> "(KVM) The VM must not be using local disk storage. (On XenServer and
>> > > > >>> VMware, VM live migration with local disk is enabled by CloudStack support
>> > > > >>> for XenMotion and vMotion.)
>> > > > >>>
>> > > > >>> (KVM) The destination host must be in the same cluster as the original
>> > > > >>> host. (On XenServer and VMware, VM live migration from one cluster to
>> > > > >>> another is enabled by CloudStack support for XenMotion and vMotion.)"
>> > > > >>>
>> > > > >>> I took a long road through the source code but still can't see it. If you
>> > > > >>> can give me any advice - it will be amazing.
>> > > > >>> Anyway, thank you.
>> > > > >>>
>> > > > >>> --
>> > > > >>>
>> > > > >>> *Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> --
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >> *--Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
>> > > > >>
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > >
>> > > Andrija Panić
>> > >
>> >
>> >
>> >
>> > --
>> > With best regards, Ivan Kudryavtsev
>> > Bitworks Software, Ltd.
>> > Cell: +7-923-414-1515
>> > WWW: http://bitworks.software/ <http://bw-sw.com/>
>> >
>>
>>
>>
>> --
>>
>>
>>
>> *--Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
>>
>
>
>
> --
>
> Andrija Panić
>



-- 

Andrija Panić
