From: Ivan Kudryavtsev
Date: Mon, 2 Oct 2017 21:30:22 +0700
Subject: Re: Need to ask for help again (Migration in cloudstack)
To: dev@cloudstack.apache.org

Hi.

Just don't compare 1G with 10G, or even 40G InfiniBand, networks. It might
look as if a linear increase in bandwidth should give a proportional
decrease in migration time, but a migration can get stuck forever on 1G and
finish in seconds on 10G or 40G. But indeed, auto-convergence is a great
feature.

On 2 Oct 2017 at 20:32, "Andrija Panic" wrote:

> BTW, I went to the extreme and tested migrating a busy 24 CPU / 60 GB VM
> with dynamic auto-convergence (qemu 2.5 / libvirt 1.3.1 and a nice patch
> to activate the autoconverge flag inside ACS, thanks to Mike Tutkowski!),
> where right after the first migration cycle of 58 GB of RAM finishes
> (58 GB RAM = Prime95 workload on all 24 CPUs), yet another 58 GB of
> modified RAM needs to be migrated :D
>
> So it really works like a charm :)
>
> On 2 October 2017 at 15:29, Andrija Panic wrote:
>
> > Hi Ivan,
> >
> > yes, you are right, but it works like crap (from a downtime
> > perspective): when we could not live migrate one 64 GB client VM
> > "normally", we manually (instead of ACS doing it...) paused the VM via
> > virsh, and the VM was then in a paused state for 15 min (yes, it was
> > only a 1 Gbps management network at that time), so the VM was down for
> > 15 min... and that is unacceptable for the client.
> >
> > So dynamic auto-convergence works in the following way (based on my
> > experience monitoring migration cycles and CPU cycles with my 4 eyes :) ):
> > it slowly throttles the CPU, more and more, but very gently... until it
> > decides that enough is enough, and then after e.g. 16-30 migration
> > iterations (with almost the full RAM being migrated in each iteration),
> > it throttles the CPU aggressively and lets the VM migration finish
> > (without downtime, except during the final pause of a few tens of
> > milliseconds or less).
> >
> > Again, this is just my experience, because we do have many "enterprise
> > workload" customers, and it was a pain until we got this working
> > properly (imagine host maintenance mode also not working fully for
> > those VMs...)
> >
> > Cheers
> >
> > On 2 October 2017 at 14:55, Dmitriy Kaluzhniy <dmitriy.kaluzhniy@gmail.com>
> > wrote:
> >
> >> Hello!
> >> I want to say thanks to all!
> >> Lately I have had no time to work on this, but I hope to set up a test
> >> environment to try live migration plus migration on non-shared storage.
> >>
> >> 2017-10-02 13:50 GMT+03:00 Ivan Kudryavtsev:
> >>
> >> > AFAIK ACS has a VM suspend parameter in the KVM agent which acts when
> >> > ACS is unable to migrate successfully. Also, I have almost no problems
> >> > with 8 core / 16 GB migration over 10G, but you are right. Sometimes
> >> > it doesn't work as expected without auto-convergence, and a new
> >> > Qemu/KVM does the work.
> >> >
> >> > 2017-10-02 17:44 GMT+07:00 Andrija Panic:
> >> >
> >> > > A bit late, and not directly related to the original question: if
> >> > > you are doing any kind of KVM live migration (ACS or not), make sure
> >> > > you are using qemu 2.5 and libvirt 1.3+ to get dynamic
> >> > > auto-convergence (regular auto-convergence, which is almost useless,
> >> > > is available from qemu 1.6+). Live migration works well until you
> >> > > hit a busy production VM with a high RAM change rate; then nothing
> >> > > helps except the mentioned qemu 2.5+ dynamic auto-convergence (and
> >> > > even this takes ages to let some very busy VMs finish migration...).
> >> > >
> >> > > On 5 September 2017 at 22:52, ilya wrote:
> >> > >
> >> > > > Personal experience with KVM (not CloudStack related) and
> >> > > > non-shared storage migration: it works most of the time, but can
> >> > > > be very slow, even with a 10G backplane.
> >> > > >
> >> > > > On 9/5/17 6:27 AM, Marc-Aurèle Brothier wrote:
> >> > > > > Hi Dimitriy,
> >> > > > >
> >> > > > > I wrote the PR for live migration in CloudStack (PR 1709). We're
> >> > > > > using an older version than upstream, so it's hard for me to fix
> >> > > > > the integration test errors. All I can tell you is that you
> >> > > > > should first configure libvirt correctly for migration. You can
> >> > > > > play with it by manually running virsh commands to initiate the
> >> > > > > migration. The networking part will not work once the VM is on
> >> > > > > the other machine if done manually.
> >> > > > >
> >> > > > > Marc-Aurèle
> >> > > > >
> >> > > > > On Tue, Sep 5, 2017 at 2:07 PM, Dmitriy Kaluzhniy
> >> > > > > <dmitriy.kaluzhniy@gmail.com> wrote:
> >> > > > >
> >> > > > >> Hello,
> >> > > > >> That's what I want, thank you!
> >> > > > >> I want to have live migration on KVM with non-shared storage.
> >> > > > >> As I understand it, migration is performed by libvirt.
> >> > > > >>
> >> > > > >> 2017-09-01 17:04 GMT+03:00 Simon Weller:
> >> > > > >>
> >> > > > >>> Dmitriy,
> >> > > > >>>
> >> > > > >>> Can you give us a bit more information about what you're trying
> >> > > > >>> to do? If you're looking for live migration on non-shared
> >> > > > >>> storage with KVM, there is an outstanding PR in the works to
> >> > > > >>> support that:
> >> > > > >>>
> >> > > > >>> https://github.com/apache/cloudstack/pull/1709
> >> > > > >>>
> >> > > > >>> - Si
> >> > > > >>>
> >> > > > >>> ________________________________
> >> > > > >>> From: Rajani Karuturi
> >> > > > >>> Sent: Friday, September 1, 2017 4:07 AM
> >> > > > >>> To: dev@cloudstack.apache.org
> >> > > > >>> Subject: Re: Need to ask for help again (Migration in cloudstack)
> >> > > > >>>
> >> > > > >>> You might start with this commit
> >> > > > >>> https://github.com/apache/cloudstack/commit/21ce3befc8ea9e1a6de449a21499a50ff141a183
> >> > > > >>>
> >> > > > >>> and the storage_motion_supported column in the
> >> > > > >>> hypervisor_capabilities table.
> >> > > > >>>
> >> > > > >>> Thanks,
> >> > > > >>>
> >> > > > >>> ~ Rajani
> >> > > > >>>
> >> > > > >>> http://cloudplatform.accelerite.com/
> >> > > > >>>
> >> > > > >>> On August 31, 2017 at 6:29 PM, Dmitriy Kaluzhniy
> >> > > > >>> (dmitriy.kaluzhniy@gmail.com) wrote:
> >> > > > >>>
> >> > > > >>> Hello!
> >> > > > >>> I wrote to this address before, but I wasn't subscribed to the
> >> > > > >>> mailing list.
> >> > > > >>> The reason I'm contacting you is that I need advice.
> >> > > > >>> Over the last week I have been studying the CloudStack code to
> >> > > > >>> find where the logic behind these statements from the
> >> > > > >>> CloudStack documentation is implemented:
> >> > > > >>> "(KVM) The VM must not be using local disk storage. (On
> >> > > > >>> XenServer and VMware, VM live migration with local disk is
> >> > > > >>> enabled by CloudStack support for XenMotion and vMotion.)
> >> > > > >>>
> >> > > > >>> (KVM) The destination host must be in the same cluster as the
> >> > > > >>> original host. (On XenServer and VMware, VM live migration from
> >> > > > >>> one cluster to another is enabled by CloudStack support for
> >> > > > >>> XenMotion and vMotion.)"
> >> > > > >>>
> >> > > > >>> I have gone a long way through the source code but still can't
> >> > > > >>> find it. If you can give me any advice, it will be amazing.
> >> > > > >>> Anyway, thank you.
> >> > > > >>>
> >> > > > >>> --
> >> > > > >>>
> >> > > > >>> *Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
> >> > > > >>>
> >> > > > >>
> >> > > > >> --
> >> > > > >>
> >> > > > >> *Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
> >> > > > >>
> >> > > > >
> >> > > >
> >> > >
> >> > > --
> >> > >
> >> > > Andrija Panić
> >> > >
> >> >
> >> > --
> >> > With best regards, Ivan Kudryavtsev
> >> > Bitworks Software, Ltd.
> >> > Cell: +7-923-414-1515
> >> > WWW: http://bitworks.software/
> >> >
> >>
> >> --
> >>
> >> *Best regards, Dmitriy Kaluzhniy +38 (073) 101 14 73*
> >>
> >
> > --
> >
> > Andrija Panić
> >
>
> --
>
> Andrija Panić
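
The point made at the top of this thread, that 1G versus 10G is not a
linear difference for live migration, can be illustrated with a toy model
of pre-copy migration. This is only a sketch under simplifying assumptions
and not QEMU's actual algorithm: the function name, the constant dirty rate
of 2 Gbit/s, the 0.05 GB stop threshold and the "throttle a bit harder
after every unfinished round" policy are all hypothetical and chosen for
illustration.

```python
def precopy_rounds(ram_gb, dirty_gbps, link_gbps,
                   throttle_step=0.0, max_rounds=30, stop_gb=0.05):
    """Toy model of pre-copy live migration (hypothetical helper, not QEMU code).

    Returns the round in which the leftover dirty memory becomes small enough
    for the final pause, or None if that never happens within max_rounds.
    """
    remaining = ram_gb * 8.0   # gigabits still to send; round 1 sends all RAM
    throttle = 0.0             # fraction of guest CPU time taken away
    for round_no in range(1, max_rounds + 1):
        seconds = remaining / link_gbps
        # Memory dirtied by the guest while this round was on the wire.
        dirtied = dirty_gbps * (1.0 - throttle) * seconds
        if dirtied <= stop_gb * 8.0:
            return round_no
        # Assumed policy: throttle a bit harder after every unfinished round.
        throttle = min(0.99, throttle + throttle_step)
        remaining = dirtied
    return None


print(precopy_rounds(58, 2.0, 1.0))                     # 1G link, no throttling
print(precopy_rounds(58, 2.0, 10.0))                    # 10G link, no throttling
print(precopy_rounds(58, 2.0, 1.0, throttle_step=0.1))  # 1G link, throttling
```

In this model the leftover memory is multiplied by (dirty rate / link
speed) in every round, so without throttling the migration only converges
when the guest dirties memory more slowly than the link can carry it.
Throttling the guest lowers the effective dirty rate until that condition
holds, which is the effect described for dynamic auto-convergence above.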