Return-Path: X-Original-To: apmail-incubator-cloudstack-users-archive@minotaur.apache.org Delivered-To: apmail-incubator-cloudstack-users-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 9CD2AD143 for ; Tue, 3 Jul 2012 00:16:51 +0000 (UTC) Received: (qmail 6560 invoked by uid 500); 3 Jul 2012 00:16:51 -0000 Delivered-To: apmail-incubator-cloudstack-users-archive@incubator.apache.org Received: (qmail 6542 invoked by uid 500); 3 Jul 2012 00:16:51 -0000 Mailing-List: contact cloudstack-users-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: cloudstack-users@incubator.apache.org Delivered-To: mailing list cloudstack-users@incubator.apache.org Received: (qmail 6533 invoked by uid 99); 3 Jul 2012 00:16:51 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jul 2012 00:16:50 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=RCVD_IN_DNSWL_LOW,SPF_HELO_PASS,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of Alena.Prokharchyk@citrix.com designates 66.165.176.63 as permitted sender) Received: from [66.165.176.63] (HELO SMTP02.CITRIX.COM) (66.165.176.63) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jul 2012 00:16:45 +0000 X-IronPort-AV: E=Sophos;i="4.77,512,1336363200"; d="scan'208";a="200827484" Received: from sjcpmailmx02.citrite.net ([10.216.14.75]) by FTLPIPO02.CITRIX.COM with ESMTP/TLS/RC4-MD5; 02 Jul 2012 20:16:24 -0400 Received: from SJCPMAILBOX01.citrite.net ([10.216.4.73]) by SJCPMAILMX02.citrite.net ([10.216.14.75]) with mapi; Mon, 2 Jul 2012 17:16:23 -0700 From: Alena Prokharchyk To: "cloudstack-users@incubator.apache.org" CC: Simon Waterhouse Date: Mon, 2 Jul 2012 17:18:32 -0700 Subject: Re: How to diagnose massive delay in AttachVolume API calls? Thread-Topic: How to diagnose massive delay in AttachVolume API calls? Thread-Index: Ac1YsRQreNMMcGxnTB2Ir3ycu+0B3Q== Message-ID: In-Reply-To: <81A73678E76EA642801C8F2E4823AD21DAA0CBD789@LONPMAILBOX01.citrite.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: user-agent: Microsoft-MacOutlook/14.2.0.120402 acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Virus-Checked: Checked by ClamAV on apache.org Not sure about the protocol, I think any public web server would do. On 7/2/12 5:10 PM, "Donal Lafferty" wrote: >I do have the full log. Is there a protocol on where to post it? > >-----Original Message----- >From: Alena Prokharchyk [mailto:Alena.Prokharchyk@citrix.com] >Sent: 03 July 2012 00:51 >To: cloudstack-users@incubator.apache.org >Cc: Simon Waterhouse >Subject: Re: How to diagnose massive delay in AttachVolume API calls? > >The log shows that the command wasn't sent to the backend server right >away, it was waiting for some other command to be finished (for most of >the commands, we serialize them on the hypervisor side, and don't send >the next one before the previous one is completed): > > >"Seq 1-864566441: Waiting for Seq 864566426 Scheduling: { Cmd , MgmtId: >86110410570143, via: 1, Ver: v1, Flags: 100111, >[{"storage.CreateCommand":{"volId":66,"pool":{"id":200,"uuid":"266442d7-a4 >8 >1-34e5-8268-52385b70fd7a","host":"cc-" > > >So we were waiting for 864566426 to complete. I don't see "Sending >command" statement in the log snipped you've provided (this statement >indicates that the command was actually sent to the backend), did you do >grep by jobId? Without this statement it's hard to tell whether we were >waiting for a 864566426 to complete for a long time, or the createCommand >itself took a lot of time to execute on the backend. > >Can you please place the entire management log somewhere online and send >the link to it > > >-Alena. > > >On 7/2/12 4:36 PM, "Donal Lafferty" wrote: > >>I'm using CS 3.0.3 on XenServer (sanibel), and I find that after a >>moderate number of Attach / Detach calls the API slows down considerable. >> >>In the snippet below taken from the management's >>/var/run/log/cloud/management-server.log, we see about twelve minutes >>pass between the first agent.transport.Request entry and the follow up. >>To be sure, the target VM is running a supported guest O/S (Debian >>6.0), but I don't see any evidence of activity during until after the >>delay I notice below (e.g. nothing out of the ordinary in the >>/var/log/messages file). >> >>How would I go about diagnosing the cause of this issue? >> >> >>2012-07-02 14:22:26,927 DEBUG [cloud.async.AsyncJobManagerImpl] >>(Job-Executor-127:job-459) Executing >>com.cloud.api.commands.AttachVolumeCmd for job-459 >>2012-07-02 14:22:26,931 DEBUG [cloud.async.AsyncJobManagerImpl] >>(catalina-exec-16:null) submit async job-459, details: AsyncJobVO >>{id:459, userId: 2, accountId: 2, sessionKey: null, instanceType: >>Volume, >>instanceId: 66, cmd: com.cloud.api.commands.AttachVolumeCmd, >>cmdOriginator: null, cmdInfo: >>{"id":"4a034979-9082-4f2e-80f1-a2215351afdb","virtualmachineid":"96dcfd >>0a-=20 >>c681-48a4-98ac-e135d5f94206","ctxUserId":"2","ctxAccountId":"2","ctxSta >>rtE=20 >>ventId":"1515","signature":"hNSMvlbltQ1I1sBb9h+dkk0gP2I\u003d","apikey" >>:"H=20 >>QZvbMQoIAlXeEp1GSfuDpRJTby3-kg4v3TpBsrJzn-mf0ZCI6VW1_CTSnhdWOM3fAaZUcW0 >>1fJ MQ-qeqjHHYQ","deviceid":"4"}, cmdVersion: 0, callbackType: 0, >>callbackAddress: null, status: 0, processStatus: 0, resultCode: 0, >>result: null, initMsid: 86110410570143, completeMsid: null, lastUpdated: >>null, lastPolled: null, created: null} >>2012-07-02 14:22:27,105 DEBUG >>[storage.allocator.FirstFitStoragePoolAllocator] >>(Job-Executor-127:job-459) Looking for pools in dc: 1 pod:1 cluster:1 >>2012-07-02 14:22:27,107 DEBUG >>[storage.allocator.FirstFitStoragePoolAllocator] >>(Job-Executor-127:job-459) FirstFitStoragePoolAllocator has 1 pools to >>check for allocation >>2012-07-02 14:22:27,107 DEBUG >>[storage.allocator.AbstractStoragePoolAllocator] >>(Job-Executor-127:job-459) Checking if storage pool is suitable, name: >>PrimaryStorage ,poolId: 200 >>2012-07-02 14:22:27,107 DEBUG >>[storage.allocator.AbstractStoragePoolAllocator] >>(Job-Executor-127:job-459) Is localStorageAllocationNeeded? false >>2012-07-02 14:22:27,107 DEBUG >>[storage.allocator.AbstractStoragePoolAllocator] >>(Job-Executor-127:job-459) Is storage pool shared? true >>2012-07-02 14:22:27,108 DEBUG >>[storage.allocator.AbstractStoragePoolAllocator] >>(Job-Executor-127:job-459) Attempting to look for pool 200 for storage, >>totalSize: 1968993992704, usedBytes: 116645691392, usedPct: >>0.05924126321574583, disable threshold: 0.85 >>2012-07-02 14:22:27,111 DEBUG >>[storage.allocator.AbstractStoragePoolAllocator] >>(Job-Executor-127:job-459) Attempting to look for pool 200 for storage, >>maxSize : 3937987985408, totalAllocatedSize : 369253742592, askingSize : >>1073741824, allocated disable threshold: 0.85 >>2012-07-02 14:22:27,111 DEBUG >>[storage.allocator.FirstFitStoragePoolAllocator] >>(Job-Executor-127:job-459) FirstFitStoragePoolAllocator returning 1 >>suitable storage pools >>2012-07-02 14:22:27,111 DEBUG [cloud.storage.StorageManagerImpl] >>(Job-Executor-127:job-459) Trying to create Vol[66|vm=3Dnull|DATADISK] on >>Pool[200|NetworkFilesystem] >>2012-07-02 14:22:27,117 DEBUG [agent.transport.Request] >>(Job-Executor-127:job-459) Seq 1-864566441: Waiting for Seq 864566426 >>Scheduling: { Cmd , MgmtId: 86110410570143, via: 1, Ver: v1, Flags: >>100111, >>[{"storage.CreateCommand":{"volId":66,"pool":{"id":200,"uuid":"266442d7 >>-a4=20 >>81-34e5-8268-52385b70fd7a","host":"cc-svr02.cambourne.cloud.com","path" >>:"/=20 >>var/nfs/formentera/cs1/primary","port":2049,"type":"NetworkFilesystem"} >>,"d >>iskCharacteristics":{"size":1073741824,"tags":[],"type":"DATADISK","name" >>: >>"XD- >>LondonSQ09_IdentityDisk","useLocalStorage":false,"recreatable":false,"d >>isk OfferingId":6,"volumeId":66,"hyperType":"XenServer"},"wait":0}}] } >>2012-07-02 14:36:07,250 DEBUG [agent.transport.Request] >>(Job-Executor-127:job-459) Seq 1-864566441: Received: { Ans: , MgmtId: >>86110410570143, via: 1, Ver: v1, Flags: 110, { CreateAnswer } } >>2012-07-02 14:36:07,346 DEBUG [agent.transport.Request] >>(Job-Executor-127:job-459) Seq 1-864566482: Waiting for Seq 864566442 >>Scheduling: { Cmd , MgmtId: 86110410570143, via: 1, Ver: v1, Flags: >>100111, >>[{"AttachVolumeCommand":{"attach":true,"vmName":"i-2-32-VM","pooltype": >>"Ne=20 >>tworkFilesystem","poolUuid":"266442d7-a481-34e5-8268-52385b70fd7a","vol >>ume >>Folder":"/var/nfs/formentera/cs1/primary","volumePath":"cee97c46-1e3a-4 >>0b1 >>-bd13-7110b5801663","volumeName":"XD-LondonSQ09_IdentityDisk","deviceId >>":4 >>,"wait":0}}] } >>2012-07-02 14:36:09,848 DEBUG [agent.transport.Request] >>(Job-Executor-127:job-459) Seq 1-864566482: Received: { Ans: , MgmtId: >>86110410570143, via: 1, Ver: v1, Flags: 110, { AttachVolumeAnswer } } >>2012-07-02 14:36:10,077 DEBUG [cloud.async.AsyncJobManagerImpl] >>(Job-Executor-127:job-459) Complete async job-459, jobStatus: 1, >>resultCode: 0, result: com.cloud.api.response.VolumeResponse@9b3bbb0 >>2012-07-02 14:36:10,129 DEBUG [cloud.async.AsyncJobManagerImpl] >>(Job-Executor-127:job-459) Done executing >>com.cloud.api.commands.AttachVolumeCmd for job-459 >>2012-07-02 14:36:10,735 DEBUG [cloud.async.AsyncJobManagerImpl] >>(catalina-exec-10:null) Async job-459 completed >> > > >