cloudstack-users mailing list archives

From "Matthew Hartmann" <mhartm...@tls.net>
Subject RE: XenServer & VM Snapshots
Date Mon, 03 Dec 2012 18:31:23 GMT
Anthony:

Thank you for the prompt and informative reply.

> I'm pretty sure mount and copy are using the same XenServer host.

The behavior I have witnessed with CS 3.0.2 is that it doesn't always do the
mount & copy on the same host. Out of the 12 tests I've performed, only once
was the mount & copy performed on the same host that the VM was running on.

> I think the issue is the backup takes a long time because the data volume
> is big and network rate is low.
> You can increase "BackupSnapshotWait" in global configuration table to let
> the backup operation finish.

I increased this in global settings from the default of 9 hours to 16 hours.
The snapshot still doesn't complete in time; on average it copies about
460 GB before it times out. I'm pretty confident the network rate isn't the
bottleneck, as ISOs and imported VHDs install quickly. We have the Secondary
Storage server set as the only internal site allowed to host files. I upload
my ISO or VHD to the Secondary Storage server and install using the SSVM,
which completes in a very timely manner. With a 1 Gb network link, 1 TB
should copy in roughly 2 hours (if the link is saturated by the copy
process); I've only found snapshotting (template creation appears to work
flawlessly) to take an insanely long time to complete.
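For reference, here is the back-of-envelope arithmetic behind those figures (my own numbers, assuming a fully saturated 1 Gb/s link and the ~460 GB observed in the 16-hour window):

```python
# Rough transfer-time arithmetic for a gigabit link (illustrative only).

def transfer_hours(size_gb, link_gbps):
    """Hours to move size_gb gigabytes over a saturated link_gbps link."""
    seconds = (size_gb * 8) / link_gbps   # GB -> Gbit, divided by Gbit/s
    return seconds / 3600

# 1 TB (1000 GB) over a saturated 1 Gb/s link:
print(round(transfer_hours(1000, 1.0), 1))   # -> 2.2 hours

# Observed: ~460 GB copied before the 16-hour timeout -> effective rate:
effective_mbps = 460 * 8 * 1000 / (16 * 3600)
print(round(effective_mbps))                 # -> ~64 Mbit/s, far below line rate
```

The gap between ~64 Mbit/s effective and the 1 Gb/s line rate is why the network link itself looks unlikely to be the bottleneck.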

Is there anything else I can do to increase performance or logs I should
check?

Cheers,

Matthew


Matthew Hartmann
Systems Administrator | V: 812.378.4100 x 850 | E: mhartmann@tls.net

TLS.NET, Inc.
http://www.tls.net


-----Original Message-----
From: Anthony Xu [mailto:Xuefei.Xu@citrix.com] 
Sent: Monday, December 03, 2012 12:31 PM
To: Cloudstack Users
Cc: Cloudstack Developers
Subject: RE: XenServer & VM Snapshots

Hi Matthew,

Your analysis is correct except for the following:

> I must mention that the Compute Node that ran sparse_dd and the one that
> mounted Secondary Storage are not always the same. It appears the
> Management Server is simply round-robining through the list of Compute
> Nodes and using the first one that is available.

I'm pretty sure mount and copy are using the same XenServer host.

I think the issue is that the backup takes a long time because the data
volume is big and the network rate is low.
You can increase "BackupSnapshotWait" in the global configuration table to
let the backup operation finish.
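For anyone wanting to script that change, a minimal sketch of a signed updateConfiguration call follows. The signing procedure (sorted, URL-encoded, lowercased query string, then HMAC-SHA1 and base64) is the documented CloudStack API scheme; the endpoint, keys, and the exact global-setting key name ("backup.snapshot.wait" is my assumption for the key behind "BackupSnapshotWait") are placeholders:

```python
# Hedged sketch: raise the snapshot-backup timeout via the CloudStack API's
# updateConfiguration command. Endpoint, API/secret keys, and the setting
# name are placeholders -- verify them against your own deployment.
import base64
import hashlib
import hmac
import urllib.parse

def signed_url(endpoint, api_key, secret_key, params):
    params = dict(params, apikey=api_key, response="json")
    # Sort by key and URL-encode values; the signature is computed over
    # the lowercased query string per the CloudStack API docs.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(secret_key.encode(), query.lower().encode(),
                      hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest), safe="")
    return f"{endpoint}?{query}&signature={signature}"

# Example (placeholder keys; 57600 seconds = 16 hours):
url = signed_url("http://mgmt-server:8080/client/api", "APIKEY", "SECRET",
                 {"command": "updateConfiguration",
                  "name": "backup.snapshot.wait", "value": "57600"})
print(url)
```

Restarting the Management Server is typically required for a global-setting change to take effect.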


Since CS takes advantage of the XenServer VHD image format, using VHD to do
snapshots and clones, it requires the snapshot to be backed up through a
XenServer host.
The ideal solution for this issue might be to leverage the storage array's
snapshot and clone functionality; then the snapshot backup is executed by
the storage host, relieving some of the limitation.
Currently CS doesn't support this, but it should not be hard to support
after Edison finishes the storage framework change; it should be just
another storage plug-in.
When CS uses the storage server's snapshot and clone functions, CS needs to
consider the storage server's limits on the number of snapshots and the
number of volumes.
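A storage-side snapshot plug-in of the sort described might look roughly like this. This is an entirely hypothetical interface, none of these class or method names are real CloudStack code; it only illustrates the idea of the array taking the snapshot itself while the driver enforces the array's snapshot-count limit:

```python
# Hypothetical sketch of a storage-side snapshot plug-in: the storage array
# performs the snapshot/clone, so no data crosses a XenServer host, and the
# driver enforces the array's per-volume snapshot limit. Not CloudStack code.

class SnapshotLimitExceeded(Exception):
    pass

class ArraySnapshotPlugin:
    def __init__(self, max_snapshots_per_volume=32):
        self.max_snapshots_per_volume = max_snapshots_per_volume
        self._snapshots = {}          # volume_id -> list of snapshot ids

    def take_snapshot(self, volume_id):
        taken = self._snapshots.setdefault(volume_id, [])
        if len(taken) >= self.max_snapshots_per_volume:
            raise SnapshotLimitExceeded(volume_id)
        snap_id = f"{volume_id}-snap-{len(taken) + 1}"
        # A real plug-in would issue the snapshot command to the array here.
        taken.append(snap_id)
        return snap_id

plugin = ArraySnapshotPlugin(max_snapshots_per_volume=2)
print(plugin.take_snapshot("vol-1"))   # vol-1-snap-1
print(plugin.take_snapshot("vol-1"))   # vol-1-snap-2
```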


Anthony

From: Matthew Hartmann [mailto:mhartmann@tls.net]
Sent: Monday, December 03, 2012 9:08 AM
To: Cloudstack Users
Cc: Cloudstack Developers
Subject: XenServer & VM Snapshots

Hello! I'm hoping someone can help me troubleshoot the following issue:

I have a client who has a 960G data volume which contains their VM's
Exchange Data Store. When starting a snapshot, I found that a process is
started on one of my Compute Nodes titled "sparse_dd". I found that this
process is then sending the output of "sparse_dd" through another Compute
Node's xapi before placing it into the "snapshot store" on Secondary
Storage. It appears that this is part of the bottleneck, as all of our
systems are connected via gigabit link and should not take 15+ hours to
create a snapshot. The following is the behavior that I have analyzed from
within my environment:


1)     Snapshot is started (either via Manual or Scheduled).

2)     Compute Node 1 "processes the snapshot" by exposing the VDI, from
which "sparse_dd" then creates a "thin provisioned" snapshot.

3)     The output of sparse_dd is delivered over HTTP to xapi on Compute
Node 2 where the Management Server mounted Secondary Storage.

4)     Compute Node 2 (receiving the snapshot via xapi) stores the snapshot
in the Secondary Storage mount point.
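The two-hop path above means the end-to-end rate is bounded by the slowest stage. A toy model of that constraint (the stage rates below are made up, purely illustrative):

```python
# Toy model of the observed two-hop snapshot path: sparse_dd read on Node 1,
# HTTP transfer into xapi on Node 2, then the write to Secondary Storage.
# A streaming pipeline moves data no faster than its slowest stage.
# All rates here are invented for illustration.

def pipeline_rate(stage_rates_mbps):
    """End-to-end throughput of a streaming pipeline, in Mbit/s."""
    return min(stage_rates_mbps)

stages = {
    "sparse_dd read (Node 1)":    900,
    "HTTP -> xapi (Node 2)":      120,  # e.g. xapi's import path, not the wire
    "write to Secondary Storage": 800,
}
rate = pipeline_rate(stages.values())
hours = (960 * 8 * 1000) / rate / 3600   # 960 GB volume at that rate
print(round(hours, 1))   # -> 17.8 hours at an effective 120 Mbit/s
```

If one stage (such as the extra xapi hop) runs at a fraction of line rate, a 960 GB volume can easily take 15+ hours even on an otherwise idle gigabit network.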

Based on this behavior, I have devised the following logic that I believe
CloudStack is utilizing:


1)     CloudStack creates a "snapshot VDI" via XenServer Pool Master's API.

2)     CloudStack finds a Compute Node that can mount Secondary Storage.

3)     CloudStack finds a Compute Node that can run "sparse_dd".

4)     CloudStack uses the available Compute Node to output the VDI to xapi
on the Compute Node that mounted Secondary Storage.

I must mention that the Compute Node that ran sparse_dd and the one that
mounted Secondary Storage are not always the same. It appears the Management
Server is simply round-robining through the list of Compute Nodes and using
the first one that is available.

Does anyone have any input on the issue I'm having or analysis of how
CloudStack/XenServer snapshots operate?

Thanks!

Cheers,

Matthew



Matthew Hartmann
Systems Administrator | V: 812.378.4100 x 850 | E: mhartmann@tls.net
