cloudstack-users mailing list archives

From Eric Green <eric.lee.gr...@gmail.com>
Subject Re: KVM qcow2 performance
Date Sun, 06 Aug 2017 20:01:35 GMT

> On Aug 5, 2017, at 21:03, Ivan Kudryavtsev <kudryavtsev_ia@bw-sw.com> wrote:
> 
> Hi, I think Eric's comments are too tough. E.g. I have 11xSSD 1TB with
> linux soft raid 5 and Ext4 and it works like a charm without special
> tuning.
> 
> Qcow2 is also not so bad. LVM2 does it better, of course (if not being
> snapshotted). Our users have different workloads and nobody claims disk
> performance is a problem. Read/write of 100 MB/sec over a 10G connection is
> not a problem at all for the setup specified above.

100 MB/sec is the speed of a single vintage 2010 5400 RPM SATA-2 drive. For many people, that
is not a problem. For some, it is. For example, I have a 12x-SSD RAID10 for a database. This
RAID10 sits on a SAS2 bus with 4 lanes and is thus capable of 2.4 gigaBYTES per second of raw
throughput (4 lanes at 6 Gbit/s, roughly 600 MB/sec each after encoding overhead). Yes, I have
validated that the SAS2 bus, not the SSDs, is the throughput limit for my array. If I provided
that database instance a qcow2 volume that only managed 100 MB/sec, my database people would howl.

I have many virtual machines that run quite happily with thin qcow2 volumes on 12-disk RAID6
XFS datastores (spinning storage) with no problem, because they don't care about disk throughput:
they are there to process data, provide services like DNS or a Wiki knowledge base, or otherwise
do things that aren't particularly time-critical in our environment. So it's all about your
customer and their needs. For maximum throughput, qcow2 on an ext4 soft RAID capable of doing
100 MB/sec is very... 2010 spinning storage... and people who need more than that, like database
people, will be extremely dissatisfied.

Thus my suggestions of ways to improve performance via a custom disk offering, for those cases
where disk performance, and specifically write performance, is a problem. First, switch to
'sparse' rather than 'thin' as the provisioning mechanism. That greatly speeds writes, since
only the filesystem's block allocation machinery gets invoked rather than qcow2's, and qcow2 is
left with a single allocation zone, which greatly speeds its own lookups.
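
Roughly what that difference looks like at the qemu-img level on KVM, if I remember the
mapping correctly (the paths and sizes here are just examples):

    # 'thin': qcow2 does its own cluster allocation as the guest writes
    qemu-img create -f qcow2 -o preallocation=off /mnt/primary/thin.qcow2 100G

    # 'sparse': qcow2 metadata is preallocated up front, so writes mostly hit
    # the filesystem's allocator instead of qcow2's
    qemu-img create -f qcow2 -o preallocation=metadata /mnt/primary/sparse.qcow2 100G
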
Second, use a different underlying filesystem that has proven to have more consistent
performance: xfs isn't much faster than ext4 under most scenarios, but it doesn't have the
lengthy dropouts in performance that come with lots of writes on ext4. Third, possibly flip on
async caching in the disk offering if data integrity isn't a problem. For example, for an
Elasticsearch instance the data is all replicated across multiple nodes on multiple datastores
anyhow, so if I lose an Elasticsearch node's data, so what? I just destroy that instance and
create a new one to join to the cluster!
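
On KVM that write-cache setting in the disk offering ends up, if I remember right, as the
cache= attribute on the disk's libvirt driver element, something along these lines (the volume
path and target device are just examples):

    <disk type='file' device='disk'>
      <!-- cache='none' is the safe default; 'writeback' trades crash safety for speed -->
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/mnt/primary/myvol.qcow2'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
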
And of course there's always the option of avoiding qcow2 altogether and providing the data
via iSCSI or NFS directly to the instance, which may be what you need to do for something like
a database with very specific performance and throughput requirements.


