From: Jonathan Ellis
Date: Tue, 14 Sep 2010 20:53:59 -0500
Subject: Re: Cassandra performance
To: user@cassandra.apache.org

The key is that while Cassandra may read fewer rows per second than
MySQL when you are I/O bound (as you are here) because of SSTable
merging (see http://wiki.apache.org/cassandra/MemtableSSTable), you
should be using your Cassandra rows as materialized views so that each
query is a single row lookup rather than many.

On Tue, Sep 14, 2010 at 5:40 PM, Kamil Gorlo wrote:
> Hey,
>
> we are considering using Cassandra for a quite large project and because
> of that I made some tests with Cassandra. I was testing performance
> and stability mainly.
>
> My main tool was stress.py for benchmarks (or an equivalent written in
> C++ to deal with Python 2.5's lack of multiprocessing). I will focus only
> on reads (random with normal distribution, which is the default in
> stress.py) because writes were /quite/ good.
>
> I have 8 machines (Xen guests with a dedicated pair of 2TB SATA disks
> combined in RAID-0 for every guest). Every machine has 4 individual
> cores of 2.4 GHz and 4GB RAM.
>
> Cassandra commitlog and data dirs were on the same disk, I gave 2.5GB
> of heap to Cassandra, and key and row caches were disabled (standard
> Keyspace1 schema, all tests use Standard1 CF). All other options were
> defaults.
> I've disabled the caches because I was testing random (or semi-random,
> normal distribution) reads, so they wouldn't help much (and
> also because 4GB of RAM is not a lot).
>
> For the first test I installed Cassandra on only one machine to test it
> and record results for further comparisons with a larger cluster and
> other DBs.
>
> 1) RF was set to 1. I've inserted ~20GB of data (this is the number
> reported in the load column from nodetool ring output) using stress.py
> (100 columns per row). Then I've tested reads and got 200 rows/second
> (reading 100 columns per row, CL=ONE, disks were the bottleneck, util was
> 100%). There was no other operation pending during reads (compaction,
> insertion, etc.).
>
> 2) So I moved to a bigger cluster, with 8 machines and RF set to 2. I've
> inserted about 20GB of data per node (so 20GB * 8 / 2 = 80GB of "real
> data"). Then I've tested reads, exactly the same way as before, and got
> about 450 rows/second (reading 100 columns (but reading only 1 in fact
> makes no difference), CL=ONE, disks on every machine were at 100% util
> because of random reads).
>
> 3) Then I changed RF from 2 to 3 on the cluster described in 2). So I
> ended up with every node loaded with about 30GB of data. Then as usual,
> I've tested reads, and got only 300 rows/second from the whole cluster
> (100% util on every disk).
>
> 4) The last test was with RF=3 as before, but I've inserted even more
> data, so every node in the 8-machine cluster had ~100GB of data (8 *
> 100GB / 3 = 266GB of real data). In this case I've got only 125
> rows/second.
>
> I was using multiple processes and machines to test reads.
>
>
> *So my question is why are these numbers so low? What is especially
> surprising for me is that changing RF from 2 to 3 drops performance
> from 450 to 300 reads per second. Is this because of read repair?*
>
>
> PS.
> To compare Cassandra performance with other DBs, I've also tested
> MySQL with almost the same data (one table with two columns, key (int PK)
> and value (VARCHAR(500)), simulating 100 columns in Cassandra for a
> single row). MySQL was installed on the same machine as Cassandra from
> test 1) (which is one of the 8 machines described before). I've
> inserted some data and then tested random reads (which was even worse
> for caching because I've used standard rand() from C++ to generate
> keys, not a normal distribution). Here are the results:
>
> size of data in db -> reads per second
> 21 GB  -> 340
> 400 GB -> 200
>
> So I've got more reads from a single MySQL with 400GB of data than from
> 8 machines storing about 266GB. This doesn't look good. What am I
> doing wrong? :)
>
> Cheers,
> Kamil
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
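
As a rough illustration of the advice at the top of this message (treat each Cassandra row as a materialized view so that a query is one row lookup rather than many), here is a minimal in-memory sketch in Python. Plain dicts stand in for column families; the names `users`, `users_by_city`, `add_user`, and the sample data are hypothetical and come neither from the thread nor from any Cassandra client API.

```python
from collections import defaultdict

# Normalized layout (hypothetical): one row per user, keyed by user id.
users = {}

# Materialized-view layout (hypothetical): one row per city,
# where each column of that row is a user living in the city.
users_by_city = defaultdict(dict)


def add_user(user_id, name, city):
    """Write the user row and, in the same step, the precomputed per-city row."""
    users[user_id] = {"name": name, "city": city}
    users_by_city[city][user_id] = name


def names_in_city_scan(city):
    """Answer the query without the view: every user row is touched."""
    return sorted(u["name"] for u in users.values() if u["city"] == city)


def names_in_city_view(city):
    """Answer the query with the view: a single row lookup by key."""
    return sorted(users_by_city.get(city, {}).values())


add_user(1, "alice", "Krakow")
add_user(2, "bob", "Austin")
add_user(3, "carol", "Krakow")
print(names_in_city_view("Krakow"))
```

The cost of this layout is an extra write per view on every insert, which fits the observation in the quoted message that writes were quite good while random reads were disk bound.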