Subject: Re: Ec2 Stress Results
From: Adrian Cockcroft <adrian.cockcroft@gmail.com>
To: user@cassandra.apache.org
Date: Wed, 11 May 2011 18:00:39 -0700

Hi Alex,

This has been a useful thread; we've been comparing your numbers with our
own tests.

Why did you choose four big instances rather than more smaller ones? For
$8/hr you get four m2.4xl with a total of 8 disks. For $8.16/hr you could
have twelve m1.xl with a total of 48 disks, 3x the disk space, a bit less
total RAM, and much more CPU. When an instance fails, you lose 25% of your
capacity with 4 instances but only 8% with 12. I don't think it makes sense
(especially on EC2) to run fewer than 6 instances; we are mostly starting
at 12-15.
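That trade-off in rough numbers, as a small Python sketch (the prices and
disk counts are the figures quoted above, not an EC2 price-list lookup; the
capacity-loss figure assumes load is spread evenly across instances):

    # Rough comparison of the two cluster shapes discussed above.
    configs = {
        'm2.4xlarge': {'count': 4,  'hourly_total': 8.00, 'disks_each': 2},
        'm1.xlarge':  {'count': 12, 'hourly_total': 8.16, 'disks_each': 4},
    }

    for name, c in configs.items():
        disks = c['count'] * c['disks_each']
        loss = 100.0 / c['count']  # % of capacity lost when one instance dies
        print('%-10s x%2d  $%.2f/hr  %2d disks  %4.1f%% capacity loss per failure'
              % (name, c['count'], c['hourly_total'], disks, loss))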
We can also spread the instances over three EC2 availability zones, with
RF=3 and one copy of the data in each zone.

Cheers,
Adrian

On Wed, May 11, 2011 at 5:25 PM, Alex Araujo wrote:
> On 5/9/11 9:49 PM, Jonathan Ellis wrote:
>>
>> How many replicas are you writing?
>>
>> On Mon, May 9, 2011 at 5:58 PM, Alex Araujo wrote:
>>>
>>> Replication factor is 3.
>>
>> So you're actually spot on the predicted numbers: you're pushing
>> 20k*3=60k "raw" rows/s across your 4 machines.
>>
>> You might get another 10% or so from increasing memtable thresholds,
>> but the bottom line is you're right around what we'd expect to see.
>> Furthermore, CPU is the primary bottleneck, which is what you want to
>> see on a pure write workload.
>>
> That makes a lot more sense. I upgraded the cluster to 4 m2.4xlarge
> instances (68GB of RAM/8 CPU cores) in preparation for application stress
> tests, and the results were impressive @ 200 threads per client:
>
> +--------------+--------------+--------------+----------+---------+---------+------------+------------+--------------+
> | Server Nodes | Client Nodes | --keep-going | Columns  | Client  | Total   | Rep Factor | Test Rate  | Cluster Rate |
> |              |              |              |          | Threads | Threads |            | (writes/s) | (writes/s)   |
> +==============+==============+==============+==========+=========+=========+============+============+==============+
> |      4       |      3       |      N       | 10000000 |   200   |   600   |     3      |   44644    |    133931    |
> +--------------+--------------+--------------+----------+---------+---------+------------+------------+--------------+
>
> The issue I'm seeing with app stress tests is that the rate will be
> comparable/acceptable at first (~100k w/s) and will degrade considerably
> (~48k w/s) until a flush and restart. CPU usage will correspondingly be
> high at first (500-700%) and taper down to 50-200%. My data model is
> pretty standard (<...> is pseudo-type information):
>
> Users
> "UserId<32CharHash>" : {
>     "email": "a@b.com",
>     "first_name": "John",
>     "last_name": "Doe"
> }
>
> UserGroups
> "GroupId": {
>     "UserId<32CharHash>": {
>         "date_joined": "2011-05-10 13:14.789",
>         "date_left": "2011-05-11 13:14.789",
>         "active": "0|1"
>     }
> }
>
> UserGroupTimeline
> "GroupId": {
>     "date_joined": "UserId<32CharHash>"
> }
>
> UserGroupStatus
> "CompositeId('GroupId:UserId<32CharHash>')": {
>     "active": "0|1"
> }
>
> Every new User has a row in Users and a ColumnOrSuperColumn in the other 3
> CFs (total of 4 operations; see the sketch after this message section). One
> notable difference is that the RAID0 on this instance type (surprisingly)
> only contains two ephemeral volumes and appears a bit more saturated in
> iostat, although not enough to clearly stand out as the bottleneck.
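For concreteness, here is a hypothetical sketch of that four-operation
write path using pycassa, a common Python Thrift client at the time; the
thread doesn't name a client, so pycassa itself, the keyspace, host, and
helper names are all assumptions. CL.ONE matches the settings quoted below.

    import time
    import pycassa
    from pycassa.batch import Mutator
    from pycassa.cassandra.ttypes import ConsistencyLevel

    # Keyspace and host are illustrative, not from the thread.
    pool = pycassa.ConnectionPool('UserApp', ['10.0.0.1:9160'])

    def cf(name):
        return pycassa.ColumnFamily(pool, name,
                                    write_consistency_level=ConsistencyLevel.ONE)

    users, groups = cf('Users'), cf('UserGroups')
    timeline, status = cf('UserGroupTimeline'), cf('UserGroupStatus')

    def add_user(user_id, group_id, email, first, last):
        now = time.strftime('%Y-%m-%d %H:%M:%S')
        b = Mutator(pool)  # one way to batch the 4 operations per new user
        b.insert(users, user_id, {'email': email, 'first_name': first,
                                  'last_name': last})
        # UserGroups is a super CF: one subcolumn group per member
        b.insert(groups, group_id, {user_id: {'date_joined': now,
                                              'active': '1'}})
        b.insert(timeline, group_id, {now: user_id})
        b.insert(status, '%s:%s' % (group_id, user_id), {'active': '1'})
        b.send()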
> Is the bottleneck in this scenario likely memtable flush and/or commitlog
> rotation settings?
>
> RF = 2; ConsistencyLevel = One; -Xmx = 6GB; concurrent_writes: 64; all
> other settings are the defaults. Thanks, Alex.
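To put those rates in context, Jonathan's "raw" write arithmetic from
earlier in the thread can be applied to the figures above; a back-of-the-
envelope sketch (the per-node split assumes an even token distribution):

    # Each client write fans out to RF replica writes across the nodes.
    def raw_per_node(client_writes_per_s, rf, nodes):
        return client_writes_per_s * rf / float(nodes)

    # Stress-test run from the table: 133,931 w/s at RF=3 on 4 nodes.
    print(raw_per_node(133931, 3, 4))  # ~100k raw writes/s per node

    # App stress test: ~100k w/s degrading to ~48k w/s, at RF=2 on 4 nodes.
    print(raw_per_node(100000, 2, 4))  # ~50k raw writes/s per node at peak
    print(raw_per_node(48000, 2, 4))   # ~24k raw writes/s per node degraded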